Computing Reviews

Complete guide to open source big data stack
Frampton M., Apress, New York, NY, 2018. 365 pp. Type: Book
Date Reviewed: 09/06/18

Working with big data requires building up a network of services that allows for its effective use. In the scope of open-source systems, the big data stack is composed of open-source applications for understanding big data. The book’s title reveals its main purpose: namely, to be a guide for professionals who want to build an Apache Mesos-based big data stack. The text consists of ten chapters, each of which is dedicated to a separate element of the big data stack: frameworks, queuing, processing, storage, resource management, and visualization.

The first chapter introduces the concept of a big data stack as an entity that provides all big data requirements and functionality, plus some examples, documentation, and a support network. The author briefly mentions some development stacks such as MARQS and SMACK, but interested readers should look to other sources for detailed coverage of these. In further chapters, the author describes big data stack components: Apache CloudStack, Apache Brooklyn, Apache Mesos, storage, and processing. The chapters are presented as information technology (IT) projects that resemble the typical workflow of any IT manager. Each chapter also includes some experimentation to determine the limitations of the presented approaches. For example, in chapter 2, the author concludes that the created functionality can span several data centers.

The author works mostly in the realm of Apache-based systems (for example, Mesos and Spark), but the logic of the demonstrated approaches could be extended to other frameworks. As such, the detailed examples for cloud storage, release management, resource management, queuing, and so on can be used as templates for other, non-Apache-based environments.

Because the book was written by a professional in the field, the text is very practical; however, it is not suitable for readers who want to understand the reasons behind a particular architecture or solution. Instead, the text is designed as a step-by-step manual covering the usually obscure steps one must know in order to carry a project through in practice. The author investigates the development of a Mesos-based cluster, which could be of great help to “anyone who is interested in big data stacks based on Apache Mesos and Spark.” Readers are guided through the installation of a private cloud using Apache CloudStack, and then through the complicated configuration steps. The text also focuses on Apache Brooklyn, which is investigated as an installation tool for the Mule enterprise service bus (ESB) and Cassandra. The book also discusses the use of Apache Spark for big data processing.

Overall, this good practical text may benefit IT personnel in the big data industry. The book deals with the integration of open-source components and shares detailed examples of cloud management, processing, resource management, queuing, and data visualization. As such, it has a learning-by-example or note-sharing style.


Reviewer:  Stefka Tzanova Review #: CR146234 (1812-0622)

Reproduction in whole or in part without permission is prohibited.   Copyright 2024 ComputingReviews.com™