
The recent convergence of high-performance computing (HPC), big data, cloud computing, and machine learning techniques has led to new processing, storage, communication, and data analytic methods for extracting information. Tremendous innovations have greatly impacted the scientific approaches that evolved with the data and its landscape. This survey-like book, written in a nontutorial and nonanalytical style, is mostly an encyclopedic monologue on big data literature.
The book covers a timely and interesting topic, that is, the “convergence trajectory” of the aforementioned technologies. Indeed, it is an appropriate time to work on this subject; many researchers could spend years exploring the literature and contributions in the field. Thus, the book had the potential to be a seminal work in big data and HPC.
The book consists of 12 (loosely related) chapters. The first three chapters provide an introductory overview of parallel processing and storage frameworks and systems, mostly based on Hadoop and MapReduce in different software suites, with brief explanations. Chapter 4 on HPC is a general overview of multicore platforms, graphics processing unit (GPU) architecture, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), different applications of nonvolatile memory express (NVMe), and remote direct memory access (RDMA). These chapters include many code samples for the mentioned environments.
Chapter 5 looks at challenges to big data computing, and chapter 6 lists benchmarking environments for big data system performance. The next three chapters cover acceleration with RDMA (without sufficient explanation of its mechanism), multicore platforms, and high-performance storage. A related literature review explains how these acceleration mechanisms can be used to offload processing and storage burdens and to promote efficiency and performance. Chapter 10, a brief discussion of deep learning and big data, fails to present the objectives, integration mechanisms, and core architecture of this merger, yet propounds the probable effects of the system components. Chapter 11, which is supposedly on the mixing of cloud and HPC, focuses on different aspects of virtual environments and virtual machine mechanisms. Finally, chapter 12 reviews some big data and HPC research challenges. The book makes no room for theoretical discussions or architectural descriptions of the “convergence trajectory.”
Despite the book's useful encyclopedic coverage of big data computing, its unrealized claims degrade its quality.