Computing Reviews

Scalable big data analytics for protein bioinformatics: efficient computational solutions for protein structures
Mrozek D., Springer International Publishing, New York, NY, 2018. 315 pp. Type: Book
Date Reviewed: 06/06/19

High-performance computing (HPC) refers to the use of large computational resources for solving computationally hard and data-intensive problems. Big data refers to “the exponential growth ... of data, both structured and unstructured.” The challenges include data curation, data manipulation, storage, sharing, analysis, and visualization. For protein bioinformatics, big data is common and requires efficient and complex solutions for resolving protein structures.

Scalable big data analytics for protein bioinformatics addresses problems in protein similarity searching for 3D protein structure prediction. The book has four parts. Part 1 introduces protein structure along with the techniques used later in the book. This part (chapters 1 and 2) is vital for computational scientists who do not have a background in biology and biophysics.

The second part discusses how to build the cloud services used to develop cloud applications for 3D protein structure prediction. It focuses on Microsoft Azure, but the described approaches may be conceptually useful for other public or private clouds.

Part 3 deals with big data, especially big data frameworks like Hadoop and Spark. The most interesting chapter (10) provides valuable insight into massively parallel protein structure searching using graphics processing units (GPUs) in cloud environments.

Overall, this excellent and practically oriented text can benefit researchers seeking to establish a cloud-based bioinformatics HPC facility. Note that most of the solutions are implemented as embarrassingly parallel processes and not as distributed parallel processes. The book will be of interest to researchers and scientific software developers of bioinformatics and biomedical databases.

Reviewer:  Alexander Tzanov Review #: CR146592 (1909-0331)
