Computing Reviews

Statistical and machine-learning data mining: techniques for better predictive modeling and analysis of big data (3rd ed.)
Ratner B., CRC Press, Inc., Boca Raton, FL, 2017. 696 pp. Type: Book
Date Reviewed: 04/12/19

There are numerous books about statistics, from short recipe collections to larger theoretical works. There are texts devoted to data science and data mining, the newest expressions of data analysis, which focus on computational techniques. Ratner addresses the recent growth of data science and data mining in his new chapter 2, and ends up merging those terms back into the parent discipline of statistics. Having dealt with terminology, he goes on to present and analyze the statistical practice and problems of today, while managing to be both personable and entertaining.

The text weighs in at 44 chapters and roughly 650 pages. I’m not going to list and describe every chapter; you can find the table of contents for both the second edition and this edition using Amazon’s “Look Inside” program.

While the first edition was a bestseller, reviewers of the second edition commented that it seemed disjointed and suffered from a lack of algorithms. Ratner added 13 new chapters to produce this third edition, and states his goals in the preface: to extend the core material, to improve the writing and continuity, and to provide his statistical subroutines, now available for download. This edition should therefore be considered both a corrected and an extended version of the second edition.

In addition to chapter 2’s examination of terminology, other added chapters delve into specialized applications of data mining, like market segmentation and dealing with missing data. Also included are chapters on “Art, Science, Numbers, and Poetry” and “Opening the Dataset: A Twelve-Step Program for Dataholics” and a primer on text mining. The author even includes a chapter of his favorite statistical subroutines (written for SAS, but understandable enough to translate to other formats), as well as links to more code and examples.

The text is obviously a labor of love from a dedicated statistician. It may not be organized properly for a class textbook (the author assumes statistical knowledge, prefers to investigate difficult applications without accepted solutions, and has no interest in delivering a systematic understanding of statistics), but teachers would definitely find excerpts to cite. Given the casual writing style, the nonacademic tone, and the author’s desire to explore unsolved problems of his own choosing, readers will either love it or hate it. All practitioners should take a look; it would be a shame to miss a potential favorite.

Reviewer:  Bayard Kohlhepp Review #: CR146529 (1907-0264)

Reproduction in whole or in part without permission is prohibited.   Copyright 2024 ComputingReviews.com™