QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL
Qual. Reliab. Engng. Int. 2005; 21:427–428
Published online in Wiley InterScience (www.interscience.wiley.com).
BOOK REVIEWS
Systems Reliability and Failure Prevention, Herbert Hecht, Artech House, 2004, 230 pages, £55.00. (Originally reviewed for The Aeronautical Journal published by the Royal Aeronautical Society. Published here with permission.)
The author is vice-chairman of a company that is involved in consulting work on ‘high dependability systems’, and he has worked with government and academic bodies in the U.S.A. on the reliability and safety of such systems. This book should, therefore, be expected to present the best and most modern ideas and methods.
Unfortunately, it disappoints. It provides competent and practical descriptions of some reliability improvement methods, such as failure modes and effects analysis and sneak circuit analysis, redundancy design techniques, software reliability issues, and reliability programme management, but there are too many omissions and even misleading advice for me to recommend it.
The most important omissions are an absence of any mention or discussion of accelerated test techniques, or of the role of manufacturing quality in ensuring product reliability. (In Chapter 8, the product ‘life cycle’ is stated to consist of ‘concept, development, and operation and maintenance’. Forgetting about the essential phase of manufacture and its contributions is a common failing of writers on engineering reliability and management.) The author makes reference to only one other book on reliability engineering, and that was published in 1977.
The topic with the greatest potential to mislead is the treatment of the economics of reliability, to which a whole chapter is devoted. Much of this flies in the face of Deming’s fundamentally correct teaching that improvements in quality always result in lower total costs, and of the reality that forecasts of future reliability values almost always entail levels of uncertainty that undermine the validity of the kinds of analyses presented.
The book includes interesting stories of some well-known system failures, and some examples of applications of the methods described. However, it falls well short of being a definitive source for such an important topic.
PATRICK O’CONNOR
(DOI: 10.1002/qre.649)
Data Mining: Concepts, Models, Methods and Algorithms, Mehmed Kantarzic, Paperback, IEEE Press/Wiley, 2001, xii + 345 pages, $74.95.
Machine learning, neural networks, genetic algorithms and fuzzy logic are terms which only a few years ago evoked awe and respect for the person uttering them. More recently these data mining techniques (and others), which were largely developed within the artificial intelligence community, have entered the mainstream as techniques for analyzing data that offer advantages over classical statistical methods. The advantages are particularly relevant to the analysis of large complex databases. Data mining, as distinguished from traditional analytical methods, uses automated, computationally intensive and usually non-parametric procedures to find patterns in data.
When I first became interested in data mining about 10 years ago, it was hard to find comprehensive literature which could serve as an introduction to the topic for the novice. In fact, much of the material was in journals where the presentation was very technical (i.e. a challenge to understand). In more recent years this situation has been remedied with the release of a number of books aimed at introducing people involved in data analysis to the topic. Data Mining: Concepts, Models, Methods and Algorithms offers a very readable and up-to-date introduction to data mining.
A point in its favor is that the book includes a chapter devoted to data preparation. One cannot overstate the importance of this step to the data mining process. The chapter discusses missing data, outlier detection and the use of transformations such as normalization and smoothing. A rule of thumb is that the data management, cleaning and transformation processes will consume 90% or more of the data mining effort. Another unique feature of the book is a chapter devoted to data reduction. Given the massive databases analyzed in data mining projects, methods are often needed to reduce the number of records (through sampling) and the number of variables (through variable selection or combining variables to produce a smaller number of variables in total).
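For readers unfamiliar with the preparation and reduction steps mentioned above, the following is a minimal sketch of three of them: filling in missing values, normalization, and reducing the number of records by sampling. It is not taken from the book under review. The function names, the choice of mean imputation and min-max scaling, and the fixed random seed are all assumptions made purely for illustration.

```python
import random
import statistics


def impute_missing(values):
    """Replace missing entries (None) with the mean of the observed values.

    Mean imputation is only one of several possible strategies; it is
    assumed here for simplicity.
    """
    observed = [v for v in values if v is not None]
    mean = statistics.fmean(observed)
    return [mean if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly so that they fall in the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map every entry to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def sample_records(records, k, seed=0):
    """Reduce the number of records by simple random sampling without replacement."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    return rng.sample(records, k)


if __name__ == "__main__":
    column = [2.0, None, 6.0, 4.0]
    cleaned = impute_missing(column)
    print(min_max_normalize(cleaned))
    print(len(sample_records(list(range(1000)), 50)))
```

In practice each choice here would depend on the data: outliers can distort min-max scaling, and simple random sampling may under-represent rare cases.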
The book reviews the major data mining methods. The survey includes a brief overview of classical approaches such as regression and discriminant analysis, though the potential reader should note that knowledge of conventional statistical methods is a prerequisite for understanding much of the material in the book. Data mining methods introduced include clustering, association rules, neural networks, decision trees, genetic algorithms and fuzzy logic. This is a pretty comprehensive and up-to-date collection of methods. A final chapter on visualization presents some useful and, in some cases, relatively recently developed methods used to analyze and present data graphically.
In general, the level of discussion in the book is introductory and can be followed by those new to the discipline. Relatively simple examples are used to motivate an understanding of the methods. However, there were a few places where I needed to read a passage a few times in order to understand the material, and I am already well acquainted with the subject. A set of review problems and exercises contained at the end of each chapter can be useful to the instructor using the book as a text for an introductory course on data mining. An additional helpful feature is the inclusion of appendices with extensive lists of Web sites containing data, software and information useful to data miners.
At the end of each chapter is an annotated list of references that provides sources for further study for those who want to pursue a topic in more detail. A few of my favorite texts did not appear, but I approach data mining with a statistician’s perspective and view every data mining method as an augmentation of a statistical procedure I am already familiar with, while the author of the book is from the computer science discipline.
In addition to serving as a comprehensive introduction to students and practitioners unfamiliar with data mining, the book has something to offer to those already applying data mining methods because it is thorough and covers some methods that other data mining survey books do not.
Kantarzic’s Data Mining will not supplant Hastie, Tibshirani and Friedman’s The Elements of Statistical Learning as the reference on data mining (at least for statisticians), but it provides a less-challenging introduction to the topic for those who do not already have extensive exposure to statistical methods for analyzing data.
LOUISE FRANCIS
(DOI: 10.1002/qre.704)
Copyright © 2005 John Wiley & Sons, Ltd.