Ian H. Witten, Eibe Frank, Mark A. Hall - Data Mining

The convergence of computing and communication has produced a society that feeds on information. Yet most of the information is in its raw form: data. If data is characterized as recorded facts, then information is the set of patterns, or expectations, that underlie the data. There is a huge amount of information locked up in databases, information that is potentially important but has not yet been discovered or articulated. Our mission is to bring it forth. Data mining is the extraction of implicit, previously unknown, and potentially useful information from data. The idea is to build computer programs that sift through databases automatically, seeking regularities or patterns. Strong patterns, if found, will likely generalize to make accurate predictions on future data. Of course, there will be problems. Many patterns will be banal and uninteresting. Others will be spurious, contingent on accidental coincidences in the particular dataset used. And real data is imperfect: Some parts will be garbled, some missing. Anything that is discovered will be inexact: There will be exceptions to every rule and cases not covered by any rule. Algorithms need to be robust enough to cope with imperfect data and to extract regularities that are inexact but useful.
