Data mining



Data mining (DM), also called Knowledge-Discovery in Databases (KDD) or Knowledge-Discovery and Data Mining, is the process of automatically searching large volumes of data for patterns such as association rules. It is a fairly recent topic in computer science but applies many older computational techniques from statistics, information retrieval, machine learning and pattern recognition.


Example

A simple example of data mining is its use in retail sales, where the analysis of what customers buy is often called market basket analysis. If a clothing store records its customers' purchases, a data mining system could identify those customers who favour silk shirts over cotton ones.
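
As a rough illustration of the clothing store case, the sketch below counts purchases per customer and keeps those who bought more silk than cotton. The purchase log, the customer names and the function name silk_fans are invented for the example, and "favours" is assumed here to mean simply "bought more of".

```python
from collections import Counter, defaultdict

# Hypothetical purchase log: one (customer, fabric) pair per shirt sold.
purchases = [
    ("ann", "silk"), ("ann", "silk"), ("ann", "cotton"),
    ("bob", "cotton"), ("bob", "cotton"),
    ("cat", "silk"), ("cat", "cotton"), ("cat", "silk"),
]

def silk_fans(purchases):
    """Return, in sorted order, customers who bought more silk than cotton."""
    counts = defaultdict(Counter)
    for customer, fabric in purchases:
        counts[customer][fabric] += 1
    # A Counter returns 0 for fabrics a customer never bought.
    return sorted(c for c, n in counts.items() if n["silk"] > n["cotton"])

fans = silk_fans(purchases)
print(fans)
```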


 


Another example is that of a supermarket chain that, through analysis of transactions over a long period of time, found that beer and diapers were often bought together. Although explaining this relationship may be difficult, taking advantage of it is easier, for example by placing the high-profit diapers close to the high-profit beers in the store. (This example is questioned at Beer and Nappies -- A Data Mining Urban Legend.)
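
A pattern like "diapers and beer are bought together" is usually expressed as an association rule and scored with support, confidence and lift. The sketch below computes these three standard measures for a single rule. The baskets are made-up data and rule_stats is a hypothetical helper name, so this is only a minimal illustration and not the method used by any particular retailer.

```python
def rule_stats(transactions, antecedent, consequent):
    """Score the rule "antecedent -> consequent" over a list of item sets.

    support    = share of baskets containing both items
    confidence = share of antecedent baskets that also hold the consequent
    lift       = confidence divided by the overall share of the consequent
    """
    n = len(transactions)
    has_a = sum(1 for t in transactions if antecedent in t)
    has_c = sum(1 for t in transactions if consequent in t)
    has_both = sum(1 for t in transactions
                   if antecedent in t and consequent in t)
    support = has_both / n
    confidence = has_both / has_a if has_a else 0.0
    lift = confidence / (has_c / n) if has_c else 0.0
    return support, confidence, lift

# Hypothetical transactions, one set of items per basket.
baskets = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"milk", "bread"},
    {"diapers", "milk"},
    {"beer", "chips"},
]

support, confidence, lift = rule_stats(baskets, "diapers", "beer")
print(support, confidence, lift)
```

A lift above 1 means the two items appear together more often than they would if they were bought independently, which is the kind of signal a shelf-placement decision could build on.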


Use of the term

Data mining has been defined as "the nontrivial extraction of implicit, previously unknown, and potentially useful information from data" and "the science of extracting useful information from large data sets or databases".


 


It involves sorting through large amounts of data and picking out relevant information.


 


It is usually used by businesses and other organizations, but is increasingly used in the sciences to extract information from the enormous data sets generated by modern experimentation.


 


Metadata, or data about a given set of data, are often expressed in a condensed, data-mineable format, that is, one that facilitates the practice of data mining. Common examples include executive summaries and scientific abstracts.


 


Although data mining is a relatively new term, the technology is not. Companies have long used powerful computers to sift through volumes of data, such as supermarket scanner data, and to produce market research reports. Continuous innovations in computer processing power, disk storage, and statistical software are dramatically increasing the accuracy and usefulness of analysis.


 


Data mining identifies trends within data that go beyond simple analysis. Through the use of sophisticated algorithms, users have the ability to identify key attributes of business processes and target opportunities.


Related terms

Although the term "data mining" is usually used in relation to the analysis of data, it is, like "artificial intelligence", an umbrella term with varied meanings in a wide range of contexts. Unlike data analysis, data mining is not based or focused on an existing model that is to be tested or whose parameters are to be optimized.


 


In statistical analyses where there is no underlying theoretical model, data mining is often approximated via stepwise regression methods, in which the space of 2^k possible relationships between a single outcome variable and k potential explanatory variables is searched heuristically. With the advent of parallel computing, it became possible (when k is less than approximately 40) to examine all 2^k models. This procedure is called all-subsets or exhaustive regression. Some of the first applications of exhaustive regression involved the study of plant data.
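
To make the 2^k count concrete, the sketch below runs an exhaustive regression on a tiny made-up data set with k = 3 explanatory variables, fitting an ordinary least-squares model (with intercept) for each of the 2^3 = 8 subsets. The data, the function names and the choice of the residual sum of squares (RSS) as the score are assumptions made for illustration. The normal equations are solved in plain Python to keep the example self-contained, which is fine for a toy case but not a numerically robust choice for real data.

```python
from itertools import combinations

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_rss(rows, y, subset):
    """Fit y on an intercept plus the chosen columns; return the RSS."""
    design = [[1.0] + [row[j] for j in subset] for row in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, d))) ** 2
               for d, yi in zip(design, y))

def all_subsets(rows, y):
    """Map every subset of column indices (2^k in total) to its RSS."""
    k = len(rows[0])
    return {subset: fit_rss(rows, y, subset)
            for size in range(k + 1)
            for subset in combinations(range(k), size)}

# Made-up data: the outcome depends only on the first variable.
rows = [[1.0, 2.0, 5.0], [2.0, 1.0, 3.0], [3.0, 4.0, 8.0],
        [4.0, 3.0, 1.0], [5.0, 6.0, 4.0], [6.0, 5.0, 7.0]]
y = [2.0 + 3.0 * r[0] for r in rows]

results = all_subsets(rows, y)
for subset, rss in sorted(results.items(), key=lambda kv: kv[1]):
    print(subset, round(rss, 6))
```

Because adding variables never increases the RSS, a practical comparison between subsets normally relies on a score that penalizes model size. The doubling of the model count with each extra variable is also why the exhaustive approach becomes impractical as k grows.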

Copyright 2008 - France BtoB from Wikipédia