Data Mining

Data mining is the computing process of discovering patterns in large data sets, using methods at the intersection of artificial intelligence, machine learning, statistics, and database systems.

Data mining involves six common classes of tasks:

  1. Anomaly detection

  2. Association rule learning

  3. Clustering

  4. Classification

  5. Regression

  6. Summarization
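
To make one of these tasks concrete, here is a minimal sketch of anomaly detection using z-scores. It is only one simple technique among many, and the function name `find_anomalies`, the threshold of 2.0, and the sample readings are all illustrative assumptions rather than anything prescribed by the field.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=2.0):
    """Return the values whose absolute z-score exceeds the threshold.

    The threshold is an assumed cut-off; real projects tune it per data set.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical sensor readings with one obviously unusual value.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 25.0, 10.1]
anomalies = find_anomalies(readings)
print(anomalies)
```

With these sample readings, only the value 25.0 lies more than two standard deviations from the mean, so it is the only one flagged.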

Data mining is an interdisciplinary subfield of computer science. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use.
It also involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.
The term is a misnomer, as the goal is the extraction of patterns and knowledge from large amounts of data, not the extraction (mining) of data itself.
For example, the book Data Mining: Practical Machine Learning Tools and Techniques with Java, which covers machine learning material, was originally to be named just Practical Machine Learning. The term data mining was added only for marketing reasons.
The actual data mining task is the automatic or semi-automatic analysis of large quantities of data to extract previously unknown, interesting patterns such as groups of data records and dependencies. This usually involves using database techniques such as spatial indices. These patterns can then be seen as a kind of summary of the input data, and may be used in further analysis.
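
As a rough illustration of extracting dependencies, the sketch below computes the support and confidence of a single association rule over a handful of shopping baskets. The helper names `support` and `confidence` and the basket data are hypothetical, and a real system would search over many candidate rules instead of checking one by hand.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transactions."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante else 0.0


# Hypothetical market-basket data, invented for this example.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
]

# Rule under test: customers who buy bread also buy butter.
rule_support = support(baskets, {"bread", "butter"})
rule_confidence = confidence(baskets, {"bread"}, {"butter"})
print(rule_support, rule_confidence)
```

Here bread and butter appear together in two of the four baskets, and bread appears in three, so the rule holds in two out of three bread purchases. The two numbers act as the kind of compact summary of the input data described above.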
