What is Data Mining? Applications

Data mining is the process of filtering unwanted information and noise out of a set of data points so that what remains is useful for further processing. With the advent of the Internet of Things and the cloud, smart devices generate an enormous amount of data, but not all of it is useful. Data mining discards the chaotic data and retains only the relevant information, which speeds up informed decision-making.

It is also a process in which patterns or correlations are found within a large dataset and used to predict outcomes. It is a field closely tied to machine learning, and its techniques are used to build the machine learning models that power modern artificial intelligence applications.

Steps in Data Mining

• Business Understanding – Once the business requirements and their specifications are clearly understood, the next step is to develop a plan that includes timelines and role assignments.

• Data Understanding – Large volumes of data are collected from various sources, databases and smart devices. Data visualisation tools can then be used to present the data in a format that is both appealing and useful for the business goals.

• Data Preparation – The data is filtered and cleaned, and missing data is filled in. This can be time-consuming, depending on the number of sources and the amount of data collected, so distributed systems or databases are used to ease the process. In such systems, data can be stored in a structured manner and later retrieved by querying it. They can also provide more security than dumping all the data into a single data warehouse.

• Data Modelling – Many statistical and mathematical tools are used to find patterns and anomalies in the data set.

• Evaluation – The findings produced by these tools are compared with the business objectives to determine whether they form an effective solution that can be deployed across the organisation.

• Deployment – Finally, the findings are deployed across the business or organisation so that it can benefit from the data.
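
The data preparation step above can be illustrated with a minimal sketch. It assumes a small in-memory list of records rather than a distributed database, and the field names, the sample readings and the `prepare` helper are all hypothetical. It drops duplicate and out-of-range records, then fills missing values with the mean of the observed ones, which is only one of several possible ways to handle missing data.

```python
from statistics import mean

# Hypothetical raw readings; None marks a missing value.
raw = [
    {"id": 1, "temp": 21.0},
    {"id": 2, "temp": None},
    {"id": 2, "temp": None},   # duplicate record
    {"id": 3, "temp": 23.0},
    {"id": 4, "temp": 400.0},  # noise: outside the plausible range
]

def prepare(records, field, low, high):
    """Drop duplicates and out-of-range rows, then fill missing values with the mean."""
    seen = set()
    cleaned = []
    for rec in records:
        if rec["id"] in seen:
            continue  # skip a record whose id was already kept
        value = rec[field]
        if value is not None and not (low <= value <= high):
            continue  # skip values outside the plausible range
        seen.add(rec["id"])
        cleaned.append(dict(rec))  # copy so the raw data stays untouched
    observed = [r[field] for r in cleaned if r[field] is not None]
    fill = mean(observed)  # assumes at least one observed value remains
    for r in cleaned:
        if r[field] is None:
            r[field] = fill
    return cleaned

prepared = prepare(raw, "temp", low=-50.0, high=60.0)
for row in prepared:
    print(row)
```

In a real project the plausible range and the fill strategy would come from the business understanding and data understanding steps rather than being hard-coded.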

Benefits of Data Mining

• Automated Decision Making – Organised and processed data helps organisations and businesses analyse their data continually and automate their custom and non-custom decisions without human intervention or delay.

• Accuracy in prediction and forecasting – Data mining processes raw data so that it can yield more reliable forecasts of business trends, which helps businesses plan their strategy and future goals.

• Cost reduction – Early prediction lets businesses anticipate risks and overheads and take preventive measures well in advance. This reduces operating costs and, in turn, improves efficiency and productivity.

• Customer Insights – Businesses and organisations can use data mining models to find correlations and key differences between potential customers. They can then improve each touchpoint and so increase overall customer satisfaction.

Types of Data Mining

Data mining techniques can be classified as:

  1. Supervised learning
  2. Unsupervised learning

Supervised Learning – It is based on prediction or classification: the process aims to predict a single output variable. For example, filtering spam from normal messages is based on supervised learning. Models used for supervised data mining are:

• Linear regression – The value of the output variable is predicted from one or more independent variables.

• Logistic regression – This type of regression analysis predicts the probability of a categorical outcome based on one or more independent variables.

• Time-series – These are forecasting tools which use time as the independent variable.

• Classification or Regression trees – These tree-based models can predict both categorical and continuous target variables.

• Neural Networks – These are inspired by the human brain and its neuron architecture. Each node processes its inputs and, depending on their magnitude, may or may not “fire” based on its threshold value.

• K-nearest neighbour – It classifies a new observation based on the old observations most similar to it. It is data-driven rather than model-driven.
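
Two of the supervised models listed above, linear regression and k-nearest neighbour, are simple enough to sketch with the Python standard library. The example assumes a single independent variable for the regression and a toy spam data set whose two features (number of links, number of capitalised words) are invented for illustration. The function names `fit_line` and `knn_predict` are hypothetical, not part of any library.

```python
from collections import Counter
from math import dist

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one independent variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def knn_predict(train, query, k=3):
    """Label a query point by majority vote among its k nearest labelled points."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical data that lies exactly on the line y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)

# Hypothetical messages described by (links, capitalised words).
train = [
    ((0, 1), "normal"), ((1, 0), "normal"), ((1, 2), "normal"),
    ((6, 9), "spam"), ((7, 8), "spam"), ((8, 10), "spam"),
]
label = knn_predict(train, (7, 9))
print(label)
```

The contrast between the two functions mirrors the text: `fit_line` learns two parameters and discards the data, whereas `knn_predict` keeps every old observation and consults them at prediction time.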

Unsupervised Learning – It is based on understanding and revealing the underlying patterns in the given data. Models used for unsupervised data mining are:

• Clustering – It groups similar data points into clusters. It can be used to process complex data sets, with each cluster treated as a single data entity.

• Association analysis – Also known as market basket analysis, it identifies items that often occur together.

• Principal Component analysis – It reveals hidden correlations between input variables and creates new variables, called principal components, from them. These principal components capture most of the information in the original data with fewer variables.
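
As a small illustration of association analysis from the list above, the sketch below counts how often each pair of items appears in the same basket and keeps the pairs whose support (share of baskets containing both items) reaches a threshold. The baskets, the threshold and the `frequent_pairs` helper are hypothetical. Practical tools use more scalable algorithms such as Apriori, which this brute-force count does not implement.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(baskets, min_support):
    """Return item pairs whose share of baskets is at least min_support."""
    counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so each pair is counted once per basket.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: n / total for pair, n in counts.items() if n / total >= min_support}

# Hypothetical shopping baskets.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["butter", "jam"],
]
pairs = frequent_pairs(baskets, min_support=0.5)
print(pairs)
```

Here bread appears with butter and with milk in half of the baskets, so those two pairs pass the 0.5 threshold while the rarer pairs are dropped.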

Trends in Data mining

• Language standardisation – There is a trend towards a common standardised language for data mining, just as SQL is the standard for databases.

• Scientific mining – The success of data mining has attracted researchers to use it in their scientific and academic work. For example, they use association analysis to identify broader patterns in human behaviour.

• Complex data objects – New data mining methods are continuously being developed to analyse complex data types. For example, in Google search a user can input an image instead of text.

• Increased computing speed – As the size and complexity of data increase, data mining requires faster and more powerful computing machines to analyse the data.

• Web mining – Web mining applies tools and techniques similar to those of data mining to the internet. The three types of web mining are content mining, structure mining and usage mining.

Applications of Data mining

• Communication – Telecommunication and media businesses can use data mining analytical models to study their huge volumes of data and devise strategies to increase customer satisfaction.

• Education – Educational institutes can use data-driven analytical and prediction models to predict a student’s overall performance and to track each student’s record.

• Banking – Data mining technologies have made it much easier and faster to understand the customer base and the billions of transactions taking place. Predictive analysis tools also give a better and faster view of market trends, risks and fraud.

• Insurance – With extensive use of data mining models and tools, insurance companies can solve complex problems related to fraud and risk management more easily and effectively. They can also use data mining techniques to price their insurance packages accurately and convince their customers.

• Manufacturing – Supply chain management systems make extensive use of data mining predictive and analytical tools to bridge the gap between supply and demand. Quality assurance and risk management can also be supported by data mining tools.

• Retail – Data mining models and predictions can help retailers improve sales, increase customer retention and satisfaction, apply marketing strategies and forecast sales.

• Healthcare – Researchers and medical practitioners use data mining approaches such as soft computing, machine learning, statistics and data visualisation to improve healthcare systems. Complex diseases can be diagnosed more easily, and proper treatment can be provided accordingly.

• Lie detection – Law enforcement bodies can use data mining approaches to help establish the truth in criminal cases. They can also be used to investigate crimes, monitor suspected terrorist activities and support other security-related operations.


Anupama kumari

M.Tech (VLSI Design and Embedded system)

BS Abdur Rahman University
