
February 21, 2023

What is unsupervised learning? Definition and examples


Artur Haponik

CEO & Co-Founder

Reading time: 9 minutes

Current advancements in Artificial Intelligence (AI) and Machine Learning (ML) could potentially increase global GDP by 14% by 2030. [1] However, to achieve this, organizations must train and deploy robust ML models capable of achieving set objectives.

One of the biggest challenges facing IT experts in training machine learning models is labeling training data. It is a costly, time-consuming process that doesn’t always enable machine learning models to reach their full potential, even with vast amounts of training data. To reach their full potential, machine learning models also need to discover the hidden patterns within their training data and exploit them. That’s where unsupervised machine learning comes in.

Unsupervised machine learning provides numerous benefits over supervised learning, including limited to no data labeling requirements. This article will explore unsupervised machine learning in its entirety, from what it is to its examples and use cases across various industries.

What is unsupervised learning?

Unsupervised learning utilizes AI-driven algorithms to analyze and cluster unlabeled data sets. This gives unsupervised machine learning models the ability to discover hidden patterns within the data without the need for human intervention.
These unique qualities make unsupervised learning especially suitable for exploratory data analysis, customer segmentation, image recognition, and cross-selling strategy applications.

supervised vs. unsupervised learning

Examples of unsupervised learning

There are several approaches to unsupervised learning, each geared towards achieving specific objectives. Here are some of the most common examples of unsupervised learning.


Clustering

Clustering is an unsupervised learning approach that groups data points into clusters based on their similarities and differences. This way, unsupervised machine learning models are better able to identify structures in the data, enabling them to derive insights that would otherwise go unnoticed in the analysis of individual data points. [2]

clustering in machine learning

The most popular algorithms employed in clustering include:

  • K-means algorithm: This algorithm partitions a data set into K clusters by assigning each point to its nearest cluster centroid, choosing centroids that minimize the variance within each cluster.
  • Hierarchical clustering: As the name suggests, hierarchical clustering algorithms group similar data points into a tree-like structure that represents the relationships between clusters.
  • Density-based clustering: These algorithms work by identifying data clusters as dense regions in a given data set separated by sparse regions. This gives the clusters an arbitrary shape.
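To make the clustering idea concrete, here is a minimal k-means sketch in plain Python. The two-blob data set, the value of k, and the iteration count are illustrative choices, not part of any particular library:

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its cluster, and repeat."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k data points as starting centroids
    for _ in range(iters):
        # Assignment step: index of the nearest centroid for every point
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: recompute each centroid as its cluster's mean
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(d) / len(members)
                                     for d in zip(*members))
    return labels, centroids

# Two well-separated blobs: k-means should split them cleanly
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
        (9.0, 9.1), (9.2, 9.0), (9.1, 9.2)]
labels, centroids = kmeans(data, k=2)
```

Production code would normally reach for a library implementation such as scikit-learn's KMeans, which adds smarter initialization and convergence checks.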

This unique approach to unsupervised learning makes clustering especially suitable for machine learning applications like customer segmentation, image and text analysis, market research, and anomaly detection.

Anomaly detection

As the name suggests, anomaly detection involves identifying anomalies and outliers in a given data set. These anomalies can represent rare events, errors, or fraudulent activity. This makes anomaly detection especially helpful when training machine learning models used for fault detection in manufacturing processes, fraud detection in financial institutions, and identifying security threats in computer networks.

anomaly detection

Source: Dominik Polzer

Anomaly detection typically works by training algorithms on datasets without any labeled anomalies. The algorithm then uses machine learning capabilities and statistical methods to identify data points that deviate from the norm.

The most common anomalies in training data include the following:

  • Global Outliers: A global anomaly occurs when a data point deviates significantly from the average value in a data set. For instance, if a business serves a fairly stable number of customers per day and that number suddenly doubles or triples, the spike would register as a global anomaly in the business’s CRM system.
  • Contextual Outliers: A contextual outlier is a significant deviation of a data point from the expected value of similar data points in the same context. For instance, gift shops get more customers during the holidays. If a gift store gets a surge in the number of customers outside the holiday season, the sudden surge can be considered a contextual outlier.
  • Collective Outliers: Collective outliers occur when a subset of data points deviates from the norm as a group. For instance, most financial institutions record incremental growth, with a few recording losses. If multiple financial institutions were to record significant losses at around the same time, the anomaly could be described as a collective outlier.
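The global-outlier case above can be sketched with a simple z-score check using only the Python standard library. The daily customer counts and the 2.5 threshold are made-up illustrations:

```python
from statistics import mean, stdev

def global_outliers(values, threshold=2.5):
    """Flag values whose z-score (distance from the mean, measured in
    standard deviations) exceeds the threshold. A single extreme value
    inflates the standard deviation, so small samples often need a
    lower threshold or robust statistics (median/MAD) instead."""
    mu, sigma = mean(values), stdev(values)
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Daily customer counts: a steady range with one sudden spike
daily_customers = [102, 98, 105, 99, 101, 97, 103, 100, 310]
spikes = global_outliers(daily_customers)  # only the spike stands out
```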

Dimensionality reduction

Dimensionality reduction is an unsupervised learning technique that reduces the number of features or dimensions in a dataset while minimizing the loss of information. This is vital for unsupervised learning as it helps mitigate the issues that arise from high dimensionality. High dimensionality occurs when a dataset has too many features, which inflates computational cost and degrades the performance of machine learning models.

Dimensionality reduction falls into two categories: feature selection and feature extraction.

  • Feature Selection: Feature selection typically involves using algorithms to select a specific subset of features within a dataset that offers the most effective solution to a problem. [3] Essentially, this technique aims to identify the most relevant features with the highest predictive power for a given problem while eliminating redundant features that may impact the model’s performance.
  • Feature extraction: Feature extraction involves transforming the original features of a dataset into a new set of features with lower dimensionality while retaining all relevant information. Feature extraction algorithms achieve this through mathematical transformations that effectively project the data into a lower-dimensional space.
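As a toy illustration of the feature selection side, here is a variance-threshold filter in plain Python: near-constant features carry little predictive signal and can be dropped. The dataset and threshold are invented; scikit-learn offers a production-grade VarianceThreshold for this purpose:

```python
from statistics import pvariance

def select_by_variance(rows, threshold=0.01):
    """Return the indices of feature columns whose variance exceeds
    the threshold; near-constant columns are filtered out."""
    columns = list(zip(*rows))  # transpose rows into feature columns
    return [i for i, col in enumerate(columns) if pvariance(col) > threshold]

# Feature 0 is constant, feature 1 varies: only index 1 survives
dataset = [(1.0, 0.2), (1.0, 3.4), (1.0, 1.8), (1.0, 2.6)]
kept = select_by_variance(dataset)
```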

Association rule learning

Association rule learning is an unsupervised learning technique used to discover relationships between items in large datasets, particularly transaction data, by surfacing hidden patterns and associations between items.

Source: Saul Dobilas

This unique approach to dataset exploration makes association rule learning especially suitable for applications like market basket analysis, continuous production, and web mining.

Market basket analysis typically involves analyzing customer buying habits to find relations between frequently purchased items. This way, retailers are better able to increase their sales by planning their shelf area effectively and improving their selective marketing approach.

Web mining, on the other hand, involves extracting and analyzing information from the web. This includes finding associations and patterns in large datasets obtained from web pages, social media, and customer interactions on e-commerce sites. [4]

The three most common association rule learning algorithms are:

  • Apriori algorithm: The Apriori algorithm takes a bottom-up approach, first finding frequent items in a dataset and then generating association rules from them. The algorithm uses a minimum support threshold to measure how often itemsets occur and only treats those that meet the threshold as frequent.
  • FP-growth algorithm: The FP-growth algorithm takes a rather different approach to determining association rules. Unlike Apriori, it builds a compact data structure called a frequent-pattern tree (FP-tree) to represent the frequency of item sets in a dataset and mines patterns directly from it, without repeatedly generating candidates. This makes it significantly faster and more efficient when dealing with large datasets.
  • Eclat algorithm: Eclat, short for Equivalence Class Transformation, is used to determine the frequency of items within a large transaction database. The algorithm performs a depth-first search on the dataset, eliminates the items that don’t meet the frequency threshold, then generates a list of frequent items.

The Eclat algorithm is based on the principle of equivalence classes: it groups transactions that contain the same items and computes the items’ support in one step, thus avoiding the repeated database scans associated with the Apriori algorithm.
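To illustrate the Apriori idea, here is a minimal frequent-itemset pass in plain Python. It grows candidates bottom-up from frequent survivors only and leaves the rule-generation step out; the baskets and min_support value are illustrative:

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support=2):
    """Apriori-style bottom-up search: grow candidate itemsets one item
    at a time, keeping only those appearing in at least min_support
    transactions (any superset of an infrequent set is also infrequent)."""
    tx = [frozenset(t) for t in transactions]
    current = [frozenset([i]) for i in sorted({i for t in tx for i in t})]
    frequent, size = {}, 1
    while current:
        # Count support and discard candidates below the threshold
        counts = {c: sum(c <= t for t in tx) for c in current}
        survivors = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(survivors)
        # Join surviving itemsets pairwise to form the next-size candidates
        size += 1
        current = [c for c in {a | b for a, b in combinations(survivors, 2)}
                   if len(c) == size]
    return frequent

baskets = [{"bread", "milk"}, {"bread", "butter"},
           {"bread", "milk", "butter"}, {"milk"}]
freq = frequent_itemsets(baskets, min_support=2)
```

With these baskets, {bread} and {bread, milk} come out frequent, while {milk, butter} appears only once and is pruned, so the three-item set is never even counted.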


Autoencoders

Autoencoders leverage neural network architectures to analyze datasets through a series of encoding and decoding stages. These algorithms consist of two main components: an encoder and a decoder. The encoder maps the input data into a lower-dimensional representation that captures the most important information, while the decoder recreates the original input from this compressed representation.

This unique mode of operation makes autoencoders especially useful in a wide range of applications, including dimensionality reduction, denoising, generative modeling, and anomaly detection. Autoencoders can also be used as building blocks for more complex models like generative adversarial networks and variational autoencoders.
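The encode-compress-decode loop can be sketched as a toy linear autoencoder in plain Python. Everything here is a simplifying assumption (no biases, no nonlinearity, invented data); a real autoencoder would be built with a framework such as PyTorch or TensorFlow:

```python
import random

def train_autoencoder(data, epochs=500, lr=0.01, seed=1):
    """Toy linear autoencoder: 2-D input -> 1-D code -> 2-D reconstruction,
    trained with plain stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    # Small positive init keeps this toy example away from the zero saddle
    w_enc = [rng.uniform(0.1, 0.5) for _ in range(2)]  # encoder weights
    w_dec = [rng.uniform(0.1, 0.5) for _ in range(2)]  # decoder weights
    for _ in range(epochs):
        for x in data:
            h = w_enc[0] * x[0] + w_enc[1] * x[1]     # encode: 2-D -> 1 number
            recon = [w_dec[0] * h, w_dec[1] * h]      # decode: 1 number -> 2-D
            err = [recon[0] - x[0], recon[1] - x[1]]  # reconstruction error
            g = err[0] * w_dec[0] + err[1] * w_dec[1] # error projected back
            for j in range(2):
                w_dec[j] -= lr * 2 * err[j] * h
                w_enc[j] -= lr * 2 * g * x[j]
    return w_enc, w_dec

def reconstruction_error(data, w_enc, w_dec):
    """Mean squared reconstruction error over the dataset."""
    total = 0.0
    for x in data:
        h = w_enc[0] * x[0] + w_enc[1] * x[1]
        total += (w_dec[0] * h - x[0]) ** 2 + (w_dec[1] * h - x[1]) ** 2
    return total / len(data)

# Points on a 1-D line in 2-D space: one code unit suffices to rebuild them
points = [(t, 2.0 * t) for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]
w_enc, w_dec = train_autoencoder(points)
```

Because the sample points lie on a one-dimensional line, a single code unit is enough for near-perfect reconstruction; the same compress-then-rebuild principle scales up to deep, nonlinear encoders.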

Applications of unsupervised learning

Unsupervised learning techniques provide an effective exploratory way to view data, thus enabling businesses to identify patterns in large datasets. Some of the most common real-world use cases of unsupervised learning include:

Customer segmentation

Retail companies can use unsupervised learning to group customers based on their purchasing patterns and behaviors. This can help businesses better understand their customers, offer more personalized user experiences, and improve their product offerings.

Fraud detection

In 2021 alone, the US Federal Trade Commission received more than 5.88 million fraud reports, totaling $6.1 billion in losses, a 19% increase from the previous year. [5] This clearly shows the need for financial institutions to curb fraud and identity theft cases.

Banks and other financial institutions can use unsupervised learning to identify unusual spending patterns and transactions that might be indicative of fraud or other malicious activities.

Image and video analysis

When done manually, image and video analysis can be a tedious and time-consuming process that requires a lot of human resources. Unsupervised learning can help alleviate some of these labor requirements by automatically detecting objects in videos and images. This comes in handy in training specialized machine learning models used in self-driving cars, security cameras, and medical imaging.

Wrapping up

Unsupervised learning has numerous advantages over supervised learning. For starters, unsupervised learning doesn’t have any data labeling requirements, making it faster and more practical in use cases involving large datasets of unlabeled data.
There are numerous approaches to unsupervised learning, each geared towards achieving specific objectives. The approach you choose depends on the nature of your unlabeled data sets and the type of machine learning model you aim to train. See our MLOps consulting services to find out more.


[1] Current State of AI Adoption. URL: Accessed February 17, 2023
[2] Unsupervised Learning: Clustering. URL: Accessed February 17, 2023
[3] URL: Accessed February 17, 2023
[4] Association Rule Mining for Web Usage data to improve websites. URL: Accessed February 17, 2023
[5] Identity Theft Statistics. URL: Accessed February 17, 2023

