Unsupervised Learning: Discovering Patterns Without Answers#

A Different Kind of Learning#

So far, most of our machine learning tasks follow a familiar pattern:

  • Inputs (features)

  • Outputs (labels)

The model learns a mapping from input → output. This is called supervised learning. But now consider a different situation.

You are given a dataset with many data points and features—but no labels. No correct answers. No categories. Just raw data. This raises an important question:

Can we learn something meaningful just from the structure of the data itself?

Figure: Unsupervised learning groups unlabeled data into patterns based on similarity. Source: MathWorks

This is where unsupervised learning begins.

The Problem of Too Many Features#

Consider analyzing customer behavior with features like:

  • Age

  • Income

  • Purchase frequency

  • Website activity

Now scale this to:

  • 50, 100, or even 500 features

We quickly face challenges:

  • Visualization becomes impractical beyond two or three dimensions

  • Computation slows down

  • Many features are redundant or noisy

  • Relationships become hard to interpret

The data exists in a high-dimensional space, beyond our intuitive understanding. So we ask:

Can we simplify the data while preserving its important information?

Part A: Dimensionality Reduction#

Dimensionality reduction addresses this challenge: it reduces the number of features while preserving the most important structure.

Think of it like:

  • Summarizing a long story

  • Compressing an image

Less data, but the same essential meaning.

One of the most powerful techniques for this is Principal Component Analysis (PCA).
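
As a quick preview of what this looks like in practice, here is a minimal sketch using scikit-learn's PCA on synthetic data (the method itself is explained properly later). The sample size, the 50-feature setup, and the choice of 2 components are illustrative assumptions, not values from this section.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# 1,000 hypothetical "customers" described by 50 features, many of which
# are noisy mixtures of a few underlying behaviors (hidden structure).
latent = rng.normal(size=(1000, 3))          # 3 underlying factors
mixing = rng.normal(size=(3, 50))            # spread the factors across 50 features
X = latent @ mixing + 0.1 * rng.normal(size=(1000, 50))

# Standardize, then project onto the top 2 principal components.
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)

print(X_2d.shape)                            # (1000, 2)
print(pca.explained_variance_ratio_)         # share of variance each component keeps
```

The explained_variance_ratio_ values indicate how much of the original variation the two retained components capture: far fewer numbers, most of the structure.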

The Curse of Dimensionality#

Before PCA, we must understand why high dimensions are problematic.


Figure: Hughes Phenomenon: Adding features helps initially, but too many features with limited data reduces performance. Source: Medium

As dimensions increase:

1. Data Becomes Sparse#

Points spread far apart, making the space mostly empty.
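
One way to see this is a small numerical sketch: keep the number of points fixed and watch the average nearest-neighbor distance grow as dimensions are added. The point count and dimensions below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

for d in [2, 10, 50, 200]:
    X = rng.uniform(size=(200, d))                   # 200 points in the d-dimensional unit cube
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)                  # ignore each point's distance to itself
    print(f"d={d:3d}  mean nearest-neighbor distance = {dists.min(axis=1).mean():.3f}")
```

With the same 200 points, each point's closest neighbor drifts farther and farther away as dimensions are added: the space is mostly empty.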

2. Distance Becomes Less Meaningful#

  • Distances to the nearest and farthest points become nearly equal

  • It becomes hard to tell similar points from dissimilar ones (see the sketch below)

Figure: Distances lose meaning as dimensionality increases. Source: Medium
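
A short experiment (a sketch with arbitrary point counts and dimensions, not a definitive benchmark) makes this concrete: as the dimension grows, the farthest point from a query is barely farther than the nearest one.

```python
import numpy as np

rng = np.random.default_rng(0)

for d in [2, 10, 100, 1000]:
    X = rng.uniform(size=(500, d))           # 500 random points in the unit cube
    query = rng.uniform(size=d)              # one query point
    dists = np.linalg.norm(X - query, axis=1)
    ratio = dists.max() / dists.min()        # farthest / nearest
    print(f"d={d:4d}  nearest={dists.min():.2f}  farthest={dists.max():.2f}  ratio={ratio:.2f}")
```

The max/min distance ratio shrinks toward 1, which is why distance-based notions of "similar" break down in high dimensions.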

3. More Data is Required#

High dimensions require exponentially more data to learn effectively.
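
A back-of-the-envelope sketch shows why: if each feature were discretized into just 10 bins (an illustrative assumption), covering every combination of bins would require 10^d cells, so the data needed to populate the space grows exponentially with the number of features d.

```python
# Illustrative only: if each feature is split into 10 bins, the number of
# cells that data must populate grows exponentially with the feature count.
for d in [1, 2, 3, 5, 10, 20]:
    cells = 10 ** d
    print(f"{d:2d} features -> {cells:,} cells to cover")
```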

4. Noise and Redundancy Increase#

  • Irrelevant features

  • Correlated features

  • Added noise

Intuition#

  • 2D → easy to understand

  • 3D → harder

  • 100D → nearly impossible to reason about

High-dimensional space behaves very differently from what we expect.

Why This Matters#

Because of these effects:

  • Model performance can degrade

  • Computation becomes inefficient

  • Interpretation becomes difficult

This is why we use dimensionality reduction:

Reduce complexity while preserving structure.

And one of the most important tools for this is PCA.