# 3. Feature Selection

Feature selection is the process of choosing the most relevant features for a machine learning model and removing irrelevant or redundant variables.

Reducing unnecessary features can help:

  • improve model performance

  • reduce overfitting

  • decrease training time

  • make models easier to interpret

## Example

Original features:

| Size | Bedrooms | Zip Code | Price |

If Zip Code does not contribute to predicting house price, it can be removed without hurting model accuracy.

Selected features:

| Size | Bedrooms | Price |
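The selection step above can be sketched in pandas. This is a minimal illustration with made-up housing values; the column names and numbers are hypothetical, and in practice you would base the decision on a statistical test or model evaluation rather than dropping a column by hand.

```python
import pandas as pd

# Hypothetical housing data (illustrative values only)
houses = pd.DataFrame({
    "Size": [1400, 2100, 1100],
    "Bedrooms": [3, 4, 2],
    "Zip Code": [90210, 10001, 60601],
    "Price": [350_000, 620_000, 210_000],
})

# Remove the feature judged irrelevant to the target
selected = houses.drop(columns=["Zip Code"])
print(list(selected.columns))
```

`DataFrame.drop(columns=...)` returns a new frame, leaving the original data intact, which makes it easy to compare models trained with and without the feature.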

## Common Approaches

| Method | Description |
| --- | --- |
| Filter Methods | Use statistical measures such as correlation to select features |
| Wrapper Methods | Evaluate different feature combinations using a model |
| Embedded Methods | Feature selection occurs during model training (e.g., Lasso, decision trees) |
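Two of the approaches in the table can be sketched with scikit-learn on synthetic data. This is a minimal example, not a recipe: the dataset is generated, the choices of `k=2` and `alpha=1.0` are illustrative assumptions, and `f_regression` stands in for the broader family of filter statistics.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import Lasso

# Synthetic regression data: 5 features, only 2 of which carry signal
X, y = make_regression(n_samples=200, n_features=5, n_informative=2,
                       noise=0.1, random_state=0)

# Filter method: score each feature with a univariate F-test, keep the top 2
selector = SelectKBest(score_func=f_regression, k=2).fit(X, y)
print("filter method keeps features:", np.flatnonzero(selector.get_support()))

# Embedded method: Lasso's L1 penalty shrinks irrelevant coefficients to zero
# during training, so selection happens as a side effect of fitting
lasso = Lasso(alpha=1.0).fit(X, y)
print("Lasso nonzero coefficients:", np.flatnonzero(lasso.coef_ != 0))
```

A wrapper method would instead loop over candidate feature subsets, fitting and scoring a model on each (e.g., scikit-learn's `RFE`), which is more expensive but accounts for interactions between features.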