Types of Regression

As we move from simple problems to more complex ones, we use different types of regression models depending on the nature of the data and the shape of the relationship between the inputs and the output.

1. Simple Linear Regression#

The simplest form of regression is linear regression with one variable: we model the relationship between a single feature and the target using a straight line. Example: predicting salary from years of experience.

One input variable
Assumes a linear relationship
Easy to interpret

The model is written as:

\[y = mx + b\]

Where:

\(y\) is the predicted value
\(x\) is the input feature
\(m\) is the slope (how much \(y\) changes with \(x\))
\(b\) is the intercept (value of \(y\) when \(x=0\))

Figure: A Simple View of Linear Regression. Source: Medium.com

Intuition#

Think of this as fitting the “best possible straight line” through your data points. The model tries to capture the overall trend in the data.

For example:

If house size increases, price usually increases → positive slope
If study hours increase, exam scores usually increase → positive slope

Key Insight: The model does not try to pass through every point. Instead, it minimizes the overall error across all points, usually measured as the sum of squared differences between the predicted and actual values.
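To make the "minimize the overall error" idea concrete, here is a minimal sketch that fits \(y = mx + b\) with the standard closed-form least-squares formulas. The data values are hypothetical (made-up years of experience and salaries), and NumPy is an assumption of this example rather than something the text prescribes.

```python
import numpy as np

# Hypothetical data: years of experience (x) and salary in thousands (y).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([35.0, 42.0, 48.0, 57.0, 63.0])

# Closed-form least-squares estimates for the line y = m*x + b.
x_mean, y_mean = x.mean(), y.mean()
m = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
b = y_mean - m * x_mean

# Residuals are the vertical gaps between each point and the fitted line.
residuals = y - (m * x + b)
sse = np.sum(residuals ** 2)  # the sum of squared errors the fit minimizes

print(f"slope m = {m:.2f}, intercept b = {b:.2f}, SSE = {sse:.2f}")
```

On this toy data the fit comes out to \(m = 7.1\) and \(b = 27.7\). The residuals are not all zero, which shows that the line balances the error across all points rather than passing through each one.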

2. Multiple Linear Regression#

In reality, most outcomes depend on multiple factors, and a single feature is rarely sufficient. Multiple linear regression extends the idea of a straight line to multiple dimensions.

Multiple input variables
Each feature contributes to the prediction
Helps capture more realistic scenarios

Example: The price of a house depends on:

Size
Number of bedrooms
Location
Age of the property

This leads to multiple linear regression, where we use several features:

\[y = b_0 + b_1x_1 + b_2x_2 + \dots + b_nx_n\]

Where:

\(n\) is the number of features
\(b_0\) is the intercept
\(b_1, b_2, \dots, b_n\) are coefficients
Each \(x_i\) is a feature

Each feature contributes additively to the prediction: its term \(b_i x_i\) is simply added to the others.

Interpretation#

One of the most powerful aspects of multiple regression is interpretation:

\(b_1\): the change in \(y\) for a one-unit increase in \(x_1\), keeping all other variables constant
\(b_2\): the same for \(x_2\), and so on

This allows us to answer questions like:

“How much does price increase per extra bedroom, holding size constant?”
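The "holding other variables constant" reading of a coefficient can be checked numerically. The sketch below is illustrative only: the house data are hypothetical, the prices are generated from a known rule so the recovered coefficients can be verified, and the fit uses NumPy's `np.linalg.lstsq` as one possible solver.

```python
import numpy as np

# Hypothetical features: [size in square meters, number of bedrooms].
X = np.array([
    [50.0, 1.0],
    [70.0, 2.0],
    [80.0, 2.0],
    [100.0, 3.0],
    [120.0, 3.0],
    [150.0, 4.0],
])
# Prices (in thousands) generated from a known rule, an assumption of this sketch:
# price = 20 + 2.0 * size + 15 * bedrooms
y = 20.0 + 2.0 * X[:, 0] + 15.0 * X[:, 1]

# Prepend a column of ones so the intercept b_0 is estimated too.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
b0, b1, b2 = coef

# Two houses of the same size that differ by exactly one bedroom.
base = np.array([1.0, 100.0, 2.0])
extra_bedroom = np.array([1.0, 100.0, 3.0])
delta = extra_bedroom @ coef - base @ coef

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, b2 = {b2:.2f}")
print(f"Price change for one extra bedroom at fixed size: {delta:.2f}")
```

Because size is held fixed, the difference between the two predictions equals \(b_2\), which is exactly what the coefficient is meant to express.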

Important Note#

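Features should not be highly correlated with each other (a problem known as multicollinearity), because the model then cannot separate their individual effects and the coefficient estimates become unstable and hard to interpret.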
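One way to spot correlated features before fitting is to inspect the correlation matrix and the variance inflation factor (VIF), which is \(1 / (1 - R^2)\) when one feature is regressed on the others. The sketch below is a rough illustration on synthetic data; the feature names, the random data, and the helper `vif` are all hypothetical and not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic features: rooms is nearly a rescaled copy of size, age is unrelated.
size = rng.normal(100.0, 20.0, n)
rooms = size / 25.0 + rng.normal(0.0, 0.1, n)
age = rng.normal(30.0, 10.0, n)
X = np.column_stack([size, rooms, age])

# Pairwise correlations between the feature columns.
corr = np.corrcoef(X, rowvar=False)

def vif(X, j):
    """VIF of column j: 1 / (1 - R^2) from regressing it on the other columns."""
    target = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(X)), others])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    resid = target - A @ coef
    r2 = 1.0 - resid.var() / target.var()
    return 1.0 / (1.0 - r2)

vifs = [vif(X, j) for j in range(X.shape[1])]

print("corr(size, rooms) =", round(corr[0, 1], 3))
print("VIF [size, rooms, age] =", [round(v, 1) for v in vifs])
```

A common rule of thumb treats a VIF above roughly 5 to 10 as a warning sign. Here the two related features should score far above that, while the unrelated one stays close to 1.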

3. Non-Linear Relationships and Polynomial Regression#

Not all relationships are linear. Sometimes the data follows a curve. Polynomial regression allows the model to fit curves instead of straight lines.

Captures non-linear patterns
Still based on linear modeling techniques
Risk of overfitting if degree is too high

Example:

Growth may accelerate over time
Costs may increase at an increasing rate

To handle this, we use polynomial regression:

\[ y = a + bx + cx^2 + dx^3 + \dots\]

Figure: A Simple View of Polynomial Regression. Source: Statisticalaid.com

Even though the curve looks non-linear, the model is still linear in parameters, which allows us to train it using the same techniques as linear regression.
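"Linear in parameters" means the powers of \(x\) can be treated as ordinary feature columns, after which the fit is a plain linear least-squares problem. The following sketch illustrates this with NumPy; the quadratic used to generate the data is an assumption chosen so the recovered coefficients can be checked.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
# Hypothetical curved data from a known quadratic: y = 1 + 2x + 0.5x^2
y = 1.0 + 2.0 * x + 0.5 * x ** 2

# Design matrix with columns [1, x, x^2]: each power of x is just another feature.
A = np.vander(x, N=3, increasing=True)

# Ordinary linear least squares on the expanded features.
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
a, b, c = coef

print(f"a = {a:.2f}, b = {b:.2f}, c = {c:.2f}")
```

The solver never sees a curve, only three columns of numbers, yet the fitted coefficients describe one.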

Tradeoff#

Degree too low → underfitting (the curve is too rigid to follow the trend)
Degree too high → overfitting (the curve starts to follow the noise)

Choosing the right degree is critical.
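A common way to choose the degree is to compare the error on the training data with the error on data held out from fitting. The sketch below is only an illustration: the cubic-plus-noise data, the every-third-point split, and the candidate degrees 1, 3, and 10 are all assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data: a cubic trend plus noise.
x = np.linspace(-3.0, 3.0, 60)
y = 0.5 * x ** 3 - x + rng.normal(0.0, 1.0, x.size)

# Hold out every third point for testing; fit on the rest.
test_mask = np.arange(x.size) % 3 == 0
x_train, y_train = x[~test_mask], y[~test_mask]
x_test, y_test = x[test_mask], y[test_mask]

def mse(degree):
    """Return (train MSE, test MSE) for a polynomial fit of the given degree."""
    coefs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    return train_err, test_err

errors = {degree: mse(degree) for degree in (1, 3, 10)}
for degree, (train_err, test_err) in errors.items():
    print(f"degree {degree:2d}: train MSE = {train_err:.2f}, test MSE = {test_err:.2f}")
```

Training error can only go down as the degree grows, so it cannot guide the choice by itself. The held-out error is the more useful signal: it is typically high for a degree that is too low, lowest near the true complexity, and tends to rise again once the model starts fitting noise.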

Python Implementation#

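In practice, regression is usually implemented with a library such as scikit-learn. The examples below assume that a feature matrix `X` and a target vector `y` have already been loaded.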

### Basic Linear Regression

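```python
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# X (feature matrix) and y (target vector) are assumed to be defined already

# Split the data: 80% for training, 20% held out for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Train the model on the training portion
model = LinearRegression()
model.fit(X_train, y_train)

# Predict on the held-out test set
predictions = model.predict(X_test)
```

### Basic Polynomial Regression

```python
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Expand X into polynomial terms up to degree 2
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)

# Fit an ordinary linear model on the expanded features
model = LinearRegression()
model.fit(X_poly, y)
```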