19.4 Training CNNs
The training process for CNNs is similar to that of traditional feedforward neural networks (FFNNs) and involves several key steps:

1. Feedforward pass: the input data passes through convolutional, activation, pooling, and fully connected layers to produce a prediction.
2. Loss computation: a loss function (such as cross-entropy for classification) measures the difference between the predicted output and the true label.
3. Backpropagation: gradients of the loss with respect to all learnable parameters (filters, biases, weights) are calculated using the chain rule.
4. Parameter update: an optimizer such as Stochastic Gradient Descent (SGD) or Adam adjusts the parameters to minimize the loss (see the update rule below).
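For plain SGD with learning rate $\eta$, step 4 amounts to the update

$$
\theta \leftarrow \theta - \eta \, \nabla_{\theta} \mathcal{L}(\theta)
$$

applied to every learnable parameter $\theta$. Adam follows the same pattern but rescales each gradient by running estimates of its first and second moments.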
To improve training and prevent overfitting, CNNs often include the following techniques (sketched in code after this list):

- Dropout: randomly disables a fraction of neurons during training to encourage the network to learn more robust features.
- Batch Normalization: normalizes layer inputs within each mini-batch to speed up training and improve stability.
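Both techniques are available as standard PyTorch layers. As a minimal sketch (the channel counts and dropout rate here are chosen only for illustration), they are typically inserted between a convolution and the layers that follow:

```python
import torch
import torch.nn as nn

# Illustrative block: BatchNorm2d normalizes each of the 32 feature maps
# over the mini-batch, and Dropout randomly zeroes 25% of activations
# during training (both behave differently in eval mode).
block = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3),   # 1 input channel -> 32 feature maps
    nn.BatchNorm2d(32),                # per-mini-batch normalization
    nn.ReLU(),
    nn.Dropout(0.25),                  # disable 25% of activations at random
)

x = torch.randn(8, 1, 28, 28)          # dummy mini-batch of 8 grayscale images
print(block(x).shape)                  # torch.Size([8, 32, 26, 26])
```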
Through many iterations of this process, the filters in the convolutional layers learn to extract increasingly abstract and useful features from the input data, enabling the model to make accurate predictions.
A simple CNN code snippet in PyTorch, which trains for one epoch on MNIST, is given below:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Define the CNN model
class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3)   # 28x28 input -> 26x26 feature maps
        self.pool = nn.MaxPool2d(2, 2)                 # 26x26 -> 13x13
        self.dropout = nn.Dropout(0.25)                # dropout with 25% probability
        self.fc1 = nn.Linear(32 * 13 * 13, 10)         # 10 output classes

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.dropout(x)                            # apply dropout during training
        x = x.view(-1, 32 * 13 * 13)                   # flatten for the fully connected layer
        x = self.fc1(x)
        return x

# Device setup
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Transformations: ToTensor and Normalize (MNIST mean and std)
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

# Load MNIST dataset (auto download if not present)
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

# Data loaders
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=1000, shuffle=False)

# Initialize model, loss, optimizer
model = SimpleCNN().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop example (1 epoch)
model.train()
for batch_idx, (data, target) in enumerate(train_loader):
    data, target = data.to(device), target.to(device)
    optimizer.zero_grad()              # clear gradients from the previous step
    output = model(data)               # feedforward pass
    loss = criterion(output, target)   # loss computation
    loss.backward()                    # backpropagation
    optimizer.step()                   # parameter update
    if batch_idx % 100 == 0:
        print(f'Train Batch: {batch_idx}, Loss: {loss.item():.4f}')
```
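After training, the model can be scored on the held-out test set. Below is a minimal evaluation sketch that reuses the `model`, `test_loader`, `test_dataset`, and `device` defined above:

```python
# Evaluation sketch (continues from the snippet above)
model.eval()                                   # switch dropout to eval mode
correct = 0
with torch.no_grad():                          # no gradients needed at test time
    for data, target in test_loader:
        data, target = data.to(device), target.to(device)
        pred = model(data).argmax(dim=1)       # class with the highest score
        correct += (pred == target).sum().item()
print(f'Test accuracy: {correct / len(test_dataset):.4f}')
```

Note the call to `model.eval()`: dropout is only applied during training, so forgetting to switch modes would make test-time predictions needlessly noisy.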