Using Existing Architectures
How to choose pre-built model architectures
Most of the time, we do not build models from scratch.
Instead, we use existing model architectures created by others. We choose them based on:
- the type of data (e.g. images, text, numbers)
- the task (e.g. classification, prediction, generation)
The task defines the goal. The goal defines how we measure success (loss). The data type defines which layers are useful.
That is why most people working in AI are engineers who apply and adapt existing architectures, rather than researchers who invent new ones.
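To make that mapping concrete, here is how common tasks translate into PyTorch loss functions, and how data types suggest layers. The specific choices below are illustrative, not exhaustive:

import torch.nn as nn

# Task -> loss (how we measure success)
binary_loss = nn.BCELoss()               # binary classification (with a Sigmoid output)
multiclass_loss = nn.CrossEntropyLoss()  # multi-class classification (raw logits)
regression_loss = nn.MSELoss()           # predicting continuous numbers

# Data type -> useful layers
image_layer = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3)  # images
text_layer = nn.Embedding(num_embeddings=10_000, embedding_dim=256)     # text tokens
tabular_layer = nn.Linear(in_features=20, out_features=64)              # plain numbers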
Example: Using a Pretrained Model
The easiest way is to use a model that is already trained:
import torch
import torch.nn as nn
from torchvision import models

class ResNetClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        # Load a ResNet-18 pretrained on ImageNet
        # (on torchvision older than 0.13, this was models.resnet18(pretrained=True))
        self.model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        # Replace the last fully-connected layer with a new head for our task
        self.model.fc = nn.Sequential(
            nn.Linear(self.model.fc.in_features, 1),
            nn.Sigmoid()  # binary classification
        )

    def forward(self, x):
        return self.model(x)

This uses a pretrained model that someone else already trained on millions of images. We only change the last layer to fit our specific task.
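A common next step is to freeze the pretrained backbone and train only the new head, so the features learned on ImageNet stay intact. A minimal sketch of that pattern:

model = ResNetClassifier()

# Freeze every pretrained parameter ...
for param in model.model.parameters():
    param.requires_grad = False

# ... then unfreeze only the replaced head
for param in model.model.fc.parameters():
    param.requires_grad = True

# The optimizer only sees the trainable (head) parameters
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)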
Building from Scratch
If a developer wants to build a model completely from scratch, they take the mathematical ideas that researchers have already published and implement them in code. For ResNet, the key idea is the residual connection: instead of learning a full mapping H(x) directly, each block learns a residual F(x) and outputs F(x) + x, which makes very deep networks much easier to train.
Here is a ResNet-style architecture built from scratch:
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                              stride=stride, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)
        # 1x1 convolution so the shortcut matches the new shape
        self.downsample = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=1,
                      stride=stride, bias=False),
            nn.BatchNorm2d(out_channels)
        )

    def forward(self, x):
        identity = self.downsample(x)
        out = self.conv(x)
        out = self.bn(out)
        out += identity  # residual (shortcut) connection
        out = self.relu(out)
        return out

class IdentityBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3,
                              stride=1, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x  # shortcut connection: input passes through unchanged
        out = self.conv(x)
        out = self.bn(out)
        out += identity  # element-wise addition
        out = self.relu(out)
        return out

class ResNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.stage1 = nn.Sequential(
            nn.ZeroPad2d(1),
            nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        self.stage2 = nn.Sequential(
            ConvBlock(64, 128, stride=2),
            IdentityBlock(128)
        )
        self.stage3 = nn.Sequential(
            ConvBlock(128, 256, stride=2),
            IdentityBlock(256)
        )
        self.stage4 = nn.Sequential(
            ConvBlock(256, 512, stride=2),
            IdentityBlock(512)
        )
        self.stage5 = nn.Sequential(
            ConvBlock(512, 1024, stride=2),
            IdentityBlock(1024)
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d((1, 1)),
            nn.Flatten(),
            nn.Linear(1024, num_classes)
        )

    def forward(self, x):
        x = self.stage1(x)
        x = self.stage2(x)
        x = self.stage3(x)
        x = self.stage4(x)
        x = self.stage5(x)
        x = self.classifier(x)
        return x

This implements a ResNet-style architecture from scratch and is conceptually the same kind of model as the pretrained example above, just more verbose. The network is organized into stages: each ConvBlock changes the spatial resolution and channel count (using a 1x1 convolution so the shortcut still matches), while each IdentityBlock keeps them fixed.
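Before training a from-scratch model, it is worth a quick sanity check that tensors flow through it with the expected shapes. A dummy forward pass (the batch size and image size here are arbitrary):

model = ResNet(num_classes=10)
dummy = torch.randn(4, 3, 224, 224)  # batch of 4 RGB images, 224x224 pixels
out = model(dummy)
print(out.shape)  # torch.Size([4, 10]) -- one score per class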
Less than 1% of all AI developers create their own architectures. So even when someone builds a model from scratch, they mostly rebuild what already exists and change a couple of layers so it suits their purpose.
It is like a window installer who uses an existing drill to fit windows. It would be absurd to build the drill yourself just to install windows; installing windows is already hard enough to master on its own. For the same reason, AI has many subfields, each with its own justification.
Building from scratch requires understanding the mathematical principles behind the architecture. Using a pretrained model is much faster and easier, and it starts from features the model has already learned from millions of images.
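To make the trade-off concrete, here is a minimal fine-tuning loop for the ResNetClassifier defined above. The train_loader is a hypothetical DataLoader assumed to yield batches of (images, 0/1 labels):

model = ResNetClassifier()
criterion = nn.BCELoss()  # matches the Sigmoid head of the model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for epoch in range(3):  # a few epochs is often enough when fine-tuning
    for images, labels in train_loader:  # hypothetical DataLoader
        optimizer.zero_grad()
        preds = model(images).squeeze(1)          # (batch, 1) -> (batch,)
        loss = criterion(preds, labels.float())   # BCELoss expects float targets
        loss.backward()
        optimizer.step()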