Core Concepts & Tensor Operations
Creating Tensors:
torch.tensor(data): Create a tensor by copying data (list, tuple, NumPy array, or scalar); the dtype is inferred unless specified.
torch.zeros(size): Create a tensor filled with zeros.
torch.ones(size): Create a tensor filled with ones.
torch.rand(size): Create a tensor with random values drawn uniformly from [0, 1).
torch.randn(size): Create a tensor with random values drawn from the standard normal distribution (mean 0, variance 1).
torch.empty(size): Allocate a tensor without initializing it; its values are arbitrary until written.
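The constructors above can be tried side by side. This is a minimal sketch; the variable names and the seed are illustrative choices, and the dtype comments assume PyTorch's default dtype has not been changed.

```python
import torch

# From Python data: dtype is inferred (int64 for ints, float32 for floats).
a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([1.0, 2.0, 3.0])

# Filled tensors accept the shape as separate ints or as a tuple.
zeros = torch.zeros(2, 3)
ones = torch.ones((2, 3), dtype=torch.float64)

# Random tensors: uniform on [0, 1) and standard normal.
torch.manual_seed(0)  # illustrative seed, makes the draws reproducible
u = torch.rand(2, 3)
n = torch.randn(2, 3)

# empty() only allocates memory, so write to it before reading.
e = torch.empty(2, 3)
e.fill_(7.0)

print(a.dtype, b.dtype, ones.dtype)
print(zeros.shape, n.shape)
```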
|
Tensor Attributes:
.shape: Returns the shape of the tensor.
.dtype: Returns the data type of the tensor.
.device: Returns the device on which the tensor is stored (CPU or GPU).
|
Moving Tensors:
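.to(device): Returns the tensor on the specified device (e.g., torch.device('cuda')); a copy is made only if the tensor is not already there.
.cpu(): Returns a copy of the tensor on the CPU (or the tensor itself if it is already on the CPU).
.cuda(): Returns a copy of the tensor on the current CUDA device (requires a CUDA-enabled build and a GPU).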
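A common pattern is to pick the device once and fall back to the CPU when no GPU is available. The sketch below assumes nothing about the hardware; the names `t_dev`, `t_half`, and `back` are illustrative.

```python
import torch

# Use the GPU when one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

t = torch.ones(2, 3)
print(t.shape, t.dtype, t.device)

t_dev = t.to(device)          # tensor on the chosen device
t_half = t.to(torch.float16)  # .to() can also convert the dtype
back = t_dev.cpu()            # always safe, even if t_dev is already on the CPU
```

Tensors that take part in the same operation must live on the same device, so inputs and model parameters are usually moved together.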
|
Arithmetic:
torch.add(a, b) or a + b: Element-wise addition.
torch.sub(a, b) or a - b: Element-wise subtraction.
torch.mul(a, b) or a * b: Element-wise multiplication.
torch.div(a, b) or a / b: Element-wise true division (pass rounding_mode='floor' or 'trunc' to torch.div for integer-style division).
torch.pow(a, b) or a ** b: Element-wise exponentiation.
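The operators and their functional forms give the same results, and operands of different shapes are broadcast. A small sketch with made-up values:

```python
import torch

a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([4.0, 5.0, 6.0])

total = a + b            # tensor([5., 7., 9.])
same = torch.add(a, b)   # identical to a + b
prod = a * b             # tensor([ 4., 10., 18.])
squared = a ** 2         # tensor([1., 4., 9.])

# "/" is true division, even for integer tensors.
ints = torch.tensor([7, 8])
quot = ints / torch.tensor([2, 2])                                   # tensor([3.5, 4.0])
floor = torch.div(ints, torch.tensor([2, 2]), rounding_mode="floor")  # tensor([3, 4])

# Broadcasting: the 1-D tensor `a` is added to every row of the 2x3 tensor.
m = torch.ones(2, 3) + a
```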
|
Matrix Operations:
torch.matmul(a, b) or a @ b: Matrix multiplication.
torch.transpose(a, dim0, dim1): Swap dimensions dim0 and dim1 (for a 2-D tensor, a.t() is equivalent).
torch.inverse(a): Inverse of a square, invertible matrix (alias of torch.linalg.inv).
torch.det(a): Determinant of a matrix.
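A quick sanity check of these operations on small matrices whose results can be verified by hand. The matrices are arbitrary examples.

```python
import torch

A = torch.tensor([[2.0, 1.0], [1.0, 3.0]])
B = torch.tensor([[1.0, 0.0], [0.0, 2.0]])

C = A @ B                       # tensor([[2., 2.], [1., 6.]])
At = torch.transpose(A, 0, 1)   # A is symmetric, so At equals A
det = torch.det(A)              # 2*3 - 1*1 = 5
A_inv = torch.inverse(A)
identity = A @ A_inv            # approximately the 2x2 identity

# matmul also handles batches: the leading dimension is treated as a batch.
batch = torch.randn(4, 2, 3) @ torch.randn(4, 3, 5)
print(batch.shape)              # torch.Size([4, 2, 5])
```

When the goal is to solve a linear system, torch.linalg.solve is generally preferable to forming the inverse explicitly.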
|
Slicing and Indexing:
a[index]: Accessing a single element.
a[start:end]: Slicing a tensor.
a[mask]: Indexing with a boolean mask.
torch.gather(input, dim, index): Gathers values along an axis specified by dim.
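The indexing forms above, applied to one small tensor. For gather along dim=1 the rule is out[i][j] = a[i][index[i][j]]; the index values here are chosen only for illustration.

```python
import torch

a = torch.arange(12).reshape(3, 4)
# tensor([[ 0,  1,  2,  3],
#         [ 4,  5,  6,  7],
#         [ 8,  9, 10, 11]])

elem = a[1, 2]          # 0-dim tensor holding 6
row = a[0]              # tensor([0, 1, 2, 3])
block = a[0:2, 1:3]     # tensor([[1, 2], [5, 6]])
evens = a[a % 2 == 0]   # boolean mask, result is always 1-D

# Pick one column per row: column 0 from row 0, column 3 from row 1, column 1 from row 2.
index = torch.tensor([[0], [3], [1]])
picked = torch.gather(a, 1, index)   # tensor([[0], [7], [9]])
```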
|
Reshaping:
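a.view(new_shape): Returns a tensor with a new shape that shares the same underlying data; requires a compatible memory layout (e.g., a contiguous tensor).
a.reshape(new_shape): Returns a tensor with the same data and number of elements in the specified shape; returns a view when possible and a copy otherwise.
a.squeeze(): Removes all dimensions of size one (a.squeeze(dim) removes only the given dimension if its size is one).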
a.unsqueeze(dim): Adds a dimension of size one at the specified position.
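The difference between view and reshape shows up with non-contiguous tensors. A minimal sketch:

```python
import torch

a = torch.arange(6)
v = a.view(2, 3)
v[0, 0] = 100            # a view shares storage, so a[0] is now 100 as well

t = v.t()                # transposed: shape (3, 2), no longer contiguous
# t.view(6) would raise a RuntimeError here; reshape falls back to a copy.
r = t.reshape(6)         # tensor([100, 3, 1, 4, 2, 5])

s = torch.zeros(1, 3, 1).squeeze()   # shape (3,)
u = torch.zeros(3).unsqueeze(0)      # shape (1, 3)
```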
|
Automatic Differentiation:
requires_grad=True: Enable gradient tracking for a tensor.
.backward(): Compute gradients of a tensor with respect to the graph leaves.
.grad: Access the computed gradients.
with torch.no_grad():: Disable gradient calculation within a block.
|
Example:
import torch
x = torch.randn(3, requires_grad=True)
y = x + 2
z = y * y * 2
z = z.mean()  # reduce to a scalar so backward() needs no arguments
z.backward()
print(x.grad) # Gradients of z w.r.t. x
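In the example above z = mean(2 * (x + 2)^2) over three elements, so the gradient should be 4 * (x + 2) / 3. The sketch below checks that by hand and also illustrates two behaviors worth knowing: torch.no_grad() produces untracked results, and repeated backward() calls accumulate into .grad.

```python
import torch

x = torch.randn(3, requires_grad=True)
z = ((x + 2) ** 2 * 2).mean()
z.backward()

# Analytic gradient of mean(2 * (x + 2)^2) over 3 elements.
expected = 4 * (x.detach() + 2) / 3
print(torch.allclose(x.grad, expected))

# Inside no_grad, results are not tracked by autograd.
with torch.no_grad():
    w = x * 2
print(w.requires_grad)

# Gradients accumulate: a second backward() adds to x.grad.
z2 = (x * 3).sum()   # gradient is 3 for every element
z2.backward()
accumulated = x.grad.clone()   # expected + 3
```

This accumulation is why training code resets gradients before each backward pass.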
|
Model Building & Training
Using torch.nn.Module:
Models are defined as classes that inherit from torch.nn.Module. The forward pass is defined in the forward method.
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 6, 3)   # 1 input channel, 6 output channels, 3x3 kernel
        self.pool = nn.MaxPool2d(2, 2)    # 2x2 max pooling with stride 2
        self.conv2 = nn.Conv2d(6, 16, 3)
        # 16 * 5 * 5 is the flattened conv output size for 1x28x28 inputs
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)  # flatten to (batch, 400)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
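The in_features of fc1 must match the flattened output of the last pooling layer. The source does not state an input size; 16 * 5 * 5 is consistent with single-channel 28x28 images (an assumption made here), which the sketch below confirms by tracing shapes through a copy of the network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 6, 3)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 3)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)


net = Net()
x = torch.randn(4, 1, 28, 28)   # assumed input: batch of 4 grayscale 28x28 images

# 28 -> conv 3x3 -> 26 -> pool -> 13 -> conv 3x3 -> 11 -> pool -> 5
h = net.pool(F.relu(net.conv1(x)))
h = net.pool(F.relu(net.conv2(h)))
print(h.shape)                  # torch.Size([4, 16, 5, 5])

out = net(x)
print(out.shape)                # torch.Size([4, 10])
```

For a different input resolution, the same trace gives the value to use in fc1.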
|
Loss Functions:
torch.nn.CrossEntropyLoss(): Commonly used for multi-class classification; expects raw logits and class-index targets.
torch.nn.MSELoss(): Mean Squared Error loss, used for regression.
torch.nn.BCELoss(): Binary Cross Entropy loss, used for binary classification; expects probabilities in [0, 1] (e.g., after a sigmoid).
torch.nn.L1Loss(): L1 Loss (Mean Absolute Error).
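Each loss expects a particular input format. The sketch below uses made-up values to show them, and also uses nn.NLLLoss and nn.BCEWithLogitsLoss (not listed above) to illustrate what the listed losses compute internally.

```python
import torch
import torch.nn as nn

# Multi-class: raw logits of shape (batch, classes) and integer class targets.
logits = torch.tensor([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
target = torch.tensor([0, 2])
ce = nn.CrossEntropyLoss()(logits, target)
# Equivalent to log-softmax followed by negative log-likelihood.
manual = nn.NLLLoss()(torch.log_softmax(logits, dim=1), target)

# Regression: predictions and targets of the same shape.
pred = torch.tensor([2.5, 0.0])
true = torch.tensor([3.0, -1.0])
mse = nn.MSELoss()(pred, true)   # (0.25 + 1.0) / 2 = 0.625
mae = nn.L1Loss()(pred, true)    # (0.5 + 1.0) / 2 = 0.75

# Binary: BCELoss takes probabilities, BCEWithLogitsLoss takes raw scores.
scores = torch.tensor([0.0, 2.0])
labels = torch.tensor([0.0, 1.0])
bce = nn.BCELoss()(torch.sigmoid(scores), labels)
bce_logits = nn.BCEWithLogitsLoss()(scores, labels)
```

BCEWithLogitsLoss is usually the numerically safer choice when the model outputs raw scores.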
|
Example:
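import torch.nn as nn
# Assumes `model`, `inputs`, and `target` are already defined.
loss_fn = nn.CrossEntropyLoss()
output = model(inputs)          # raw logits, shape (batch, num_classes)
loss = loss_fn(output, target)  # target holds class indices, shape (batch,)
loss.backward()                 # populate .grad on the model parameters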
|
torch.optim:
PyTorch provides various optimization algorithms.
torch.optim.SGD(params, lr, momentum=0): Stochastic Gradient Descent.
torch.optim.Adam(params, lr, betas=(0.9, 0.999), eps=1e-08): Adam optimizer.
torch.optim.RMSprop(params, lr, alpha=0.99, eps=1e-08): RMSprop optimizer.
|
Example:
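import torch.optim as optim
# Assumes `model`, `loss_fn`, `inputs`, and `target` are already defined.
optimizer = optim.Adam(model.parameters(), lr=0.001)
optimizer.zero_grad()           # clear gradients left over from the previous step
output = model(inputs)
loss = loss_fn(output, target)
loss.backward()                 # compute gradients
optimizer.step()                # update the parameters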
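To see what a single optimizer step does, plain SGD on a hand-checkable loss is the easiest case: each parameter moves by -lr * grad. The parameter values and learning rate below are arbitrary.

```python
import torch

w = torch.tensor([1.0, -2.0], requires_grad=True)
opt = torch.optim.SGD([w], lr=0.1)

loss = (w ** 2).sum()   # gradient is 2 * w = [2, -4]
opt.zero_grad()
loss.backward()
opt.step()              # w <- w - 0.1 * grad

print(w)                # values are now [0.8, -1.6]
```

Adam and RMSprop follow the same zero_grad / backward / step pattern but rescale the update using running statistics of the gradients.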
|
Typical Training Loop:
for epoch in range(num_epochs):
    for i, (inputs, labels) in enumerate(train_loader):
        # Move data to device
        inputs = inputs.to(device)
        labels = labels.to(device)
        # Zero the parameter gradients
        optimizer.zero_grad()
        # Forward pass
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        # Backward and optimize
        loss.backward()
        optimizer.step()
        # Print statistics
        if (i+1) % 100 == 0:
            print(f'Epoch [{epoch+1}/{num_epochs}], Step [{i+1}/{len(train_loader)}], Loss: {loss.item():.4f}')
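The loop above relies on names defined elsewhere (model, criterion, optimizer, train_loader, device, num_epochs). A self-contained sketch that fills them in with a tiny linear model and synthetic data follows; the data, model size, and hyperparameters are all illustrative assumptions.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Synthetic 2-class data: the label is 1 when the features sum to a positive number.
features = torch.randn(256, 4)
targets = (features.sum(dim=1) > 0).long()
train_loader = DataLoader(TensorDataset(features, targets), batch_size=32, shuffle=True)

model = nn.Linear(4, 2).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

num_epochs = 20
epoch_losses = []
for epoch in range(num_epochs):
    running = 0.0
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(inputs), labels)
        loss.backward()
        optimizer.step()
        running += loss.item()
    # Average loss over the batches of this epoch.
    epoch_losses.append(running / len(train_loader))

print(f"first epoch loss: {epoch_losses[0]:.4f}, last epoch loss: {epoch_losses[-1]:.4f}")
```

Tracking the per-epoch average rather than a single batch loss gives a less noisy picture of whether training is progressing.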
|
Data Loading and Preprocessing
torch.utils.data.Dataset:
Base class for all datasets in PyTorch. You can create custom datasets by inheriting from this class and overriding the __len__ and __getitem__ methods.
|
Example:
from torch.utils.data import Dataset
from PIL import Image
import os

class CustomDataset(Dataset):
    def __init__(self, root_dir, transform=None):
        self.root_dir = root_dir
        # Sort the file names so that indices are stable across runs
        self.image_paths = sorted(
            os.path.join(root_dir, file)
            for file in os.listdir(root_dir)
            if file.endswith('.png')
        )
        self.transform = transform

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        image_path = self.image_paths[idx]
        image = Image.open(image_path).convert('RGB')
        if self.transform:
            image = self.transform(image)
        label = 0 # Replace with your label loading logic
        return image, label
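The Dataset protocol itself needs no files at all. As a minimal sketch, the hypothetical SquaresDataset below keeps everything in memory and returns (i, i**2) pairs, which makes the roles of __len__ and __getitem__ easy to see.

```python
import torch
from torch.utils.data import Dataset


class SquaresDataset(Dataset):
    """Hypothetical in-memory dataset: sample i is the pair (i, i**2)."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        # Number of samples; DataLoader uses this to build index batches.
        return self.n

    def __getitem__(self, idx):
        if idx < 0 or idx >= self.n:
            raise IndexError(idx)
        x = torch.tensor([float(idx)])
        y = torch.tensor([float(idx ** 2)])
        return x, y


ds = SquaresDataset(5)
x3, y3 = ds[3]
print(len(ds), x3, y3)
```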
|
torch.utils.data.DataLoader:
Provides an iterable over the dataset, with features like batching, shuffling, and parallel data loading.
dataset: The Dataset object to load data from.
batch_size: How many samples per batch to load.
shuffle: Set to True to have the data reshuffled at every epoch.
num_workers: How many subprocesses to use for data loading.
|
Example:
from torch.utils.data import DataLoader
dataset = CustomDataset(root_dir='data', transform=transform)
dataloader = DataLoader(dataset, batch_size=32, shuffle=True, num_workers=4)
for images, labels in dataloader:
    # Process batch
    pass
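Batching behavior is easiest to see on a small in-memory dataset. The sketch below uses TensorDataset with ten samples (an illustrative choice) and shows that the last batch is smaller unless drop_last=True is set.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

data = torch.arange(10, dtype=torch.float32).unsqueeze(1)   # shape (10, 1)
labels = torch.arange(10) % 2
dataset = TensorDataset(data, labels)

# num_workers=0 loads in the main process, which is the simplest setting for debugging.
loader = DataLoader(dataset, batch_size=4, shuffle=False, num_workers=0)
batch_sizes = [xb.shape[0] for xb, yb in loader]
print(batch_sizes)           # [4, 4, 2]

# drop_last=True discards the final incomplete batch.
loader_drop = DataLoader(dataset, batch_size=4, drop_last=True)
n_batches = len(loader_drop)
print(n_batches)             # 2
```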
|
torchvision.transforms:
Provides common image transformations for preprocessing data.
transforms.ToTensor(): Convert a PIL Image or NumPy ndarray (H x W x C) to a float tensor (C x H x W); uint8 values in [0, 255] are scaled to [0.0, 1.0].
transforms.Normalize(mean, std): Normalize a tensor image channel by channel: output[c] = (input[c] - mean[c]) / std[c].
transforms.Resize(size): Resize the input image to the given size.
transforms.RandomHorizontalFlip(p=0.5): Horizontally flip the image with probability p.
transforms.Compose(transforms): Composes several transforms together.
|
Example:
from torchvision import transforms
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    # Commonly used ImageNet channel means and standard deviations
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
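Normalize subtracts a per-channel mean and divides by a per-channel standard deviation. The sketch below reproduces that arithmetic with plain torch operations on a random tensor standing in for the output of ToTensor(); it does not call torchvision itself.

```python
import torch

# Per-channel statistics reshaped to (3, 1, 1) so they broadcast over H and W.
mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

# Stand-in for a ToTensor() result: float values in [0, 1), shape (C, H, W).
img = torch.rand(3, 224, 224)

normalized = (img - mean) / std

# Undoing the normalization, e.g. before displaying an image.
restored = normalized * std + mean
print(torch.allclose(restored, img, atol=1e-6))
```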
|