HUY NGUYEN

PyTorch

October 1, 2024

Introduction

Developed by Meta’s Fundamental AI Research (FAIR) lab, PyTorch has become the most widely used deep learning framework. Its GitHub repository boasts over 82,400 stars and more than 22,000 forks, showcasing its immense popularity. Backed by a large and active community, PyTorch offers extensive support through its vibrant discussion forums, enabling users to quickly resolve issues and streamline debugging. Major tech companies, including Apple, Meta, and TikTok, leverage PyTorch for their machine learning projects. Its growing popularity is also fueled by platforms like PyTorch Lightning and Hugging Face, which simplify code organization and provide access to state-of-the-art models, making it easier than ever for users to harness its power.

So let’s dive in and discover what PyTorch has to offer!

Note that this article is based on this PyTorch tutorial. I have summarized it and added some insights of my own.

Tensors

Tensors are multi-dimensional arrays similar to NumPy's ndarrays, but with additional attributes and the ability to run on GPUs and other accelerators to speed up computation.

First, let's import the necessary libraries:

import torch
import numpy as np

Tensors can be created in various ways. For example, we can create one directly from a list or a nested list:

data = [[1, 2], [3, 4]]
data_tensor = torch.tensor(data)

From a NumPy array:

data_array = np.array(data)
data_array_tensor = torch.from_numpy(data_array)

From another tensor:

ones_tensor = torch.ones_like(data_tensor)
random_tensor = torch.rand_like(data_tensor, dtype=torch.float)

print(f"Ones tensor: \n{ones_tensor}")
print(f"Random tensor: \n{random_tensor}")
Ones tensor: 
tensor([[1, 1],
        [1, 1]])

Random tensor:
tensor([[0.2302, 0.7488],
        [0.0755, 0.3460]])

With random or constant values:

shape = (2, 4,)
random_normal_tensor = torch.randn(shape)
new_ones_tensor = ones_tensor.new_ones(shape, dtype=torch.double)
empty_tensor = torch.empty(shape)

print(f"Random normal tensor: \n{random_normal_tensor}")
print(f"New ones tensor: \n{new_ones_tensor}")
print(f"Empty tensor: \n{empty_tensor}")
Random normal tensor:
tensor([[ 0.5631, -0.0305, -1.2209, -0.8312],
        [ 0.6690, -0.6183,  0.3573, -0.4407]])

New ones tensor:
tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.]], dtype=torch.float64)
    
Empty tensor:
tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.]])
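
The usual factory functions torch.zeros, torch.ones and torch.rand follow the same pattern; a quick sketch (shapes chosen arbitrarily):

zeros_tensor = torch.zeros(2, 3)     # all zeros
filled_tensor = torch.ones(2, 3)     # all ones
uniform_tensor = torch.rand(2, 3)    # uniform samples in [0, 1)

print(f"Zeros tensor: \n{zeros_tensor}")
print(f"Uniform tensor: \n{uniform_tensor}")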

Attributes

tensor = torch.rand(4, 7)

print(f"Shape of tensor: {tensor.shape}")
print(f"Size of tensor: {tensor.size()}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Type of tensor: {type(tensor)}")
print(f"Device tensor is stored on: {tensor.device}")
Shape of tensor: torch.Size([4, 7])
Size of tensor: torch.Size([4, 7])
Datatype of tensor: torch.float32
Type of tensor: <class 'torch.Tensor'>
Device tensor is stored on: cpu

Operations

There are over 100 tensor operations, including transposing, indexing, slicing, mathematical operations, and linear algebra. They are all documented here.
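
As a quick taste before the addition examples below, here is a small sketch of a transpose and a matrix multiplication (variable names are mine):

m = torch.randn(2, 3)
print(f"Transpose: \n{m.T}")            # swap the two dimensions: 2x3 -> 3x2
print(f"Matrix product: \n{m @ m.T}")   # 2x3 times 3x2 -> 2x2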

Adding two tensors element-wise

x = torch.randn(3, 3)
y = torch.ones(3, 3)

print(f"x: \n{x}")
print(f"y: \n{y}")
print(f"1. x + y: \n{x + y}")
print(f"2. x + y: \n{torch.add(x, y)}")
print(f"3. x + y: \n{x.add(y)}")
x:
tensor([[ 0.4901,  0.3201, -0.1917],
        [ 0.2385,  1.0622,  1.7395],
        [ 1.4905,  0.3360, -0.0343]])

y:
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])

1. x + y: 
tensor([[1.4901, 1.3201, 0.8083],
        [1.2385, 2.0622, 2.7395],
        [2.4905, 1.3360, 0.9657]])
    
2. x + y: 
tensor([[1.4901, 1.3201, 0.8083],
        [1.2385, 2.0622, 2.7395],
        [2.4905, 1.3360, 0.9657]])
    
3. x + y: 
tensor([[1.4901, 1.3201, 0.8083],
        [1.2385, 2.0622, 2.7395],
        [2.4905, 1.3360, 0.9657]])
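
torch.add also accepts an out= argument to write the result into a pre-allocated tensor instead of creating a new one:

result = torch.empty(3, 3)
torch.add(x, y, out=result)   # result now holds x + y
print(f"4. x + y: \n{result}")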

In-place operations

x = torch.randn(3, 3)
y = torch.ones(3, 3)

print(f"x: \n{x}")
print(f"In-place addition: \n{x.add_(y)}")
print(f"x: \n{x}")
x: 
tensor([[-1.6163, -1.6534, -1.0660],
        [ 2.2851, -0.5562,  0.0684],
        [ 1.5171, -0.8063,  1.4790]])

In-place addition: 
tensor([[-0.6163, -0.6534, -0.0660],
        [ 3.2851,  0.4438,  1.0684],
        [ 2.5171,  0.1937,  2.4790]])

x: 
tensor([[-0.6163, -0.6534, -0.0660],
        [ 3.2851,  0.4438,  1.0684],
        [ 2.5171,  0.1937,  2.4790]])
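
In-place operations are always suffixed with an underscore; other examples include mul_ and zero_:

x.mul_(2)     # multiply every entry of x by 2, in place
print(f"x after mul_: \n{x}")
x.zero_()     # reset every entry of x to 0, in place
print(f"x after zero_: \n{x}")

Note that in-place operations save memory, but they overwrite history that may be needed for gradient computation, so their use with autograd is generally discouraged.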

Tensors can be indexed and sliced like NumPy arrays or Python lists.

x = torch.ones(4, 4)
print(f"2nd column of x: \n{x[:, 1]}")

x[:, 1] = 0
print(f"x: \n{x}")
2nd column of x: 
tensor([1., 1., 1., 1.])

x: 
tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
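
Row slicing and boolean masking work the same way as in NumPy, for example:

print(f"First row: \n{x[0]}")
print(f"Last column: \n{x[..., -1]}")
print(f"Elements greater than 0: \n{x[x > 0]}")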

To join two tensors, use torch.cat or torch.stack; a torch.stack sketch follows the torch.cat example below.

print(f"Concatenating 2 tensors along 1st dimension: \n{torch.cat([x, x], dim=1)}")
Concatenating 2 tensors along 1st dimension: 
tensor([[1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1.]])
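
torch.stack, unlike torch.cat, joins the tensors along a new dimension:

stacked = torch.stack([x, x], dim=0)
print(f"Stacked shape: {stacked.shape}")   # torch.Size([2, 4, 4])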

Tensors can be moved to a GPU, where massively parallel computation makes many operations far faster than on a CPU.

# Check availability of gpu devices
if torch.cuda.is_available():
    x = x.to('cuda')

Note that macOS does not support NVIDIA GPUs with CUDA. Recent Apple Silicon devices use Metal for GPU acceleration instead, which PyTorch supports through the MPS backend (available since version 1.12).

tensor = torch.randn(3, 3)

print(f"CUDA is available: {torch.cuda.is_available()}")
print(f"MPS is available: {torch.backends.mps.is_available()}")

if torch.backends.mps.is_available():
    mps_device = torch.device('mps')
    tensor = tensor.to(mps_device)
    print(f"MPS device tensor: \n{tensor}")
CUDA is available: False
MPS is available: True

MPS device tensor: 
tensor([[ 1.9530,  0.2365,  0.0942],
        [ 2.0012,  0.1181,  1.2998],
        [-0.6547, -0.0198,  0.4644]], device='mps:0')
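
A common convenience pattern (not part of the original tutorial, just a sketch) is to pick the best available device once and move everything to it:

# Pick CUDA if available, then MPS, otherwise fall back to the CPU
if torch.cuda.is_available():
    device = torch.device('cuda')
elif torch.backends.mps.is_available():
    device = torch.device('mps')
else:
    device = torch.device('cpu')

tensor = torch.randn(3, 3).to(device)
print(f"Tensor device: {tensor.device}")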

Link with NumPy arrays

Tensors and NumPy's ndarrays are closely related. On the CPU they can share the same underlying memory, which lets users move data between the two without creating new copies.

a = torch.ones(5)
print(f"a: \n{a}")
print(f"a is {type(a)}")

b = a.numpy()
print(f"b is {type(b)}")
a: 
tensor([1., 1., 1., 1., 1.])
a is <class 'torch.Tensor'>
b is <class 'numpy.ndarray'>

The torch tensor and the NumPy array share the same underlying memory on the CPU: any in-place operation on a changes b in the same way, and vice-versa.

a.add_(1)
print(f"a: \n{a}")
print(f"b: \n{b}")
a: 
tensor([2., 2., 2., 2., 2.])

b: 
[2. 2. 2. 2. 2.]

To convert a NumPy array to a tensor:

a = np.ones(5)
b = torch.from_numpy(a)

print(f"a: \n{a}")
print(f"b: \n{b}")
print(type(a))
print(type(b))
a: 
[1. 1. 1. 1. 1.]
b: 
tensor([1., 1., 1., 1., 1.], dtype=torch.float64)
<class 'numpy.ndarray'>
<class 'torch.Tensor'>
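
The sharing works in this direction too: changing the NumPy array in place is reflected in the tensor.

np.add(a, 1, out=a)   # in-place addition on the NumPy array
print(f"a: \n{a}")
print(f"b: \n{b}")
a: 
[2. 2. 2. 2. 2.]
b: 
tensor([2., 2., 2., 2., 2.], dtype=torch.float64)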