How to convert an image to a PyTorch Tensor?
PyTorch tensors are n-dimensional arrays that can leverage GPU acceleration for faster computations. Converting images to tensors is essential for deep learning tasks in PyTorch, as it allows the framework to process image data efficiently on both CPU and GPU.
To convert an image to a PyTorch tensor, we use transforms.ToTensor() which automatically handles scaling pixel values from [0, 255] to [0, 1] and changes the dimension order from HxWxC (Height x Width x Channels) to CxHxW (Channels x Height x Width).
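Under the hood, ToTensor() is roughly equivalent to a dimension permute followed by a division by 255. As an illustration (the helper name to_tensor_manual is ours, not part of torchvision), a minimal sketch of that behavior:

```python
import numpy as np
import torch

# Hypothetical helper mirroring what transforms.ToTensor() does:
# reorder HxWxC -> CxHxW and scale uint8 [0, 255] to float32 [0, 1].
def to_tensor_manual(image_array: np.ndarray) -> torch.Tensor:
    tensor = torch.from_numpy(image_array)           # still HxWxC, uint8
    tensor = tensor.permute(2, 0, 1).contiguous()    # now CxHxW
    return tensor.float().div(255.0)                 # scale to [0, 1]

image_array = np.random.randint(0, 256, (4, 5, 3), dtype=np.uint8)
tensor = to_tensor_manual(image_array)
print(tensor.shape)   # torch.Size([3, 4, 5])
print(tensor.dtype)   # torch.float32
```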
Method 1: Converting PIL Images
The most common approach is to read images with PIL (the Pillow library) and convert them to tensors:
import torch
import torchvision.transforms as transforms
from PIL import Image
import numpy as np
# Create a sample RGB image (since we can't load external files online)
# This simulates reading an image file
sample_array = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
image = Image.fromarray(sample_array)
# Define transform to convert image to tensor
transform = transforms.ToTensor()
# Convert the image to PyTorch tensor
tensor = transform(image)
print("Original image shape:", np.array(image).shape)
print("Tensor shape:", tensor.shape)
print("Tensor data type:", tensor.dtype)
print("Tensor value range:", tensor.min().item(), "to", tensor.max().item())
Original image shape: (100, 100, 3)
Tensor shape: torch.Size([3, 100, 100])
Tensor data type: torch.float32
Tensor value range: 0.0 to 1.0
Method 2: Converting NumPy Arrays
You can also convert NumPy arrays directly to tensors, which is useful when working with OpenCV or other libraries:
import torch
import torchvision.transforms as transforms
import numpy as np
# Create a sample image as numpy array (HxWxC format)
image_array = np.random.randint(0, 256, (150, 200, 3), dtype=np.uint8)
# Define transform
transform = transforms.ToTensor()
# Convert numpy array to tensor
tensor = transform(image_array)
print("NumPy array shape:", image_array.shape)
print("Tensor shape:", tensor.shape)
print("Data type conversion:", image_array.dtype, "->", tensor.dtype)
NumPy array shape: (150, 200, 3)
Tensor shape: torch.Size([3, 150, 200])
Data type conversion: uint8 -> torch.float32
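If torchvision is not available, torch.from_numpy() converts the array directly; note that it keeps the uint8 dtype and HxWxC layout, so the scaling and reordering must then be done by hand. A minimal sketch:

```python
import numpy as np
import torch

image_array = np.random.randint(0, 256, (150, 200, 3), dtype=np.uint8)

# from_numpy shares memory with the array and keeps the HxWxC uint8 layout
tensor = torch.from_numpy(image_array)
print(tensor.shape)   # torch.Size([150, 200, 3])
print(tensor.dtype)   # torch.uint8

# Reorder and scale manually to match what ToTensor() produces
tensor = tensor.permute(2, 0, 1).float() / 255.0
print(tensor.shape)   # torch.Size([3, 150, 200])
```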
Understanding the Transform Process
The ToTensor() transform performs two key operations: scaling values to [0, 1] and reordering dimensions to CxHxW. The small example below makes both visible:
import torch
import torchvision.transforms as transforms
import numpy as np
# Create a small sample image to see the transformation clearly
small_image = np.array([[[255, 128, 0], [200, 100, 50]],
                        [[150, 75, 25], [100, 50, 10]]], dtype=np.uint8)
print("Original image shape (HxWxC):", small_image.shape)
print("Original pixel values:\n", small_image)
# Convert to tensor
transform = transforms.ToTensor()
tensor = transform(small_image)
print("\nTensor shape (CxHxW):", tensor.shape)
print("Normalized values (0-1 range):")
print("Red channel:\n", tensor[0])
print("Green channel:\n", tensor[1])
print("Blue channel:\n", tensor[2])
Original image shape (HxWxC): (2, 2, 3)
Original pixel values:
[[[255 128 0]
[200 100 50]]
[[150 75 25]
[100 50 10]]]
Tensor shape (CxHxW): torch.Size([3, 2, 2])
Normalized values (0-1 range):
Red channel:
tensor([[1.0000, 0.7843],
[0.5882, 0.3922]])
Green channel:
tensor([[0.5020, 0.3922],
[0.2941, 0.1961]])
Blue channel:
tensor([[0.0000, 0.1961],
[0.0980, 0.0392]])
Key Differences
| Aspect | Original Image | PyTorch Tensor |
|---|---|---|
| Shape Format | H × W × C | C × H × W |
| Value Range | [0, 255] | [0, 1] |
| Data Type | uint8 | float32 |
| GPU Support | No | Yes |
Conclusion
Converting images to PyTorch tensors is straightforward using transforms.ToTensor(). This transform automatically normalizes pixel values to [0, 1] and reorders dimensions to the PyTorch standard (C×H×W), making your images ready for deep learning models.
