Computer Vision - Image Segmentation

What is Image Segmentation?

Image segmentation is the process of partitioning an image into multiple segments to make the image easier to analyze. Each segment or region usually corresponds to a different object or a part of an object.

By segmenting an image, we can focus on specific areas, making it easier to identify and analyze objects within the image.

Importance of Image Segmentation

Image segmentation is important for various applications in computer vision, such as −

Object Detection: Identifying and locating objects within an image.
Medical Imaging: Analyzing medical images to detect and diagnose diseases.
Autonomous Driving: Understanding the surroundings by identifying roads, vehicles, pedestrians, etc.
Image Editing: Selecting and manipulating specific parts of an image.

Types of Image Segmentation

The main types of image segmentation are −

Thresholding
Edge-Based Segmentation
Region-Based Segmentation
Clustering-Based Segmentation
Deep Learning-Based Segmentation

Thresholding

Thresholding is one of the simplest segmentation techniques. It converts a grayscale image into a binary image by setting a threshold value.

Pixels with intensity values above the threshold are assigned one value (e.g., white), and those below the threshold are assigned another value (e.g., black).
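The rule described above can be sketched in a few lines of NumPy. This is only an illustration of the idea, not OpenCV's implementation; the function name `binary_threshold` and the sample values are made up for this example.

```python
import numpy as np

def binary_threshold(gray, thresh=127, max_value=255):
    """Return max_value where gray > thresh, and 0 everywhere else."""
    gray = np.asarray(gray)
    return np.where(gray > thresh, max_value, 0).astype(np.uint8)

# Tiny synthetic 2x3 "image" (values chosen only for illustration)
sample = np.array([[10, 127, 128],
                   [200, 50, 255]], dtype=np.uint8)
binary = binary_threshold(sample)
print(binary)
```

Note that a pixel exactly equal to the threshold (127 here) falls on the "below" side, because the comparison is strictly greater-than.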

Thresholding can be applied in the following ways −

Global Thresholding: Uses a single threshold value for the entire image.

import cv2

# Load image in grayscale
image = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)
# Pixels brighter than 127 become 255 (white); all others become 0 (black)
_, binary_image = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)

Adaptive Thresholding: Uses different threshold values for different regions of the image.

# Threshold each pixel against the Gaussian-weighted mean of its 11x11 neighborhood, minus 2
adaptive_binary_image = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
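To make the idea of a per-region threshold concrete, here is a minimal NumPy sketch that compares each pixel with the plain mean of its neighborhood (a simpler stand-in for the Gaussian-weighted mean). The function name `adaptive_mean_threshold`, its parameters, and the border handling are assumptions of this sketch, and it is not claimed to reproduce OpenCV's output exactly.

```python
import numpy as np

def adaptive_mean_threshold(gray, block_size=3, c=2, max_value=255):
    """Set a pixel to max_value if it exceeds (local mean - c), else 0."""
    gray = np.asarray(gray, dtype=np.float64)
    pad = block_size // 2
    # Replicate border pixels so every pixel has a full neighborhood
    padded = np.pad(gray, pad, mode="edge")
    out = np.zeros(gray.shape, dtype=np.uint8)
    for y in range(gray.shape[0]):
        for x in range(gray.shape[1]):
            window = padded[y:y + block_size, x:x + block_size]
            local_thresh = window.mean() - c
            if gray[y, x] > local_thresh:
                out[y, x] = max_value
    return out

# A bright patch with one dark pixel in the middle (synthetic values)
sample = np.array([[200, 200, 200],
                   [200, 100, 200],
                   [200, 200, 200]], dtype=np.uint8)
result = adaptive_mean_threshold(sample)
print(result)
```

Because the threshold is recomputed around every pixel, the dark center is separated from its bright surroundings without choosing one global value.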

Edge-Based Segmentation

Edge-based segmentation detects the edges within an image and uses them to define the boundaries of segments. This method relies on detecting significant changes in intensity values.

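A commonly used edge detection method is the Canny edge detector. It is a multi-stage algorithm: it smooths the image, computes intensity gradients, thins the edges with non-maximum suppression, and keeps edges using two hysteresis thresholds.

# 100 and 200 are the lower and upper hysteresis thresholds
edges = cv2.Canny(image, 100, 200)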
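The "significant changes in intensity" that edge detectors look for can be illustrated with a gradient-magnitude sketch in NumPy. This covers only the gradient step, not the full Canny pipeline; the function name `gradient_magnitude` and the cutoff of 100 are hypothetical choices for this example.

```python
import numpy as np

def gradient_magnitude(gray):
    """Approximate the intensity gradient magnitude with finite differences."""
    gray = np.asarray(gray, dtype=np.float64)
    # np.gradient returns derivatives along rows (y) first, then columns (x)
    gy, gx = np.gradient(gray)
    return np.hypot(gx, gy)

# Synthetic image: dark left half, bright right half
step = np.zeros((4, 6))
step[:, 3:] = 255

magnitude = gradient_magnitude(step)
# Hypothetical cutoff: keep only pixels with a strong intensity change
edge_mask = magnitude > 100
print(edge_mask.astype(int))
```

Only the two columns on either side of the dark-to-bright transition are marked, which is where the boundary between the two segments lies.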

Region-Based Segmentation

Region-based segmentation groups pixels into regions based on predefined criteria, such as intensity values or texture. The idea is to group neighboring pixels with similar properties into the same region.

Common region-based methods include −

Region Growing: Starts with seed points and grows regions by adding neighboring pixels that meet certain criteria.
Watershed Algorithm: Treats the image as a topographic surface and finds the watershed lines to segment the regions.
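Region growing, as described in the list above, can be sketched with a breadth-first search. This is a minimal illustration under stated assumptions: a single seed, 4-connected neighbors, and a similarity test against the seed's intensity. The function name `region_grow` and the `tolerance` parameter are hypothetical.

```python
from collections import deque

import numpy as np

def region_grow(gray, seed, tolerance=10):
    """Grow a region from seed (row, col), adding 4-connected neighbors
    whose intensity is within `tolerance` of the seed's intensity."""
    gray = np.asarray(gray, dtype=np.int32)
    height, width = gray.shape
    seed_value = gray[seed]
    mask = np.zeros((height, width), dtype=bool)
    mask[seed] = True
    queue = deque([seed])
    while queue:
        row, col = queue.popleft()
        for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            r, c = row + d_row, col + d_col
            if 0 <= r < height and 0 <= c < width and not mask[r, c]:
                if abs(gray[r, c] - seed_value) <= tolerance:
                    mask[r, c] = True
                    queue.append((r, c))
    return mask

# Synthetic image: a dark L-shaped region next to a bright one
sample = np.array([[10, 12, 90],
                   [11, 95, 92],
                   [13, 14, 12]], dtype=np.uint8)
region = region_grow(sample, seed=(0, 0))
print(region.astype(int))
```

The region spreads through the dark pixels along the left and bottom edges and stops at the bright pixels, whose intensities differ from the seed by more than the tolerance.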

The following example applies the watershed algorithm −

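# Apply the watershed algorithm
import numpy as np

# Binarize with Otsu's method (inverted, so dark objects become foreground)
_, binary_image = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
# Distance transform: pixels far from the background are treated as sure foreground
distance_transform = cv2.distanceTransform(binary_image, cv2.DIST_L2, 5)
_, foreground = cv2.threshold(distance_transform, 0.7 * distance_transform.max(), 255, cv2.THRESH_BINARY)
foreground = np.uint8(foreground)
# Pixels that are foreground in the binary image but not sure foreground
unknown = cv2.subtract(binary_image, foreground)
# Marker labeling: background becomes 1, objects become 2, 3, ...; unknown pixels become 0
_, markers = cv2.connectedComponents(foreground)
markers = markers + 1
markers[unknown == 255] = 0
# cv2.watershed expects an 8-bit, 3-channel image, so convert the grayscale image to BGR
color_image = cv2.cvtColor(image, cv2.COLOR_GRAY2BGR)
markers = cv2.watershed(color_image, markers)
# Watershed boundaries are labeled -1; draw them in blue (BGR order)
color_image[markers == -1] = [255, 0, 0]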

Clustering-Based Segmentation

Clustering-based segmentation groups pixels into clusters based on their similarity. K-means clustering is a popular method for this type of segmentation.

The following example applies K-means clustering −

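import numpy as np

# Flatten the grayscale image into an (N, 1) float32 array of samples, as cv2.kmeans expects
# (for a color image, reshape to (-1, 3) instead)
pixels = np.float32(image.reshape(-1, 1))
# Stop after 10 iterations or once the centers move by less than 1.0
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
# Cluster into 3 groups, keeping the best of 10 random initializations
_, labels, centers = cv2.kmeans(pixels, 3, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)
# Replace each pixel with its cluster center and restore the image shape
centers = np.uint8(centers)
segmented_image = centers[labels.flatten()]
segmented_image = segmented_image.reshape(image.shape)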
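To illustrate the assign-and-update loop behind K-means, here is a minimal NumPy sketch that clusters grayscale intensities. It is a simplified illustration, not OpenCV's implementation: the function name `kmeans_intensity` is hypothetical, and the initial centers are spread evenly over the intensity range (an assumption made so the result is deterministic).

```python
import numpy as np

def kmeans_intensity(gray, k=2, iterations=10):
    """Cluster pixel intensities into k groups; return (labels, centers)."""
    gray = np.asarray(gray)
    pixels = gray.reshape(-1).astype(np.float64)
    # Deterministic start: spread the centers evenly across the intensity range
    centers = np.linspace(pixels.min(), pixels.max(), k)
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest center
        labels = np.argmin(np.abs(pixels[:, None] - centers[None, :]), axis=1)
        # Update step: each center moves to the mean of its members
        for i in range(k):
            members = pixels[labels == i]
            if members.size:
                centers[i] = members.mean()
    # Final assignment against the updated centers
    labels = np.argmin(np.abs(pixels[:, None] - centers[None, :]), axis=1)
    return labels.reshape(gray.shape), centers

# Synthetic image with a dark group and a bright group of pixels
sample = np.array([[10, 12, 200],
                   [11, 205, 198]], dtype=np.uint8)
labels, centers = kmeans_intensity(sample, k=2)
segmented = centers[labels].astype(np.uint8)
print(labels)
print(segmented)
```

Each pixel in `segmented` carries the mean intensity of its cluster, which is the same replacement step the OpenCV example performs with `centers[labels.flatten()]`.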

Deep Learning-Based Segmentation

Deep learning-based segmentation uses convolutional neural networks (CNNs) to assign a label to each pixel. Given enough labeled training data, these models can learn to segment complex scenes that are difficult for the classical methods above.

Common deep learning models include −

Fully Convolutional Networks (FCNs): Replace fully connected layers with convolutional layers to produce segmentation maps.
U-Net: A popular model for biomedical image segmentation that uses an encoder-decoder architecture.

The following example builds a small U-Net-style model with Keras −

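from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D, concatenate

# Example input size (an assumption); both dimensions must be even so the skip connection lines up
image_height, image_width = 128, 128

inputs = Input((image_height, image_width, 1))
# Encoder: extract features, then halve the spatial resolution
conv1 = Conv2D(64, 3, activation='relu', padding='same')(inputs)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
conv2 = Conv2D(128, 3, activation='relu', padding='same')(pool1)
# Decoder: upsample and concatenate with the encoder features (skip connection)
up1 = concatenate([UpSampling2D(size=(2, 2))(conv2), conv1], axis=3)
conv3 = Conv2D(64, 3, activation='relu', padding='same')(up1)
# One sigmoid output per pixel, interpreted as a foreground probability
outputs = Conv2D(1, 1, activation='sigmoid')(conv3)
model = Model(inputs=[inputs], outputs=[outputs])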
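The model above ends in a sigmoid layer, so its output is a per-pixel value between 0 and 1 rather than a finished segmentation. A minimal sketch of turning such an output into a binary mask is shown below. The array `fake_output` is a made-up stand-in for a prediction (it does not come from a trained model), and the function name and the 0.5 cutoff are assumptions of this sketch.

```python
import numpy as np

def probabilities_to_mask(probabilities, cutoff=0.5):
    """Convert per-pixel probabilities into a 0/1 mask."""
    probabilities = np.asarray(probabilities)
    return (probabilities > cutoff).astype(np.uint8)

# Stand-in for one predicted map of shape (height, width, 1)
fake_output = np.array([[[0.1], [0.8]],
                        [[0.6], [0.3]]])
# Drop the trailing channel axis to get a (height, width) mask
mask = probabilities_to_mask(fake_output)[..., 0]
print(mask)
```

Raising or lowering the cutoff trades missed foreground pixels against false positives, so in practice it is usually chosen on validation data.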
