
Computer Vision - Image Segmentation
What is Image Segmentation?
Image segmentation is the process of partitioning an image into multiple segments to make the image easier to analyze. Each segment or region usually corresponds to a different object or a part of an object.
By segmenting an image, we can focus on specific areas, making it easier to identify and analyze objects within the image.
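One common way to represent a segmentation result is a label map: an integer array with the same height and width as the image, where each value identifies the segment a pixel belongs to. The sketch below uses a hand-written 3x3 image and label map (both are illustrative values, not real data) to show how a single region can be isolated and analyzed on its own.

```python
import numpy as np

# Hypothetical 3x3 grayscale image and a matching label map (illustrative values)
image = np.array([[10, 10, 20],
                  [10, 90, 50],
                  [90, 90, 80]], dtype=np.uint8)
labels = np.array([[0, 0, 1],
                   [0, 2, 1],
                   [2, 2, 1]])

# Count how many pixels fall into each segment
segment_ids, pixel_counts = np.unique(labels, return_counts=True)

# A boolean mask selects only the pixels of segment 1
mask = labels == 1
mean_intensity = image[mask].mean()
```

Here `mask` picks out the right-hand column, so any later analysis (such as the mean intensity) can be restricted to that one segment.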
Importance of Image Segmentation
Image segmentation is important for various applications in computer vision, such as −
- Object Detection: Identifying and locating objects within an image.
- Medical Imaging: Analyzing medical images to detect and diagnose diseases.
- Autonomous Driving: Understanding the surroundings by identifying roads, vehicles, pedestrians, etc.
- Image Editing: Selecting and manipulating specific parts of an image.
Types of Image Segmentation
Image segmentation techniques fall into several types −
- Thresholding
- Edge-Based Segmentation
- Region-Based Segmentation
- Clustering-Based Segmentation
- Deep Learning-Based Segmentation
Thresholding
Thresholding is one of the simplest segmentation techniques. It converts a grayscale image into a binary image by setting a threshold value.
Pixels with intensity values above the threshold are assigned one value (e.g., white), and those below the threshold are assigned another value (e.g., black).
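The rule described above can be written in a few lines of NumPy. This is a minimal sketch on a small hand-made array; the function name `global_threshold` and the sample values are assumptions made for illustration.

```python
import numpy as np

def global_threshold(image, threshold, max_value=255):
    # Pixels strictly above the threshold get max_value; all others get 0
    return np.where(image > threshold, max_value, 0).astype(np.uint8)

# Illustrative 2x3 grayscale image
image = np.array([[12, 130, 200],
                  [127, 128, 90]], dtype=np.uint8)
binary = global_threshold(image, 127)
```

Note that a pixel equal to the threshold (127 here) falls on the "below" side, because the comparison is strict.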
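Thresholding can be applied in the following ways −
- Global Thresholding: Uses a single threshold value for the entire image.
    import cv2

    # Load image in grayscale
    image = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)

    # Pixels above 127 become 255 (white); the rest become 0 (black)
    _, binary_image = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)
- Adaptive Thresholding: Computes a separate threshold for each pixel from its local neighborhood, which helps when lighting varies across the image.
    # Threshold = Gaussian-weighted mean of an 11x11 neighborhood, minus 2
    adaptive_binary_image = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)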
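The `cv2.adaptiveThreshold` call above computes a separate threshold for every pixel from the pixel's neighborhood. The sketch below is a simplified NumPy version of that idea. It uses a plain (unweighted) local mean rather than the Gaussian-weighted mean, and the function name, the 3x3 block size, and the test image are all assumptions chosen for illustration.

```python
import numpy as np

def adaptive_mean_threshold(image, block_size=3, c=2, max_value=255):
    # block_size is assumed to be odd so the window is centered on the pixel
    half = block_size // 2
    # Replicate edge pixels so every pixel has a full neighborhood
    padded = np.pad(image.astype(np.float64), half, mode="edge")
    height, width = image.shape
    out = np.zeros((height, width), dtype=np.uint8)
    for y in range(height):
        for x in range(width):
            window = padded[y:y + block_size, x:x + block_size]
            # Local threshold: neighborhood mean minus the constant c
            local_threshold = window.mean() - c
            if image[y, x] > local_threshold:
                out[y, x] = max_value
    return out

# Bright image with one dark pixel in the center
image = np.full((5, 5), 200, dtype=np.uint8)
image[2, 2] = 0
result = adaptive_mean_threshold(image)
```

Only the dark center pixel falls below its local threshold; its bright neighbors stay white even though their neighborhood mean is pulled down by the dark pixel.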
Edge-Based Segmentation
Edge-based segmentation detects the edges within an image and uses them to define the boundaries of segments. This method relies on detecting significant changes in intensity values.
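A commonly used edge detection method is the Canny edge detector. It is a multi-stage algorithm that detects a wide range of edges: it smooths the image, computes intensity gradients, thins the edges with non-maximum suppression, and then keeps edges using two thresholds (hysteresis).
    # 100 and 200 are the lower and upper hysteresis thresholds
    edges = cv2.Canny(image, 100, 200)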
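To illustrate what "significant changes in intensity values" means in practice, the sketch below computes the Sobel gradient magnitude in NumPy and marks pixels where it exceeds a threshold. This is only the gradient step, not the full Canny algorithm (there is no smoothing, non-maximum suppression, or hysteresis), and the function name and the threshold of 500 are assumptions.

```python
import numpy as np

def sobel_edges(image, threshold):
    img = image.astype(np.float64)
    padded = np.pad(img, 1, mode="edge")
    # Sobel kernels for horizontal (kx) and vertical (ky) intensity changes
    kx = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=np.float64)
    ky = kx.T
    height, width = img.shape
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    # Accumulate the 3x3 weighted sums using shifted views of the padded image
    for dy in range(3):
        for dx in range(3):
            window = padded[dy:dy + height, dx:dx + width]
            gx += kx[dy, dx] * window
            gy += ky[dy, dx] * window
    magnitude = np.hypot(gx, gy)
    return magnitude > threshold

# Vertical step edge: left half black, right half white
step = np.zeros((6, 6), dtype=np.uint8)
step[:, 3:] = 255
edges = sobel_edges(step, 500)
```

The result is `True` only in the two columns on either side of the step, which is where the intensity changes sharply.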
Region-Based Segmentation
Region-based segmentation groups pixels into regions based on predefined criteria, such as intensity values or texture. The idea is to group neighboring pixels with similar properties into the same region.
Common region-based methods include −
- Region Growing: Starts with seed points and grows regions by adding neighboring pixels that meet certain criteria.
- Watershed Algorithm: Treats the image as a topographic surface and finds the watershed lines to segment the regions.
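The following example applies the watershed algorithm. Note that cv2.watershed expects a 3-channel image, so the image is loaded in color and converted to grayscale only for thresholding −
    import cv2
    import numpy as np

    # Load in color for watershed; keep a grayscale copy for thresholding
    image = cv2.imread('image.jpg')
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Convert image to binary using Otsu's threshold
    _, binary_image = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Perform distance transform; pixels far from the background are sure foreground
    distance_transform = cv2.distanceTransform(binary_image, cv2.DIST_L2, 5)
    _, foreground = cv2.threshold(distance_transform, 0.7 * distance_transform.max(), 255, 0)
    foreground = np.uint8(foreground)
    unknown = cv2.subtract(binary_image, foreground)

    # Marker labeling: background becomes 1, the unknown region becomes 0
    _, markers = cv2.connectedComponents(foreground)
    markers = markers + 1
    markers[unknown == 255] = 0

    # Apply watershed; boundary pixels are labeled -1 and drawn in blue (BGR order)
    markers = cv2.watershed(image, markers)
    image[markers == -1] = [255, 0, 0]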
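Region growing, described in the list above, can be sketched as a breadth-first search from a seed pixel. This is a minimal version under stated assumptions: a single seed, 4-connected neighbors, and a similarity criterion that compares each candidate pixel to the seed's intensity. The function name and the `tolerance` parameter are hypothetical.

```python
from collections import deque

import numpy as np

def region_grow(image, seed, tolerance):
    height, width = image.shape
    seed_value = int(image[seed])
    region = np.zeros((height, width), dtype=bool)
    region[seed] = True
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        # Visit the 4-connected neighbors of the current pixel
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < height and 0 <= nx < width and not region[ny, nx]:
                # Add the neighbor if its intensity is close to the seed's
                if abs(int(image[ny, nx]) - seed_value) <= tolerance:
                    region[ny, nx] = True
                    queue.append((ny, nx))
    return region

# Dark image with a bright 3x3 block in the middle
image = np.full((5, 5), 10, dtype=np.uint8)
image[1:4, 1:4] = 200
region = region_grow(image, seed=(2, 2), tolerance=20)
```

Starting from the center of the bright block, the region expands until it reaches pixels that differ from the seed by more than the tolerance, so it covers exactly the 3x3 block.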
Clustering-Based Segmentation
Clustering-based segmentation groups pixels into clusters based on their similarity. K-means clustering is a popular method for this type of segmentation.
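The following example applies K-means clustering to the pixel colors of an image. cv2.kmeans expects one sample per row, so the image is first flattened to a list of pixels −
    import cv2
    import numpy as np

    image = cv2.imread('image.jpg')

    # Flatten to one row per pixel (3 color channels) and convert to float32
    pixels = np.float32(image.reshape((-1, 3)))

    # Define criteria and apply K-means with 3 clusters
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    _, labels, centers = cv2.kmeans(pixels, 3, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)

    # Replace each pixel with its cluster center and restore the image shape
    centers = np.uint8(centers)
    segmented_image = centers[labels.flatten()]
    segmented_image = segmented_image.reshape(image.shape)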
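To show what K-means does internally, the sketch below clusters grayscale intensities with plain NumPy by alternating between assigning pixels to the nearest center and recomputing each center as the mean of its pixels. It is a simplified illustration, not OpenCV's implementation: the centers are initialized at evenly spaced intensities (an assumption made so the result is deterministic), and the function name is hypothetical.

```python
import numpy as np

def kmeans_intensity(image, k, iterations=10):
    pixels = image.reshape(-1).astype(np.float64)
    # Deterministic initialization: evenly spaced between min and max intensity
    centers = np.linspace(pixels.min(), pixels.max(), k)
    for _ in range(iterations):
        # Assignment step: each pixel goes to its nearest center
        distances = np.abs(pixels[:, None] - centers[None, :])
        labels = distances.argmin(axis=1)
        # Update step: each center moves to the mean of its assigned pixels
        for i in range(k):
            members = pixels[labels == i]
            if members.size:
                centers[i] = members.mean()
    return labels.reshape(image.shape), centers

# Illustrative image with a dark group and a bright group of pixels
image = np.array([[10, 12, 200],
                  [11, 198, 202]], dtype=np.uint8)
labels, centers = kmeans_intensity(image, k=2)
```

The dark pixels end up in cluster 0 and the bright pixels in cluster 1, with the centers settling at the mean intensity of each group.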
Deep Learning-Based Segmentation
Deep learning-based segmentation uses convolutional neural networks (CNNs) to predict a label for every pixel. Given enough labeled training data, it is generally more accurate than the classical methods above on complex images.
Common deep learning models include −
- Fully Convolutional Networks (FCNs): Replace fully connected layers with convolutional layers to produce segmentation maps.
- U-Net: A popular model for biomedical image segmentation that uses an encoder-decoder architecture.
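The following example builds a small U-Net-style model with Keras. The input size is an example value, since the original snippet leaves image_height and image_width undefined −
    from keras.models import Model
    from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D, concatenate

    # Example input size (an assumption); both must be even for the 2x2 pooling
    image_height, image_width = 128, 128

    inputs = Input((image_height, image_width, 1))

    # Encoder: convolution followed by downsampling
    conv1 = Conv2D(64, 3, activation='relu', padding='same')(inputs)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    conv2 = Conv2D(128, 3, activation='relu', padding='same')(pool1)

    # Decoder: upsample and concatenate with conv1 (skip connection)
    up1 = concatenate([UpSampling2D(size=(2, 2))(conv2), conv1], axis=3)
    conv3 = Conv2D(64, 3, activation='relu', padding='same')(up1)

    # One sigmoid output per pixel for binary segmentation
    outputs = Conv2D(1, 1, activation='sigmoid')(conv3)
    model = Model(inputs=[inputs], outputs=[outputs])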
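The encoder-decoder idea relies on the spatial size shrinking during pooling and being restored during upsampling, so the output mask matches the input size. The NumPy sketch below mimics 2x2 max pooling and nearest-neighbor 2x upsampling on a single-channel array. The function names are hypothetical, and the sketch ignores convolutions, channels, and batching.

```python
import numpy as np

def max_pool_2x2(x):
    # Split into 2x2 blocks and keep the maximum of each block
    height, width = x.shape
    return x.reshape(height // 2, 2, width // 2, 2).max(axis=(1, 3))

def upsample_2x2(x):
    # Nearest-neighbor upsampling: repeat every row and column twice
    return x.repeat(2, axis=0).repeat(2, axis=1)

x = np.arange(16).reshape(4, 4)
pooled = max_pool_2x2(x)         # shape (2, 2)
restored = upsample_2x2(pooled)  # shape (4, 4), same as the input
```

Upsampling restores the size but not the detail lost in pooling, which is why U-Net-style models concatenate the upsampled features with the earlier, higher-resolution features.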