Python - Pairwise distances of n-dimensional Space Array

Pairwise distance calculation is used in many domains, including data analysis, machine learning, and image processing. Given a dataset, we can calculate the distance between every pair of points in it. In this article, we will explore several ways to calculate pairwise distances in Python for arrays that represent points in n-dimensional space.

What is Pairwise Distance?

Pairwise distance refers to calculating the distance between each pair of points in an n-dimensional space. You can choose different distance metrics according to the type of data and problem requirements.
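
To make the definition concrete, here is a minimal sketch that builds the full distance matrix with plain NumPy broadcasting. The helper name pairwise_euclidean and the sample points are illustrative assumptions, and the Euclidean metric is chosen only as an example.

```python
import numpy as np

def pairwise_euclidean(points):
    """Return the (n, n) matrix of Euclidean distances between rows of `points`."""
    points = np.asarray(points, dtype=float)
    # diff[i, j] is the vector points[i] - points[j]; shape (n, n, d)
    diff = points[:, None, :] - points[None, :, :]
    # Sum squared differences over the last axis, then take the square root
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical sample data: three points in 2-D space
pts = np.array([[0, 0], [3, 4], [6, 8]])
dist = pairwise_euclidean(pts)
print(dist)
```

This approach allocates an intermediate array of shape (n, n, d), so it is best suited to small inputs.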

Common distance metrics include:

  • Euclidean distance: the straight-line distance between two points

  • Manhattan distance: the sum of absolute differences along each dimension

  • Minkowski distance: a generalization of both, with order p (p = 1 gives Manhattan, p = 2 gives Euclidean)

Each metric quantifies how similar or dissimilar two data points are; which one fits best depends on your data and problem.
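
The three metrics listed above can be sketched for a single pair of points as follows. The function name minkowski and the variables a and b are assumptions made for this illustration, not part of any library.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

def minkowski(u, v, p):
    """Minkowski distance of order p between two 1-D arrays."""
    return float((np.abs(u - v) ** p).sum() ** (1.0 / p))

# Manhattan: sum of absolute differences along each dimension
manhattan = float(np.abs(a - b).sum())
# Euclidean: square root of the sum of squared differences
euclidean = float(np.sqrt(((a - b) ** 2).sum()))

print(manhattan, euclidean)
# Minkowski reduces to Manhattan at p=1 and to Euclidean at p=2
print(minkowski(a, b, 1), minkowski(a, b, 2))
```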

Using SciPy's cdist Function

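SciPy's cdist function computes the distance between every pair of points drawn from two input arrays, with a choice of metrics. Passing the same array twice gives the pairwise distances within a single dataset: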

import numpy as np
from scipy.spatial.distance import cdist

pts = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
dist = cdist(pts, pts, metric='euclidean')
print("Euclidean distances:")
print(dist)

Output:

Euclidean distances:
[[ 0.          5.19615242 10.39230485]
 [ 5.19615242  0.          5.19615242]
 [10.39230485  5.19615242  0.        ]]

The result shows a symmetric matrix where diagonal elements are zero (distance from a point to itself) and off-diagonal elements show distances between different points.
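
Because cdist takes two arrays, it can also compare two different point sets. The sketch below assumes a hypothetical set of query points and reference points; the names queries, refs, and nearest are illustrative only.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Hypothetical data: 2 query points and 3 reference points in 3-D space
queries = np.array([[0, 0, 0], [1, 1, 1]])
refs = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# dist[i, j] is the distance from queries[i] to refs[j]
dist = cdist(queries, refs, metric='euclidean')
print(dist.shape)

# Index of the closest reference point for each query
nearest = dist.argmin(axis=1)
print(nearest)
```

The result has one row per query point and one column per reference point, so it is generally not square or symmetric.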

Using Scikit-learn's pairwise_distances

Scikit-learn provides the pairwise_distances function with support for multiple distance metrics:

from sklearn.metrics import pairwise_distances
import numpy as np

pts = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
manhattan_dist = pairwise_distances(pts, metric='manhattan')
print("Manhattan distances:")
print(manhattan_dist)

Output:

Manhattan distances:
[[ 0.  9. 18.]
 [ 9.  0.  9.]
 [18.  9.  0.]]
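
As a quick cross-check, the following sketch compares the scikit-learn result with SciPy's cdist. Note that SciPy names this metric 'cityblock'; the variable names sk_dist, sp_dist, and same are assumptions for illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.metrics import pairwise_distances

pts = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Same metric under two names: 'manhattan' (scikit-learn) and 'cityblock' (SciPy)
sk_dist = pairwise_distances(pts, metric='manhattan')
sp_dist = cdist(pts, pts, metric='cityblock')

same = np.allclose(sk_dist, sp_dist)
print(same)
```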

Using pdist Function

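The pdist function from SciPy returns a condensed distance matrix: a one-dimensional array holding only the upper-triangular entries, one per unique pair of points. The squareform function converts it back to a full square matrix: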

from scipy.spatial.distance import pdist, squareform
import numpy as np

pts = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Get condensed distance matrix
condensed_dist = pdist(pts, metric='euclidean')
print("Condensed distances:")
print(condensed_dist)

# Convert to square matrix
square_dist = squareform(condensed_dist)
print("\nSquare distance matrix:")
print(square_dist)

Output:

Condensed distances:
[ 5.19615242 10.39230485  5.19615242]

Square distance matrix:
[[ 0.          5.19615242 10.39230485]
 [ 5.19615242  0.          5.19615242]
 [10.39230485  5.19615242  0.        ]]
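
To look up a single pair directly in the condensed array without expanding it, the position can be computed from the two point indices. The helper condensed_index below is a hypothetical name; it is a sketch of the row-by-row upper-triangle ordering that pdist uses, checked here against squareform.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def condensed_index(i, j, n):
    """Position of the pair (i, j) in a condensed distance array for n points."""
    if i == j:
        raise ValueError("the diagonal is not stored in the condensed form")
    if i > j:
        i, j = j, i  # only pairs with i < j are stored
    return n * i + j - ((i + 2) * (i + 1)) // 2

# Hypothetical sample data: four points in 3-D space
pts = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [0, 0, 0]])
n = len(pts)
condensed = pdist(pts, metric='euclidean')
square = squareform(condensed)

# Distance between points 1 and 3, read from both representations
print(condensed[condensed_index(1, 3, n)], square[1, 3])
```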

Using NearestNeighbors Class

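The NearestNeighbors class returns, for each query point, the distances to its k nearest neighbors along with their indices. Setting n_neighbors to the number of points yields every pairwise distance, sorted in ascending order within each row: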

from sklearn.neighbors import NearestNeighbors
import numpy as np

pts = np.array([[1, 2], [4, 5], [7, 8]])
nbrs = NearestNeighbors(n_neighbors=len(pts)).fit(pts)
distances, indices = nbrs.kneighbors(pts)

print("Distances to nearest neighbors:")
print(distances)
print("\nIndices of nearest neighbors:")
print(indices)

Output:

Distances to nearest neighbors:
[[0.         4.24264069 8.48528137]
 [0.         4.24264069 4.24264069]
 [0.         4.24264069 8.48528137]]

Indices of nearest neighbors:
[[0 1 2]
 [1 0 2]
 [2 1 0]]
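
In the output above, the first column is each point's zero distance to itself. A common pattern, sketched here under the assumption that the points are distinct, is to request one extra neighbor and drop that first column. The sample data and the names k, nn_dist, and nn_idx are illustrative.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical sample data: four distinct points in 2-D space
pts = np.array([[0, 0], [1, 0], [5, 0], [6, 0]])
k = 1  # number of true neighbors wanted per point

# Ask for k + 1 neighbors because each point is returned as its own nearest neighbor
nbrs = NearestNeighbors(n_neighbors=k + 1).fit(pts)
distances, indices = nbrs.kneighbors(pts)

# Drop column 0 (the point itself); this assumes there are no duplicate points
nn_dist = distances[:, 1:]
nn_idx = indices[:, 1:]
print(nn_dist.ravel())
print(nn_idx.ravel())
```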

Comparison of Methods

Method              Function               Output Format      Best For
SciPy cdist         cdist()                Full matrix        Different point sets
SciPy pdist         pdist()                Condensed array    Memory efficiency
Scikit-learn        pairwise_distances()   Full matrix        ML workflows
NearestNeighbors    kneighbors()           K nearest only     Finding neighbors
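
The memory difference between the condensed and full formats can be illustrated by counting entries. This sketch uses randomly generated points as a stand-in for real data; the variable names are assumptions.

```python
import numpy as np
from scipy.spatial.distance import pdist

# Hypothetical data: 100 random points in 3-D space
rng = np.random.default_rng(0)
pts = rng.random((100, 3))
n = len(pts)

condensed = pdist(pts)   # stores one distance per unique pair: n * (n - 1) / 2
full_entries = n * n     # a full square matrix stores n * n distances

print(condensed.size, full_entries)
```

For n points, the condensed form holds n(n - 1)/2 values, which is slightly less than half of the n × n entries in a full matrix.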

Conclusion

Use cdist() for comparing two different datasets, pdist() for memory-efficient single-dataset analysis, pairwise_distances() for machine learning workflows, and NearestNeighbors when you only need the closest points. Choose the method that best fits your computational needs and output format requirements.

Updated on: 2026-03-27T15:09:57+05:30
