Minkowski distance in Python
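The Minkowski distance is a metric on a normed vector space that measures the distance between two vectors (it satisfies the triangle inequality for p ≥ 1). It is widely used in machine learning algorithms to measure how similar or dissimilar two data points are. For two vectors r and s in n-dimensional space, the Minkowski distance is defined as:

$$D(r, s) = \left( \sum_{i=1}^{n} |r_i - s_i|^p \right)^{1/p}$$
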
Where:
- ri and si: Corresponding elements of two vectors in n-dimensional space
- p: Order parameter that determines the distance type. When p=1, it becomes Manhattan distance; when p=2, it represents Euclidean distance
- n: Number of dimensions
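The two special cases named in the list can be checked directly against their usual definitions. The following is a minimal sketch using only the standard library; the helper name `minkowski` and the sample vectors `r` and `s` are assumptions chosen for illustration, not part of the original article.

```python
import math

def minkowski(r, s, p):
    """Minkowski distance of order p (hypothetical helper for this sketch)."""
    return sum(abs(a - b) ** p for a, b in zip(r, s)) ** (1 / p)

# Illustrative vectors: coordinate differences are 3 and 4
r = (1, 2)
s = (4, 6)

# Manhattan distance computed directly as the sum of absolute differences
manhattan = sum(abs(a - b) for a, b in zip(r, s))

# Euclidean distance from the standard library (Python 3.8+)
euclidean = math.dist(r, s)

print(minkowski(r, s, 1), manhattan)   # p=1 matches the Manhattan distance
print(minkowski(r, s, 2), euclidean)   # p=2 matches the Euclidean distance
```

With differences of 3 and 4, the p=1 result is 7 and the p=2 result is 5, which makes the agreement easy to verify by hand.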
Example Calculation
For vectors x = (0, 3, 1, 4) and y = (2, 9, 3, 7) with p = 5, the Minkowski distance is approximately 6.047.
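To see where this value comes from, the calculation can be broken into its individual steps. This is a sketch of the arithmetic only, using the vectors and p from the example; the intermediate variable names are assumptions made for illustration.

```python
# Vectors and order taken from the example
x = (0, 3, 1, 4)
y = (2, 9, 3, 7)
p = 5

# Step 1: absolute coordinate differences |x_i - y_i|
diffs = [abs(a - b) for a, b in zip(x, y)]

# Step 2: raise each difference to the power p
powered = [d ** p for d in diffs]

# Step 3: sum the powered terms
total = sum(powered)

# Step 4: take the p-th root of the sum
result = total ** (1 / p)

print(diffs)             # [2, 6, 2, 3]
print(powered)           # [32, 7776, 32, 243]
print(total)             # 8083
print(f"{result:.3f}")   # 6.047
```

Note that the single difference of 6 contributes 7776 of the 8083 total, which is why the result lies so close to 6.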
Method 1: Using Custom Implementation
We can implement Minkowski distance using functions from Python's built-in math module:
from math import pow, fabs

def minkowski_distance(x, y, p):
    """Calculate Minkowski distance between two vectors"""
    if len(x) != len(y):
        raise ValueError("Vectors must have the same length")
    # Calculate sum of |xi - yi|^p
    distance_sum = sum(pow(fabs(a - b), p) for a, b in zip(x, y))
    # Return pth root
    return pow(distance_sum, 1/p)

# Example vectors
x = (0, 3, 1, 4)
y = (2, 9, 3, 7)
p = 5

result = minkowski_distance(x, y, p)
print(f"Minkowski distance: {result:.3f}")

Output:
Minkowski distance: 6.047
Method 2: Using SciPy
SciPy provides a built-in minkowski() function in the scipy.spatial.distance module:
from scipy.spatial import distance
# Define vectors
x = (0, 3, 1, 4)
y = (2, 9, 3, 7)
# Calculate Minkowski distance with p=5
minkowski_dist = distance.minkowski(x, y, p=5)
print(f"Minkowski distance (p=5): {minkowski_dist:.3f}")
Output:
Minkowski distance (p=5): 6.047
Special Cases of Minkowski Distance
Different values of p produce well-known distance metrics:
from scipy.spatial import distance
x = (0, 3, 1, 4)
y = (2, 9, 3, 7)
# Manhattan distance (p=1)
manhattan = distance.minkowski(x, y, p=1)
print(f"Manhattan distance (p=1): {manhattan}")
# Euclidean distance (p=2)
euclidean = distance.minkowski(x, y, p=2)
print(f"Euclidean distance (p=2): {euclidean:.3f}")
# Higher order distance (p=3)
higher_order = distance.minkowski(x, y, p=3)
print(f"Minkowski distance (p=3): {higher_order:.3f}")
Output:
Manhattan distance (p=1): 13.0
Euclidean distance (p=2): 7.280
Minkowski distance (p=3): 6.374
Comparison of Methods
| Method | Pros | Cons | Best For |
|---|---|---|---|
| Custom Implementation | No external dependencies | More code to maintain | Learning purposes |
| SciPy | Optimized, reliable | Requires SciPy installation | Production applications |
Conclusion
Minkowski distance is a versatile metric that generalizes the Manhattan and Euclidean distances. Use SciPy's implementation for production code, as it is optimized and well tested. The parameter p controls how strongly large coordinate differences influence the result: higher values of p make the distance more sensitive to the single largest difference, and as p grows the distance approaches the Chebyshev distance (the maximum absolute coordinate difference).
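The effect of p can be observed by computing the distance for increasing orders. As p grows, the largest coordinate difference dominates the sum and the result moves toward the Chebyshev distance, max |xi - yi|. The sketch below uses only the standard library; the helper name `minkowski` and the chosen values of p are assumptions for illustration.

```python
def minkowski(x, y, p):
    """Minkowski distance of order p (hypothetical helper for this sketch)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)

# Same example vectors as in the article
x = (0, 3, 1, 4)
y = (2, 9, 3, 7)

# Chebyshev distance: the largest absolute coordinate difference
chebyshev = max(abs(a - b) for a, b in zip(x, y))

# The distance shrinks toward the Chebyshev value as p increases
for p in (1, 2, 5, 20, 100):
    print(f"p={p:>3}: {minkowski(x, y, p):.3f}")
print(f"Chebyshev (max difference): {chebyshev}")
```

For these vectors the largest difference is 6, so the printed distances fall from 13 at p=1 toward 6 as p increases.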
