Program to find out the dot product of two sparse vectors in Python
A sparse vector is a vector in which most elements are zero. The dot product of two vectors is calculated by multiplying corresponding elements and summing the results. Any term with a zero factor contributes nothing to that sum, so for sparse vectors we can skip the zero elements.
Problem Statement
Given two sparse vectors of equal length, represented as lists, we need to calculate their dot product efficiently. Each vector is stored as an object whose nums attribute holds the list of elements.
Example
If vector1 = [1, 0, 0, 0, 1] and vector2 = [0, 0, 0, 1, 1], the dot product is:
1×0 + 0×0 + 0×0 + 0×1 + 1×1 = 1
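The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the helper name `dense_dot` is an illustrative assumption, and it multiplies every pair, zeros included, without any of the skipping described later.

```python
def dense_dot(a, b):
    """Plain dot product: multiply matching positions and sum, zeros included."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


vector1 = [1, 0, 0, 0, 1]
vector2 = [0, 0, 0, 1, 1]

# Indices where both vectors are non-zero; only these terms affect the sum.
contributing = [i for i, (x, y) in enumerate(zip(vector1, vector2)) if x and y]

print(dense_dot(vector1, vector2))  # 1
print(contributing)                 # [4]
```

Only index 4 is non-zero in both vectors, which is why a single term determines the result.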
Algorithm
To solve this efficiently, we follow these steps:
- Initialize result to 0
- For each index and value in the second vector:
  - Skip if the current value is 0
  - Skip if the corresponding value in the first vector is 0
  - Otherwise, multiply the values and add to result
- Return the final result
Implementation
class SparseVector:
    def __init__(self, nums):
        self.nums = nums

    def dotProduct(self, other_vector):
        result = 0
        for i, value in enumerate(other_vector.nums):
            if value == 0:
                continue
            elif self.nums[i] == 0:
                continue
            else:
                result += value * self.nums[i]
        return result

# Create sparse vectors
vector1 = SparseVector([1, 0, 0, 0, 1])
vector2 = SparseVector([0, 0, 0, 1, 1])
# Calculate dot product
dot_product = vector1.dotProduct(vector2)
print("Dot product:", dot_product)
The output of the above code is:
Dot product: 1
Optimized Approach Using Dictionary
For very sparse vectors, we can use a dictionary to store only non-zero elements:
class OptimizedSparseVector:
    def __init__(self, nums):
        # Store only non-zero elements with their indices
        self.non_zero = {i: val for i, val in enumerate(nums) if val != 0}

    def dotProduct(self, other_vector):
        result = 0
        # Iterate through the smaller dictionary for efficiency
        smaller_dict = self.non_zero if len(self.non_zero) <= len(other_vector.non_zero) else other_vector.non_zero
        larger_dict = other_vector.non_zero if smaller_dict is self.non_zero else self.non_zero
        for index, value in smaller_dict.items():
            if index in larger_dict:
                result += value * larger_dict[index]
        return result

# Test with the same vectors
vector1 = OptimizedSparseVector([1, 0, 0, 0, 1])
vector2 = OptimizedSparseVector([0, 0, 0, 1, 1])
dot_product = vector1.dotProduct(vector2)
print("Optimized dot product:", dot_product)
print("Vector1 non-zero elements:", vector1.non_zero)
print("Vector2 non-zero elements:", vector2.non_zero)
The output of the above code is:
Optimized dot product: 1
Vector1 non-zero elements: {0: 1, 4: 1}
Vector2 non-zero elements: {3: 1, 4: 1}
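The benefit is easier to see on longer vectors. The sketch below is a hedged illustration, not part of the original implementation: the length of 100,000 and the chosen indices are arbitrary assumptions, and `to_sparse` and `sparse_dot` are hypothetical helper names that mirror what `OptimizedSparseVector` does.

```python
def to_sparse(nums):
    """Map index -> value for the non-zero entries only."""
    return {i: val for i, val in enumerate(nums) if val != 0}


def sparse_dot(a, b):
    """Dot product of two index -> value dicts, looping over the smaller one."""
    if len(a) > len(b):
        a, b = b, a
    return sum(val * b[i] for i, val in a.items() if i in b)


n = 100_000  # arbitrary length chosen for illustration

dense_a = [0] * n
dense_a[10] = 2
dense_a[50_000] = 3

dense_b = [0] * n
dense_b[10] = 4
dense_b[50_000] = 1
dense_b[99_999] = 5

sparse_a = to_sparse(dense_a)
sparse_b = to_sparse(dense_b)

# Each dictionary holds a handful of entries instead of 100,000 list slots.
print(len(sparse_a), len(sparse_b))    # 2 3
print(sparse_dot(sparse_a, sparse_b))  # 2*4 + 3*1 = 11
```

Here the dot product loop runs twice (once per entry of the smaller dictionary) rather than 100,000 times.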
Performance Comparison
| Approach | Time Complexity | Space Complexity | Best For |
|---|---|---|---|
| Basic Method | O(n) | O(1) | Dense or moderately sparse vectors |
| Dictionary Method | O(min(k1, k2)) | O(k1 + k2) | Very sparse vectors |
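Where n is the vector length, and k1 and k2 are the numbers of non-zero elements in each vector. The dictionary figures describe a single dot product call in the average case, since dictionary lookups are hash-based; building each dictionary from a list takes one additional O(n) pass.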
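A rough way to compare the two approaches on your own data is to time them with the standard `timeit` module. The sketch below is an assumption-laden illustration: the vector size, the spacing of non-zero entries, and the function names `basic_dot` and `dict_dot` are all chosen for the example, and the one-time cost of building the dictionaries is deliberately left out of the timing. Actual numbers will vary by machine.

```python
import timeit


def basic_dot(a, b):
    """List-based method: visits all n positions."""
    result = 0
    for x, y in zip(a, b):
        if x != 0 and y != 0:
            result += x * y
    return result


def dict_dot(a, b):
    """Dictionary-based method: visits only the smaller set of non-zero entries."""
    if len(a) > len(b):
        a, b = b, a
    return sum(val * b[i] for i, val in a.items() if i in b)


n = 100_000  # illustrative size
list_a = [0] * n
list_b = [0] * n
for i in range(0, n, 1000):  # 100 non-zero entries
    list_a[i] = 1
for i in range(0, n, 2000):  # 50 non-zero entries
    list_b[i] = 2

dict_a = {i: v for i, v in enumerate(list_a) if v != 0}
dict_b = {i: v for i, v in enumerate(list_b) if v != 0}

# Both methods must agree before the timings mean anything.
assert basic_dot(list_a, list_b) == dict_dot(dict_a, dict_b)

basic_time = timeit.timeit(lambda: basic_dot(list_a, list_b), number=20)
dict_time = timeit.timeit(lambda: dict_dot(dict_a, dict_b), number=20)
print(f"basic: {basic_time:.4f}s  dictionary: {dict_time:.4f}s")
```

If each vector is used only once, add the dictionary construction to the timed section, because that O(n) pass can cancel out the gain.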
Conclusion
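The basic approach is simple and needs no extra memory, which suits dense or moderately sparse vectors. The dictionary-based method pays off when the vectors are very sparse, particularly when the same vectors are reused across many dot products so that the cost of building the dictionaries is spread out. Choose based on your data's sparsity level.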
