# Calculating the Hamming distance using SciPy

Hamming distance measures the distance between two binary vectors of equal length, also called binary strings or bitstrings. Binary strings commonly arise when we apply one-hot encoding to categorical columns of data. In one-hot encoding, the integer-encoded variable is removed and a new binary variable is added for each unique integer value. For example, suppose a column has the categories ‘Length’, ‘Width’, and ‘Breadth’. We might one-hot encode each example as a bitstring with one bit for each new binary column as follows −

Length = [1, 0, 0]

Width = [0, 1, 0]

Breadth = [0, 0, 1]

The Hamming distance between any two of the categories above can be calculated as the sum or the average of the bit differences between the two binary strings. For example, Length and Width differ in 2 of their 3 positions, so their Hamming distance is 2 as a sum, or 2/3 ≈ 0.667 as an average.
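The sum and average forms described above can be sketched in plain Python, without SciPy. This is only an illustrative sketch; the variable names `mismatches` and `fraction` are chosen here for the example and do not come from any library.

```python
# One-hot vectors for two of the categories listed above
length = [1, 0, 0]
width = [0, 1, 0]

# "Sum" form: count the positions where the two vectors disagree
mismatches = sum(a != b for a, b in zip(length, width))

# "Average" form: divide the count by the vector length
fraction = mismatches / len(length)

print(mismatches)          # 2
print(round(fraction, 3))  # 0.667
```

The two forms carry the same information; the average is simply the count normalised by the vector length, which makes distances comparable across vectors of different sizes.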

Hamming distance can also measure the similarity between character strings, such as the values of categorical variables. For example, suppose we have two strings −

**“Google”** and **“Goagle”**

Both strings have the same length, so we can calculate the Hamming distance between them by comparing their characters position by position. The first and second characters are the same in both strings. The third character is different, and all the remaining characters are the same, hence the Hamming distance between the two strings is 1.

The Hamming distance is only defined for strings of the same length. The larger the Hamming distance between two strings, the more dissimilar they are, and vice versa.
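The character-by-character comparison and the equal-length requirement can be sketched as a small helper. The function name `hamming_distance` is hypothetical, and raising `ValueError` for unequal lengths is an assumption made for this sketch rather than behaviour taken from any library.

```python
def hamming_distance(s1, s2):
    """Count the positions at which two equal-length strings differ."""
    # The distance is undefined for strings of different lengths
    if len(s1) != len(s2):
        raise ValueError("Hamming distance requires strings of equal length")
    # Compare the strings position by position and count the mismatches
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))


print(hamming_distance("Google", "Goagle"))  # 1
```

Identical strings give a distance of 0, and every additional mismatched position raises the distance by 1.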

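Let’s see how we can calculate the Hamming distance of two strings using the SciPy library. SciPy's distance.hamming function returns the proportion of positions that differ, so we multiply the result by the string length to get the number of differing positions −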

## Example

    # Importing the distance module from the SciPy library
    from scipy.spatial import distance

    # Defining the strings
    A = 'Google'
    B = 'Goagle'

    # distance.hamming returns the fraction of positions that differ,
    # so multiply by the string length to get the number of mismatches
    hamming_distance = distance.hamming(list(A), list(B)) * len(A)
    print('Hamming Distance b/w', A, 'and', B, 'is:', hamming_distance)

## Output

Hamming Distance b/w Google and Goagle is: 1.0
