SequenceMatcher in Python for Longest Common Substring

The SequenceMatcher class is part of Python's difflib module. It compares sequences (such as lists or strings) and finds similarities between them.

The task is to find the longest common substring: the longest run of characters that appears contiguously in both strings. This differs from the longest common subsequence, where the characters must appear in the same order but need not be contiguous.
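To make the distinction concrete, the sketch below computes both quantities with textbook dynamic programming. The helper names common_substring_dp and common_subsequence_length are illustrative only and are not part of difflib.

```python
def common_substring_dp(s, t):
    """Return the longest contiguous run shared by s and t (first one found)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(t) + 1)
    for i in range(1, len(s) + 1):
        curr = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                # Extend the run that ended at the previous pair of characters.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return s[best_end - best_len:best_end]


def common_subsequence_length(s, t):
    """Return the length of the longest common subsequence of s and t."""
    prev = [0] * (len(t) + 1)
    for ch in s:
        curr = [0]
        for j, tj in enumerate(t, 1):
            if ch == tj:
                curr.append(prev[j - 1] + 1)
            else:
                # Gaps are allowed, so carry the best value seen so far.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


print(common_substring_dp("abcd", "axbxcd"))
print(common_subsequence_length("abcd", "axbxcd"))
```

For these inputs the longest common substring is "cd" (length 2), while the longest common subsequence is "abcd" (length 4), because the subsequence may skip over the "x" characters.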

Using the find_longest_match() Method

The find_longest_match() method finds the longest block of contiguous elements that matches in both sequences. It returns a Match named tuple with three fields: a (start position in the first sequence), b (start position in the second sequence), and size (length of the match).
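Because Match is a named tuple, its fields can be read by name or unpacked. A small illustration with made-up strings, showing that the b field locates the same text in the second sequence:

```python
from difflib import SequenceMatcher

x = "abcde"
y = "xxabc"
match = SequenceMatcher(None, x, y).find_longest_match(0, len(x), 0, len(y))

# Fields can be read by name or unpacked positionally.
a_start, b_start, size = match
print(match.a, match.b, match.size)

# Both slices contain the same matched text.
print(x[a_start:a_start + size], y[b_start:b_start + size])
```

Here the shared text "abc" starts at index 0 in x and at index 2 in y.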

Syntax

SequenceMatcher.find_longest_match(alo, ahi, blo, bhi)

Parameters:

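  • alo, ahi: bounds of the slice a[alo:ahi] to search in the first sequence
  • blo, bhi: bounds of the slice b[blo:bhi] to search in the second sequence

Since Python 3.9 these arguments default to the full sequences, so find_longest_match() can also be called with no arguments.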
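The effect of the range arguments can be sketched by searching the same pair of strings twice, once over the full first string and once over only its first three characters. The strings are invented for illustration.

```python
from difflib import SequenceMatcher

x = "abcabcd"
y = "abcd"
matcher = SequenceMatcher(None, x, y)

# Search all of x against all of y.
full = matcher.find_longest_match(0, len(x), 0, len(y))

# Search only x[0:3], so the later "abcd" in x is out of range.
restricted = matcher.find_longest_match(0, 3, 0, len(y))

print(x[full.a:full.a + full.size])
print(x[restricted.a:restricted.a + restricted.size])
```

The full search reports "abcd" starting at index 3 of x, while the restricted search can only report "abc" from the start of x.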

Example 1: Basic String Matching

Find the longest common substring between "abcde" and "abghf":

from difflib import SequenceMatcher

x = "abcde"
y = "abghf"
matcher = SequenceMatcher(None, x, y)
result = matcher.find_longest_match(0, len(x), 0, len(y))
print("Result:", x[result.a : result.a + result.size])
print("Match details: start={}, size={}".format(result.a, result.size))

Output:

Result: ab
Match details: start=0, size=2

Example 2: No Common Substring

When the strings share no characters, the match has size 0 and the extracted slice is an empty string:

from difflib import SequenceMatcher

x = "xyz"
y = "efg"
matcher = SequenceMatcher(None, x, y)
result = matcher.find_longest_match(0, len(x), 0, len(y))
match = x[result.a : result.a + result.size]
print("Result: '{}'".format(match))
print("Size:", result.size)

Output:

Result: ''
Size: 0

Example 3: Case-Sensitive Matching

SequenceMatcher compares elements for equality, so string matching is case-sensitive. The following finds the longest common substring between 'Welcome' and 'weLCome':

from difflib import SequenceMatcher

x = "Welcome"
y = "weLCome"
matcher = SequenceMatcher(None, x, y)
result = matcher.find_longest_match(0, len(x), 0, len(y))
print("Result:", x[result.a : result.a + result.size])
print("Position in x: {}, Position in y: {}".format(result.a, result.b))

Output:

Result: ome
Position in x: 4, Position in y: 4
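If case should be ignored, one possible approach is to match lowercased copies and then slice the original string. The helper name longest_common_substring_ignore_case is hypothetical, and the sketch assumes lower() does not change the length of either string, which holds for ASCII text.

```python
from difflib import SequenceMatcher


def longest_common_substring_ignore_case(str1, str2):
    # Hypothetical helper: compare lowercased copies, slice the original.
    # Assumes lower() keeps each string's length unchanged (true for ASCII).
    matcher = SequenceMatcher(None, str1.lower(), str2.lower())
    result = matcher.find_longest_match(0, len(str1), 0, len(str2))
    return str1[result.a:result.a + result.size]


print(longest_common_substring_ignore_case("Welcome", "weLCome"))
```

With case ignored, the two strings match in full, and the function returns the text as it is spelled in the first argument.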

Example 4: Practical Function

A reusable function for finding the longest common substring:

from difflib import SequenceMatcher

def longest_common_substring(str1, str2):
    matcher = SequenceMatcher(None, str1, str2)
    result = matcher.find_longest_match(0, len(str1), 0, len(str2))
    return str1[result.a : result.a + result.size]

# Test with different examples
examples = [
    ("programming", "graming"),
    ("hello world", "yellow"),
    ("python", "java")
]

for s1, s2 in examples:
    lcs = longest_common_substring(s1, s2)
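    print(f"'{s1}' & '{s2}' -> '{lcs}'")

Output:

'programming' & 'graming' -> 'gram'
'hello world' & 'yellow' -> 'ello'
'python' & 'java' -> ''

For the first pair, 'gram' and 'ming' are both four characters long. When several matches tie for longest, find_longest_match() returns the one that starts earliest in the first sequence, so the result is 'gram'.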
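One caveat for longer inputs: when the second sequence has 200 or more elements, SequenceMatcher's autojunk heuristic ignores elements that occur very frequently, which can hide the true longest common substring. The sketch below uses contrived strings to show the difference; passing autojunk=False disables the heuristic.

```python
from difflib import SequenceMatcher

x = "zxyxyxy"
y = "xy" * 150  # 300 characters, long enough for the autojunk heuristic to apply

default = SequenceMatcher(None, x, y)
exact = SequenceMatcher(None, x, y, autojunk=False)

m_default = default.find_longest_match(0, len(x), 0, len(y))
m_exact = exact.find_longest_match(0, len(x), 0, len(y))

print(m_default.size)
print(x[m_exact.a:m_exact.a + m_exact.size])
```

With these inputs the default matcher reports a match of size 0, because both 'x' and 'y' are treated as too frequent to index, while the autojunk=False matcher finds 'xyxyxy'.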

Key Points

| Feature | Description |
|---|---|
| Case Sensitivity | Elements are compared for equality, so string matching is case-sensitive |
| Return Type | Match named tuple with a, b, and size fields |
| Empty Result | size is 0 when no common substring exists |
| Contiguous | Matches consecutive elements only |

Conclusion

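SequenceMatcher's find_longest_match() method finds the longest common substring of two sequences without hand-written dynamic programming. Matching is case-sensitive, and the returned Match gives the start position in each sequence along with the match length. For sequences of 200 or more elements, pass autojunk=False so that frequently occurring elements are not ignored.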

Updated on: 2026-03-24T20:58:34+05:30
