Python Helpers for Computing Deltas


The difflib module in Python's standard library computes deltas between sequences. It is commonly used to compare files, and it can report the differences in several formats, including HTML, context diffs, and unified diffs.

We need to first import the difflib module before using it −

import difflib
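
As a quick illustration of the unified diff format mentioned above, here is a minimal sketch using difflib.unified_diff(). The two line lists and the file names old.txt and new.txt are made up for this example; they are not read from disk.

```python
import difflib

# Two hypothetical versions of a small text file, given as lists of lines.
old_lines = ["alpha\n", "beta\n", "gamma\n"]
new_lines = ["alpha\n", "beta (edited)\n", "gamma\n", "delta\n"]

# unified_diff() yields the delta lazily, one line at a time.
# The fromfile/tofile labels only appear in the diff header.
diff_lines = list(
    difflib.unified_diff(old_lines, new_lines, fromfile="old.txt", tofile="new.txt")
)

print("".join(diff_lines), end="")
```

Lines starting with - exist only in the first version, and lines starting with + exist only in the second.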

Class (difflib.SequenceMatcher)

This class compares pairs of sequences of any type, provided the sequence elements are hashable. Some of its methods −

  • set_seqs(a, b) − Set the two sequences to be compared. SequenceMatcher computes and caches detailed information about the second sequence, so to compare one sequence against many others, set it once with set_seq2() and then call set_seq1() repeatedly, once for each of the other sequences.

  • set_seq1(a) − Set the first sequence to be compared. The second sequence is left unchanged.

  • set_seq2(b) − Set the second sequence to be compared. The first sequence is left unchanged.

  • find_longest_match(alo, ahi, blo, bhi) − Find the longest matching block in the slice a[alo:ahi] of the first sequence and the slice b[blo:bhi] of the second sequence.

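  • get_matching_blocks() − Return a list of triples (i, j, n) describing the non-overlapping matching subsequences, in increasing order of i and j. The last triple is a dummy whose size n is 0.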

  • ratio() − Return a measure of the sequences' similarity as a float in the range [0, 1].
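
The methods listed above can be combined as in the following sketch. The query string and the candidate list are hypothetical data chosen for illustration. The sketch sets the second sequence once, because that is the one SequenceMatcher caches information about, and then swaps only the first sequence.

```python
import difflib

# Hypothetical data: one query compared against several candidates.
query = "abxcd"
candidates = ["abcd", "xyz", "abxcd"]

matcher = difflib.SequenceMatcher()

# Details about the second sequence are cached, so set it once...
matcher.set_seq2(query)

scores = {}
for candidate in candidates:
    # ...and replace only the first sequence on each iteration.
    matcher.set_seq1(candidate)
    scores[candidate] = matcher.ratio()

# Inspect the matches between "abcd" and the query in more detail.
matcher.set_seq1("abcd")
longest = matcher.find_longest_match(0, 4, 0, 5)  # a[0:4] against b[0:5]
blocks = matcher.get_matching_blocks()

print(scores)
print(longest)
print(blocks)
```

The final entry returned by get_matching_blocks() is the zero-size dummy block, so code that consumes the list usually skips it.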

Return a measure of the sequences’ similarity

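The ratio() method of the SequenceMatcher class returns a measure of the sequences' similarity. If T is the total number of elements in both sequences and M is the number of matches, the result is 2.0*M/T −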

Example

import difflib

# Compare two short strings; None means no elements are treated as junk
s = difflib.SequenceMatcher(None, "abcd", "bcde")
print("Ratio =", s.ratio())

Output

Ratio = 0.75
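
To see where 0.75 comes from, the value can be recomputed by hand from the matching blocks. This is a sketch of the 2.0*M/T formula, where M is the number of matched elements and T is the combined length of both sequences. The variable names matched, total and manual_ratio are chosen for this example only.

```python
import difflib

a, b = "abcd", "bcde"
s = difflib.SequenceMatcher(None, a, b)

# M: number of matched elements, summed over all matching blocks
# (the trailing dummy block has size 0 and adds nothing).
matched = sum(block.size for block in s.get_matching_blocks())

# T: total number of elements in both sequences.
total = len(a) + len(b)

manual_ratio = 2.0 * matched / total
print(matched, total, manual_ratio)  # 3 8 0.75
```

Here the only matching block is "bcd", so M is 3 and T is 8.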

Return an upper bound on ratio

The quick_ratio() and real_quick_ratio() methods return upper bounds on ratio() and are cheaper to compute. The following code prints all three values −

Example

import difflib

s = difflib.SequenceMatcher(None, "abcd", "bcde")
print("Ratio =", s.ratio())
# Upper bounds on ratio(), from cheaper to cheapest
print("Quick Ratio =", s.quick_ratio())
print("Real Quick Ratio =", s.real_quick_ratio())

Output

Ratio = 0.75
Quick Ratio = 0.75
Real Quick Ratio = 1.0
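
Because both quick methods are upper bounds, one plausible use is to reject dissimilar pairs early and call the more expensive ratio() only when the cheaper checks pass. The helper similar_enough() and its threshold of 0.6 are assumptions made for this sketch; they are not part of difflib.

```python
import difflib

def similar_enough(a, b, threshold=0.6):
    """Return True if ratio() for a and b reaches the threshold.

    Hypothetical helper: checks the cheapest upper bound first, so that
    ratio() is computed only when the bounds do not rule the pair out.
    """
    s = difflib.SequenceMatcher(None, a, b)
    return (
        s.real_quick_ratio() >= threshold
        and s.quick_ratio() >= threshold
        and s.ratio() >= threshold
    )

close_pair = similar_enough("abcd", "bcde")    # ratio() is 0.75
no_overlap = similar_enough("abcd", "wxyz")    # rejected by quick_ratio()
length_gap = similar_enough("abcd", "a")       # rejected by real_quick_ratio()
print(close_pair, no_overlap, length_gap)
```

The short-circuiting and operators stop at the first check that fails, so the order of the three calls matters.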

Get the ratio of the sequence matching

The following code computes the similarity ratio of two strings, treating spaces as junk, and prints each matching block −

Example

import difflib

myStr1 = 'Python Programming'
myStr2 = 'Python Standard Library'

# The SequenceMatcher compares sequences; the lambda marks spaces as junk
seq_match = difflib.SequenceMatcher(lambda x: x==' ', myStr1, myStr2)
print("Ratio of the sequence matching = " + str(round(seq_match.ratio(), 3)))

# Each block is a named tuple Match(a, b, size)
for match_block in seq_match.get_matching_blocks():
    print(match_block)

Output

Ratio of the sequence matching = 0.488
Match(a=0, b=0, size=7)
Match(a=8, b=13, size=1)
Match(a=11, b=19, size=2)
Match(a=18, b=23, size=0)
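
Each Match(a, b, size) triple says that a slice of the first string starting at index a equals a slice of the second string starting at index b, both of length size. The following sketch turns the blocks back into the shared text; the list name shared is chosen for this example.

```python
import difflib

myStr1 = 'Python Programming'
myStr2 = 'Python Standard Library'
seq_match = difflib.SequenceMatcher(lambda x: x == ' ', myStr1, myStr2)

shared = []
for block in seq_match.get_matching_blocks():
    if block.size == 0:
        continue  # skip the zero-size dummy block at the end
    piece_a = myStr1[block.a:block.a + block.size]
    piece_b = myStr2[block.b:block.b + block.size]
    # Both slices hold the same text by definition of a matching block.
    assert piece_a == piece_b
    shared.append(piece_a)

print(shared)
```

The sizes of these pieces add up to 10 matched characters, which gives 2.0*10/41, or about 0.488, for strings of length 18 and 23.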

Updated on: 11-Aug-2022
