How to compare two different files line by line in Python?

Comparing two files line by line is a common task in Python programming. This tutorial explores different methods to compare files, from basic line-by-line comparison to using specialized modules like filecmp and difflib.

Basic Line-by-Line Comparison

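The simplest approach opens both files with open(), reads their lines, and compares them pair by pair. This method gives you full control over the comparison logic. Note that the example calls strip() on every line, so differences in leading or trailing whitespace are ignored, and that readlines() loads each file fully into memory.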

Example

Here's how to compare two files and report the lines that differ:

# Create sample files for demonstration
with open('file1.txt', 'w') as f1:
    f1.write("Line 1\nLine 2\nLine 3\nLine 4")

with open('file2.txt', 'w') as f2:
    f2.write("Line 1\nDifferent Line 2\nLine 3\nLine 5")

# Compare files line by line
with open('file1.txt', 'r') as file1, open('file2.txt', 'r') as file2:
    lines1 = file1.readlines()
    lines2 = file2.readlines()
    
    max_lines = max(len(lines1), len(lines2))
    
    for i in range(max_lines):
        line1 = lines1[i].strip() if i < len(lines1) else ""
        line2 = lines2[i].strip() if i < len(lines2) else ""
        
        if line1 != line2:
            print(f"Line {i+1} doesn't match:")
            print(f"File1: {line1}")
            print(f"File2: {line2}")
            print("-" * 30)

Output

Line 2 doesn't match:
File1: Line 2
File2: Different Line 2
------------------------------
Line 4 doesn't match:
File1: Line 4
File2: Line 5
------------------------------
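
For large files, the same comparison can be done without reading everything into memory. The following is a minimal sketch, not the only way to do it: it uses itertools.zip_longest to walk both files in step and reports None when one file has ended, so a missing line is not confused with an empty one. The function name iter_mismatches and the file names are assumptions made for illustration.

```python
from itertools import zip_longest
from pathlib import Path
import tempfile


def iter_mismatches(path1, path2):
    """Yield (line_number, line1, line2) for each pair of lines that differ.

    line1 or line2 is None when that file has no line at this position.
    """
    with open(path1, "r") as f1, open(path2, "r") as f2:
        # File objects are iterated lazily, one line at a time.
        # zip_longest pads the shorter file with None (its default fill value).
        for number, (a, b) in enumerate(zip_longest(f1, f2), start=1):
            a = a.rstrip("\n") if a is not None else None
            b = b.rstrip("\n") if b is not None else None
            if a != b:
                yield number, a, b


# Demonstration with throwaway files (hypothetical names and contents).
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp, "first.txt")
    second = Path(tmp, "second.txt")
    first.write_text("Line 1\nLine 2\nLine 3\n")
    second.write_text("Line 1\nLine two\nLine 3\nLine 4\n")

    mismatches = list(iter_mismatches(first, second))

for number, a, b in mismatches:
    print(number, repr(a), repr(b))
# 2 'Line 2' 'Line two'
# 4 None 'Line 4'
```

Unlike the example above, this sketch only removes the trailing newline, so whitespace differences inside a line are reported.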

Using the filecmp Module

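The filecmp module provides a quick way to check whether two files are identical. The filecmp.cmp() function returns True if the files are considered equal and False otherwise. By default (shallow=True) it first compares the os.stat() signatures of the files (type, size, and modification time) and treats files with identical signatures as equal without reading them; pass shallow=False to always compare the contents.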

Example

Here's how to use filecmp for file comparison:

import filecmp

# Create test files
with open('identical1.txt', 'w') as f:
    f.write("Same content\nSecond line")

with open('identical2.txt', 'w') as f:
    f.write("Same content\nSecond line")

with open('different.txt', 'w') as f:
    f.write("Different content\nSecond line")

def compare_files(file1_path, file2_path):
    result = filecmp.cmp(file1_path, file2_path)
    
    if result:
        print(f"{file1_path} and {file2_path} are identical.")
    else:
        print(f"{file1_path} and {file2_path} are different.")

# Test comparisons
compare_files('identical1.txt', 'identical2.txt')
compare_files('identical1.txt', 'different.txt')

Output

identical1.txt and identical2.txt are identical.
identical1.txt and different.txt are different.
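
When the contents must be compared regardless of file metadata, the shallow parameter of filecmp.cmp() can be set to False. The sketch below illustrates this with temporary files; the file names and contents are made up for the demonstration.

```python
import filecmp
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    a = Path(tmp, "a.txt")
    b = Path(tmp, "b.txt")
    c = Path(tmp, "c.txt")
    a.write_text("Same content\nSecond line")
    b.write_text("Same content\nSecond line")
    c.write_text("Same content\nSecond LINE")  # same size, different bytes

    # shallow=False makes cmp() compare the file contents
    # instead of trusting matching os.stat() signatures.
    same_ab = filecmp.cmp(a, b, shallow=False)
    same_ac = filecmp.cmp(a, c, shallow=False)

print(same_ab)  # True
print(same_ac)  # False
```

Note that filecmp only answers whether the files are equal; it does not say where they differ.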

Using the difflib Module

The difflib module computes the differences between sequences of lines and can report them in several formats, showing exactly which lines were added, removed, or changed.

Using unified_diff()

The unified_diff() function produces output in the unified diff format, the same format printed by diff -u on Unix:

import difflib

# Create sample files
with open('original.txt', 'w') as f:
    f.write("import os\nimport sys\nprint('Hello World')")

with open('modified.txt', 'w') as f:
    f.write("import os\nimport datetime\nprint('Hello Python')")

with open('original.txt', 'r') as file1, open('modified.txt', 'r') as file2:
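    # splitlines() drops the trailing newlines, which matches lineterm=''
    # and avoids blank lines in the printed diff
    file1_lines = file1.read().splitlines()
    file2_lines = file2.read().splitlines()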
    
    diff = difflib.unified_diff(
        file1_lines, file2_lines,
        fromfile='original.txt',
        tofile='modified.txt',
        lineterm=''
    )
    
    for line in diff:
        print(line)

Output

--- original.txt
+++ modified.txt
@@ -1,3 +1,3 @@
 import os
-import sys
-print('Hello World')
+import datetime
+print('Hello Python')
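
The amount of unchanged context shown around each change is controlled by the n parameter of unified_diff(), which defaults to 3. The sketch below compares the default with n=0 on in-memory lines; the helper name unified and the sample lines are assumptions for illustration.

```python
import difflib

# Hypothetical sample data: only the last line differs.
original = ["import os", "import sys", "import json", "print('Hello World')"]
modified = ["import os", "import sys", "import json", "print('Hello Python')"]


def unified(a, b, context):
    """Return the unified diff of two line lists with `context` lines of context."""
    return list(
        difflib.unified_diff(
            a, b,
            fromfile="original.txt",
            tofile="modified.txt",
            n=context,    # unchanged lines shown around each change
            lineterm="",  # the inputs carry no newlines, so the headers should not either
        )
    )


default_context = unified(original, modified, 3)
no_context = unified(original, modified, 0)

print("\n".join(no_context))
# --- original.txt
# +++ modified.txt
# @@ -4 +4 @@
# -print('Hello World')
# +print('Hello Python')
```

With the default of three context lines, the same diff also lists the three unchanged import lines before the change.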

Using Differ Class

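The Differ class produces a line-by-line delta in which every line starts with a two-character code: '- ' for lines found only in the first file, '+ ' for lines found only in the second, two spaces for lines common to both, and '? ' for hint lines that mark differences within a line. The example prints only the removed and added lines: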

from difflib import Differ

# Create test files
with open('text1.txt', 'w') as f:
    f.write("Python programming\nData analysis\nMachine learning")

with open('text2.txt', 'w') as f:
    f.write("Python programming\nWeb development\nMachine learning")

with open('text1.txt', 'r') as file1, open('text2.txt', 'r') as file2:
    differ = Differ()
    
    lines1 = file1.readlines()
    lines2 = file2.readlines()
    
    for line in differ.compare(lines1, lines2):
        if line.startswith('- ') or line.startswith('+ '):
            print(line.strip())

Output

- Data analysis
+ Web development
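
Differ output does not include line numbers, but they can be derived by counting the prefixes. The following sketch shows one possible way; the function name numbered_changes is hypothetical and not part of difflib.

```python
from difflib import Differ


def numbered_changes(lines1, lines2):
    """Return (sign, line_number, text) tuples for removed and added lines.

    Removed lines are numbered by their position in lines1,
    added lines by their position in lines2.
    """
    changes = []
    n1 = n2 = 0  # current line number in each input
    for entry in Differ().compare(lines1, lines2):
        code, text = entry[:2], entry[2:]
        if code == "  ":    # line present in both inputs
            n1 += 1
            n2 += 1
        elif code == "- ":  # line only in the first input
            n1 += 1
            changes.append(("-", n1, text))
        elif code == "+ ":  # line only in the second input
            n2 += 1
            changes.append(("+", n2, text))
        # "? " hint lines carry no content of their own and are skipped
    return changes


text1 = ["Python programming", "Data analysis", "Machine learning"]
text2 = ["Python programming", "Web development", "Machine learning"]

changes = numbered_changes(text1, text2)
print(changes)
# [('-', 2, 'Data analysis'), ('+', 2, 'Web development')]
```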

Comparison of Methods

Method | Best For | Output Detail | Performance
--- | --- | --- | ---
Manual comparison | Custom logic | Custom format | Good
filecmp.cmp() | Quick identity check | Boolean result | Excellent
difflib.unified_diff() | Unix-style diffs | Unified format | Good
difflib.Differ | Detailed analysis | Line-by-line details | Moderate

Conclusion

Python offers multiple approaches for file comparison. Use filecmp for quick identity checks, difflib for detailed difference analysis, and manual comparison for custom requirements. Choose the method based on your specific needs and performance requirements.
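
These methods can also be combined. The sketch below, which is only one possible arrangement, uses filecmp for the cheap equality check and falls back to difflib when the files differ. The function name describe_difference and the sample files are assumptions for illustration.

```python
import difflib
import filecmp
import tempfile
from pathlib import Path


def describe_difference(path1, path2):
    """Return [] for identical files, otherwise the unified diff as a list of lines."""
    # Content-based equality check; no diff is computed for identical files.
    if filecmp.cmp(path1, path2, shallow=False):
        return []
    a = Path(path1).read_text().splitlines()
    b = Path(path2).read_text().splitlines()
    # Caveat: files that differ only in line endings or a final newline
    # also produce an empty diff here, because splitlines() discards them.
    return list(
        difflib.unified_diff(a, b, fromfile=str(path1), tofile=str(path2), lineterm="")
    )


with tempfile.TemporaryDirectory() as tmp:
    one = Path(tmp, "one.txt")
    copy = Path(tmp, "copy.txt")
    other = Path(tmp, "other.txt")
    one.write_text("Python programming\nData analysis\n")
    copy.write_text("Python programming\nData analysis\n")
    other.write_text("Python programming\nWeb development\n")

    same_result = describe_difference(one, copy)
    diff_result = describe_difference(one, other)

print(same_result)  # []
for line in diff_result:
    print(line)
```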

Updated on: 2026-03-24T18:31:19+05:30
