How to compare files in Python
Python's filecmp module provides tools for comparing files and directories. It offers two functions and one class: cmp() for comparing a pair of files, cmpfiles() for comparing a list of same-named files across two directories, and the dircmp class for comparing whole directories.
Basic File Comparison with cmp()
The filecmp.cmp() function compares two files and returns True if they seem equal, False otherwise:
import filecmp
import os
# Create test files
with open('file1.txt', 'w') as f:
    f.write('Hello World')
with open('file2.txt', 'w') as f:
    f.write('Hello World')
with open('file3.txt', 'w') as f:
    f.write('Different content')
# Compare identical files
print("Identical files:", filecmp.cmp('file1.txt', 'file2.txt'))
# Compare different files
print("Different files:", filecmp.cmp('file1.txt', 'file3.txt'))
# Shallow vs deep comparison
print("Shallow comparison:", filecmp.cmp('file1.txt', 'file2.txt', shallow=True))
print("Deep comparison:", filecmp.cmp('file1.txt', 'file2.txt', shallow=False))
Identical files: True
Different files: False
Shallow comparison: True
Deep comparison: True
Parameters
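file1, file2 − Paths of the files to compare
shallow − If True (default), files whose os.stat() signatures (file type, size, and modification time) match are treated as equal without being read, and contents are compared only when the signatures differ; if False, the result is based on the file contents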
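The difference between the two modes only shows up when two files share a signature but not their content. The following is a minimal sketch of that case, assuming temporary files with illustrative names (a.txt, b.txt) and an arbitrary fixed timestamp set with os.utime() so that both signatures match:

```python
import filecmp
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path_a = os.path.join(tmp, "a.txt")  # illustrative file names
    path_b = os.path.join(tmp, "b.txt")

    # Same length, different content
    with open(path_a, "w") as f:
        f.write("aaaa")
    with open(path_b, "w") as f:
        f.write("bbbb")

    # Force identical timestamps so both os.stat() signatures match
    fixed_time = 1_000_000_000  # arbitrary value chosen for the demo
    os.utime(path_a, (fixed_time, fixed_time))
    os.utime(path_b, (fixed_time, fixed_time))

    shallow_result = filecmp.cmp(path_a, path_b, shallow=True)
    deep_result = filecmp.cmp(path_a, path_b, shallow=False)

print("shallow=True :", shallow_result)
print("shallow=False:", deep_result)
```

Here the shallow call reports the files as equal while the content-based call does not, which is why shallow=False is the safer choice when the comparison is meant to verify data.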
Comparing Multiple Files with cmpfiles()
Use filecmp.cmpfiles() to compare a list of file names in two directories. It returns three lists: names that match, names that differ, and names that could not be compared:
import filecmp
import os
# Create test directories
os.makedirs('dir1', exist_ok=True)
os.makedirs('dir2', exist_ok=True)
# Create test files in both directories
with open('dir1/same.txt', 'w') as f:
    f.write('Same content')
with open('dir2/same.txt', 'w') as f:
    f.write('Same content')
with open('dir1/different.txt', 'w') as f:
    f.write('Content A')
with open('dir2/different.txt', 'w') as f:
    f.write('Content B')
# Compare files
common_files = ['same.txt', 'different.txt']
match, mismatch, errors = filecmp.cmpfiles('dir1', 'dir2', common_files)
print("Matched files:", match)
print("Mismatched files:", mismatch)
print("Error files:", errors)
Matched files: ['same.txt']
Mismatched files: ['different.txt']
Error files: []
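The third list stays empty in the run above because every name exists on both sides. As a rough sketch of how it gets populated, the example below adds a name that is missing from the second directory and passes shallow=False so the result depends on content. The directory names, the file missing.txt, and the write() helper are illustrative assumptions:

```python
import filecmp
import os
import tempfile


def write(directory, name, text):
    """Hypothetical helper: create a small text file inside a directory."""
    with open(os.path.join(directory, name), "w") as f:
        f.write(text)


with tempfile.TemporaryDirectory() as tmp:
    left = os.path.join(tmp, "left")
    right = os.path.join(tmp, "right")
    os.makedirs(left)
    os.makedirs(right)

    write(left, "same.txt", "Same content")
    write(right, "same.txt", "Same content")
    write(left, "different.txt", "Content A")
    write(right, "different.txt", "Content B")
    write(left, "missing.txt", "Only on the left")  # absent from the right side

    names = ["same.txt", "different.txt", "missing.txt"]
    match, mismatch, errors = filecmp.cmpfiles(left, right, names, shallow=False)

print("Matched files:", match)
print("Mismatched files:", mismatch)
print("Error files:", errors)
```

A name that cannot be read in one of the directories lands in the error list instead of raising an exception, so it is worth checking that list before trusting the other two.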
Directory Comparison with dircmp()
The filecmp.dircmp class compares two directories and exposes the results as attributes:
import filecmp
import os
# Create more complex directory structure
os.makedirs('dirA/subdir', exist_ok=True)
os.makedirs('dirB/subdir', exist_ok=True)
# Create various test files
with open('dirA/common.txt', 'w') as f:
    f.write('Same content')
with open('dirB/common.txt', 'w') as f:
    f.write('Same content')
with open('dirA/only_in_A.txt', 'w') as f:
    f.write('Only in A')
with open('dirB/only_in_B.txt', 'w') as f:
    f.write('Only in B')
# Compare directories
dc = filecmp.dircmp('dirA', 'dirB')
print("Files only in dirA:", dc.left_only)
print("Files only in dirB:", dc.right_only)
print("Common files:", dc.common_files)
print("Identical files:", dc.same_files)
print("Different files:", dc.diff_files)
Files only in dirA: ['only_in_A.txt']
Files only in dirB: ['only_in_B.txt']
Common files: ['common.txt']
Identical files: ['common.txt']
Different files: []
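The attributes above describe only the top level of the two directories. Nested directories are reachable through the subdirs attribute, a dictionary that maps each common subdirectory name to its own dircmp object. One possible way to gather differences across a whole tree is sketched below. The collect_differences() and write() helpers and the file layout are hypothetical and not part of filecmp:

```python
import filecmp
import os
import tempfile


def collect_differences(dc, prefix=""):
    """Hypothetical helper: gather differing names from dc and all its subdirs."""
    result = {"left_only": [], "right_only": [], "diff_files": []}
    for key in result:
        result[key] = [os.path.join(prefix, name) for name in getattr(dc, key)]
    # dc.subdirs maps each common subdirectory name to another dircmp object
    for name, sub_dc in dc.subdirs.items():
        nested = collect_differences(sub_dc, os.path.join(prefix, name))
        for key in result:
            result[key].extend(nested[key])
    return result


def write(path, text):
    """Hypothetical helper: create parent directories, then write a text file."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(text)


with tempfile.TemporaryDirectory() as tmp:
    left = os.path.join(tmp, "left")
    right = os.path.join(tmp, "right")
    write(os.path.join(left, "common.txt"), "Same content")
    write(os.path.join(right, "common.txt"), "Same content")
    write(os.path.join(left, "sub", "notes.txt"), "v1")
    write(os.path.join(right, "sub", "notes.txt"), "version 2")
    write(os.path.join(left, "sub", "extra.txt"), "left only")

    # dircmp computes its attributes lazily, so collect while the files exist
    differences = collect_differences(filecmp.dircmp(left, right))

print(differences)
```

The nested files in this sketch deliberately differ in size, because dircmp relies on the shallow comparison by default and would otherwise depend on file timestamps.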
Detailed Reports
The dircmp object provides methods that print comparison reports to standard output:
import filecmp
# Using the previously created directories
dc = filecmp.dircmp('dirA', 'dirB')
# Generate a basic report
print("=== Basic Report ===")
dc.report()
print("\n=== Partial Closure Report ===")
dc.report_partial_closure()
=== Basic Report ===
diff dirA dirB
Only in dirA : ['only_in_A.txt']
Only in dirB : ['only_in_B.txt']
Identical files : ['common.txt']
Common subdirectories : ['subdir']
=== Partial Closure Report ===
diff dirA dirB
Only in dirA : ['only_in_A.txt']
Only in dirB : ['only_in_B.txt']
Identical files : ['common.txt']
Common subdirectories : ['subdir']
diff dirA/subdir dirB/subdir
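The report methods print their text rather than returning it. If the report needs to be logged or inspected, one option is to capture standard output with contextlib.redirect_stdout(). The sketch below does this for report_full_closure(), which reports on the two directories and, recursively, on every common subdirectory. The directory layout and file name are illustrative assumptions:

```python
import contextlib
import filecmp
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    left = os.path.join(tmp, "left")
    right = os.path.join(tmp, "right")
    os.makedirs(os.path.join(left, "subdir"))
    os.makedirs(os.path.join(right, "subdir"))
    with open(os.path.join(left, "only_left.txt"), "w") as f:
        f.write("Only on the left")

    # Redirect everything the report prints into an in-memory buffer
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        filecmp.dircmp(left, right).report_full_closure()
    report_text = buffer.getvalue()

print(report_text)
```

The captured string can then be written to a log file or searched for specific names, at the cost of depending on a print format that is meant for human readers.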
Key Methods and Attributes
| Method/Attribute | Description |
|---|---|
| left_only | Files/dirs only in left directory |
| right_only | Files/dirs only in right directory |
| common_files | Files present in both directories |
| same_files | Files with identical content |
| diff_files | Files with different content |
| report() | Basic comparison report |
Conclusion
The filecmp module covers the common cases of file and directory comparison. Use cmp() for a single pair of files, cmpfiles() for a list of file names shared by two directories, and dircmp for directory-level analysis with printed reports.
