File and Directory Comparisons in Python


Python's standard library includes the filecmp module, which defines functions for comparing files and directories. Depending on the options used, a comparison can rely on file metadata (type, size and modification time) as well as on the data in the files.

The example code in this article uses the following file and directory structure.

Two directories, dir1 and dir2, are first created under the current working directory. They contain the following files.

dir1/newfile.txt : This is a file in dir1
dir1/file1.txt : Hello Python
dir1/file2.txt : Python Standard Library
dir2/file1.txt : Hello Python
dir2/file2.txt : Python Library
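The layout can be recreated with a short script. This is a minimal sketch: the helper name build_fixture and the use of a temporary directory are assumptions made for illustration, and the pairing of file names with contents is inferred from the comparison results shown later in this article.

```python
import tempfile
from pathlib import Path

# Relative path -> contents. The pairing is inferred from the results
# shown later in the article, not stated explicitly in it.
LAYOUT = {
    "dir1/newfile.txt": "This is a file in dir1",
    "dir1/file1.txt": "Hello Python",
    "dir1/file2.txt": "Python Standard Library",
    "dir2/file1.txt": "Hello Python",
    "dir2/file2.txt": "Python Library",
}


def build_fixture(root):
    """Create dir1 and dir2 under root and return their paths (hypothetical helper)."""
    root = Path(root)
    for relative, text in LAYOUT.items():
        target = root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text)
    return root / "dir1", root / "dir2"


# A temporary directory keeps the example from touching the working directory.
fixture_root = tempfile.mkdtemp()
dir1, dir2 = build_fixture(fixture_root)
print(sorted(p.name for p in dir1.iterdir()))
print(sorted(p.name for p in dir2.iterdir()))
```

To follow the interactive examples exactly, the same helper can be called with the current working directory instead of a temporary one.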

Let us now describe the comparison functions in the filecmp module.

filecmp.cmp(f1, f2, shallow=True)

This function compares two files and returns True if they appear identical, False otherwise. The shallow parameter is True by default. In that case, files whose os.stat() signatures (file type, size and modification time) are identical are taken to be equal without being read; the contents are compared only when the signatures differ. If shallow is set to False, the result is always based on the contents.

Based on our file structure, the following code yields the output shown:

>>> import filecmp
>>> filecmp.cmp('dir1/file1.txt', 'dir2/file1.txt')
True
>>> filecmp.cmp('dir1/file1.txt', 'dir2/file1.txt', shallow=False)
True
>>> filecmp.cmp('dir1/file2.txt', 'dir2/file2.txt')
False
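The effect of the shallow parameter is easiest to see with two files that have the same size and modification time but different contents. The following is a sketch under stated assumptions: the file names a.txt and b.txt and the fixed timestamp are invented for the demonstration, and os.utime is used only to force matching modification times.

```python
import filecmp
import os
import tempfile

workdir = tempfile.mkdtemp()
first = os.path.join(workdir, "a.txt")   # hypothetical file names
second = os.path.join(workdir, "b.txt")

# Both strings are 12 characters long, so the file sizes match.
with open(first, "w") as handle:
    handle.write("Hello Python")
with open(second, "w") as handle:
    handle.write("Hello World!")

# Force identical access and modification times (nanoseconds since the epoch).
stamp = 1_600_000_000 * 10**9
os.utime(first, ns=(stamp, stamp))
os.utime(second, ns=(stamp, stamp))

# shallow=True (the default) trusts the matching os.stat() signatures.
shallow_result = filecmp.cmp(first, second)
# shallow=False reads both files and compares their bytes.
deep_result = filecmp.cmp(first, second, shallow=False)
print(shallow_result, deep_result)  # prints: True False
```

When a false positive of this kind matters, for example after a copy that preserves timestamps, passing shallow=False is the safer choice at the cost of reading both files.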

filecmp.cmpfiles(dir1, dir2, common, shallow=True)

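This function compares the files named in the list common in the two directories dir1 and dir2 and returns a three-item tuple of lists. The first list holds the names of files that match, the second the names of files that differ, and the third the names of files that could not be compared, for example because a file is missing from one of the directories or cannot be read.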

>>> match, mismatch, errors = filecmp.cmpfiles('dir1', 'dir2', ['file1.txt', 'file2.txt'])
>>> match
['file1.txt']
>>> mismatch
['file2.txt']
>>> errors
[]
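The third list is populated when a named file cannot be compared. The sketch below rebuilds the article's files in a temporary directory (an assumption made so the example is self-contained) and adds newfile.txt, which exists only in dir1, to the list of names; the write helper is hypothetical.

```python
import filecmp
import os
import tempfile

root = tempfile.mkdtemp()
left = os.path.join(root, "dir1")
right = os.path.join(root, "dir2")
os.mkdir(left)
os.mkdir(right)


def write(directory, name, text):
    """Write text to directory/name (hypothetical helper)."""
    with open(os.path.join(directory, name), "w") as handle:
        handle.write(text)


write(left, "file1.txt", "Hello Python")
write(right, "file1.txt", "Hello Python")
write(left, "file2.txt", "Python Standard Library")
write(right, "file2.txt", "Python Library")
write(left, "newfile.txt", "This is a file in dir1")  # present only in dir1

names = ["file1.txt", "file2.txt", "newfile.txt"]
match, mismatch, errors = filecmp.cmpfiles(left, right, names, shallow=False)
print(match)     # prints: ['file1.txt']
print(mismatch)  # prints: ['file2.txt']
print(errors)    # prints: ['newfile.txt']
```

A name lands in the errors list rather than raising an exception, so callers that need to distinguish "different" from "missing" should inspect all three lists.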

The filecmp module also defines the dircmp class. An instance is a directory comparison object that compares two directories, referred to as the left and right directories. Its methods and attributes are described below:


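filecmp.dircmp(a, b, ignore=None, hide=None)

This is the constructor. a and b are the directories to be compared. ignore is a list of names to skip and defaults to filecmp.DEFAULT_IGNORES; hide is a list of names to hide and defaults to [os.curdir, os.pardir].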

>>> result = filecmp.dircmp('dir1', 'dir2')

Other methods and attributes of the dircmp class are as follows:


report()

This method prints the result of the comparison between the two directories.

>>> result = filecmp.dircmp('dir1', 'dir2')
>>> result.report()
diff dir1 dir2
Only in dir1 : ['newfile.txt']
Identical files : ['file1.txt']
Differing files : ['file2.txt']
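Because report() writes to standard output and returns nothing, its text has to be captured if it is needed as a string. One possible approach, sketched here, uses contextlib.redirect_stdout; the temporary directory and the variable names are assumptions made to keep the example self-contained.

```python
import contextlib
import filecmp
import io
import os
import tempfile

# Rebuild the article's files under a temporary directory.
files = {
    "dir1": {
        "file1.txt": "Hello Python",
        "file2.txt": "Python Standard Library",
        "newfile.txt": "This is a file in dir1",
    },
    "dir2": {
        "file1.txt": "Hello Python",
        "file2.txt": "Python Library",
    },
}
root = tempfile.mkdtemp()
for directory, entries in files.items():
    os.mkdir(os.path.join(root, directory))
    for name, text in entries.items():
        with open(os.path.join(root, directory, name), "w") as handle:
            handle.write(text)

left = os.path.join(root, "dir1")
right = os.path.join(root, "dir2")

# report() prints its findings, so redirect stdout into a buffer.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    filecmp.dircmp(left, right).report()

report_lines = buffer.getvalue().splitlines()
for line in report_lines:
    print(line)
```

The captured lines have the same form as the output above, with the temporary paths in place of dir1 and dir2.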

left, right

These attributes hold the names of the first and second directories passed to the dircmp constructor.

>>> result.left
'dir1'
>>> result.right
'dir2'

left_list, right_list

These attributes hold the names of the files and subdirectories found in the left and right directories.

>>> result.left_list
['file1.txt', 'file2.txt', 'newfile.txt']
>>> result.right_list
['file1.txt', 'file2.txt']

common, common_files, common_dirs

These attributes hold, respectively, the names of files and subdirectories present in both directories, the names of files present in both, and the names of subdirectories present in both.

>>> result.common
['file1.txt', 'file2.txt']
>>> result.common_files
['file1.txt', 'file2.txt']
>>> result.common_dirs
[]

same_files, diff_files

These attributes hold the lists of common files that are identical and of common files that differ, as determined by the file comparison operator of the dircmp class.

>>> result.same_files
['file1.txt']
>>> result.diff_files
['file2.txt']
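The attributes can be gathered into a single structure for further processing. The sketch below is one way to do that; the summarize helper and its dictionary keys are hypothetical, the files are rebuilt in a temporary directory so the example is self-contained, and left_only and right_only are additional dircmp attributes not described above that list names found in only one of the directories.

```python
import filecmp
import os
import tempfile


def summarize(left, right):
    """Collect dircmp results in a dictionary (hypothetical helper)."""
    result = filecmp.dircmp(left, right)
    return {
        "left_only": sorted(result.left_only),
        "right_only": sorted(result.right_only),
        "same": sorted(result.same_files),
        "different": sorted(result.diff_files),
    }


# Rebuild the article's files under a temporary directory.
files = {
    "dir1": {
        "file1.txt": "Hello Python",
        "file2.txt": "Python Standard Library",
        "newfile.txt": "This is a file in dir1",
    },
    "dir2": {
        "file1.txt": "Hello Python",
        "file2.txt": "Python Library",
    },
}
root = tempfile.mkdtemp()
for directory, entries in files.items():
    os.mkdir(os.path.join(root, directory))
    for name, text in entries.items():
        with open(os.path.join(root, directory, name), "w") as handle:
            handle.write(text)

summary = summarize(os.path.join(root, "dir1"), os.path.join(root, "dir2"))
for key, names in summary.items():
    print(key, names)
```

Returning plain lists keeps the result easy to log or serialize, which the printed report() output is not designed for.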

This article discussed the dircmp class, its methods and attributes, and the file comparison functions defined in the filecmp module.

Updated on 25-Jun-2020 13:34:45