Work with ZIP archives in Python (zipfile)


The ZIP is one of the most popular file formats used for archiving and compression. It has been in use since the days of MSDOS and PC and has been used by famous PKZIP application.

The zipfile module in Python’s standard library provides classes that facilitate the tools for creating, extracting, reading and writing to ZIP archives.

ZipFile()

This function returns a ZipFile object from a file parameter which can be a string or file object as created by built-in open() function. The function needs a mode parameter whose default value is ‘r’ although it can take ‘w’ or ‘a’ value for opening the archive in read, write or append mode respectively.

The archive by default is uncompressed. To specify the type of compression algorithm to be used, one of the constants has to be assigned to compression parameter.

zipfile.ZIP_STOREDfor an uncompressed archive member.
zipfile.ZIP_DEFLATEDfor the usual ZIP compression method. This requires the zlib module.
zipfile.ZIP_BZIP2for the BZIP2 compression method. This requires the bz2 module.
zipfile.ZIP_LZMAfor the LZMA compression method. This requires the lzma module.

The ZipFile object uses following mehods.

write()

This method given file to the archive represented by ZipFile object.

>>> import zipfile
>>> newzip=zipfile.ZipFile('newdir/newzip.zip','w')
>>> newzip.write('zen.txt')
>>> newzip.close()

Additional file can be added to already existing archive by opening it in append mode (‘a’ as the mode)

>>> newzip=zipfile.ZipFile('newdir/newzip.zip','a')
>>> newzip.write('zen.txt')
>>> newzip.close()

read()

This method reads data from a particular file in the archive.

>>> newzip=zipfile.ZipFile('newdir/newzip.zip','r')
>>> data=newzip.read('json.txt')
>>> data
b'["Rakesh", {"marks": [50, 60, 70]}]'

printdir()

This method lists all file in given archive.

>>> newzip.printdir()
File Name Modified Size
json.txt 2018-11-2717:04:40 35
zen.txt 2018-11-2523:13:44 878

extract()

This method extracts a specified file from archive by default to current directory or to one given as second parameter to it.

>>> newzip.extract('json.txt','newdir')
'newdir\json.txt'

extractall()

This method extracts all files in the archive to current directory by default. Specify alternate directory if required as parameter.

>>> newzip.extractall('newdir')

getinfo()

This method returns ZipInfo object corresponding to the given file. The ZipInfo object contains different metadata information of the file.

Following code obtains ZipInfo object of ‘zen.txt’ from the archive and retrieves filename, size and date-time information from it.

>>> inf = newzip.getinfo('zen.txt')
>>> inf.filename,inf.file_size, inf.date_time
('zen.txt', 878, (2018, 11, 25, 23, 13, 45))

infolist()

This method returns a list of ZipInfo objects of all file in the archive.

>>> newzip.infolist()
[<ZipInfo filename = 'json.txt' filemode='-rw-rw-rw-' file_size=35>, <ZipInfo filename = 'zen.txt' filemode='-rw-rw-rw-' file_size=878>]

As mentioned earlier, compression algorithm to be applied while constructing ZIP archive is specified in compression parameter. In following code, ZIP-DEFLATED constant builds the archive using zlib compression.

>>> zipobj = zipfile.ZipFile('txtzip.zip',mode='w', compression=zipfile.ZIP_DEFLATED)
>>> files=glob.glob("*.txt")
>>> for file in files:
zipobj.write(file)
>>> zipobj.close()

namelist()

This method of ZipFile object returns a list of all files in the archive.

>>> zipobj = zipfile.ZipFile('txtzip.zip',mode='r')
>>> zipobj.namelist()
['a!.txt', 'data().txt', 'dict.txt', 'json.txt', 'LICENSE.txt', 'lines.txt', 'msg.txt', 'NEWS.txt', 'test.txt/', 'zen.txt', 'zen1.txt', 'zenbak.txt']

setpassword()

This method sets password parameter which must be provided at the time of extracting the archive.

PyZipFile()

This function in zipfile module returns PyZipFile object. The PyZipFile object can construct a module containing files with .py extension. This archive can be added in sys.path environment variable so that the module can be imported using zipimport module.

The writepy() method adds a .py file in the archive after compiling it in respective .pyc file.

files=glob.glob("*.py")
>>> pyzipobj = zipfile.PyZipFile('pyfiles/pyzip.zip', mode='w', compression=zipfile.ZIP_LZMA)
>>> for file in files:
pyzipobj.writepy(file)
>>> pyzipobj.close()

In this article classes and functions in zipfile module has been discussed.

Updated on: 25-Jun-2020

702 Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements