Python Support for bzip2 compression (bz2)


bzip2 is an open-source algorithm for compressing and decompressing files. Python’s bz2 module provides a programmatic interface to the bzip2 algorithm.

The open() function is the primary interface to this module.

open()

This function opens a bzip2-compressed file and returns a file object. The file can be opened in binary or text mode, for reading or writing. When writing, the compresslevel argument, an integer from 1 to 9 (default 9), controls the level of compression.

write()

When the file is opened in ‘w’ or ‘wb’ mode, this method is available on the file object. Both of these are binary modes, in which it writes compressed binary data to the file. To write text, open the file in ‘wt’ mode; the file object is then wrapped in a TextIOWrapper object that performs the encoding.

read()

When the file is opened in read mode, this method reads the file and returns the uncompressed data.
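The difference between binary and text mode can be sketched as follows. This is a minimal illustration, not part of the original examples: the file name notes.txt.bz2, the temporary directory, and the compresslevel value of 5 are assumptions chosen only for the demonstration.

```python
import bz2
import os
import tempfile

# Hypothetical file in a temporary directory, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "notes.txt.bz2")

lines = ["first line\n", "second line\n"]

# "wt" selects text mode: str objects are encoded (here as UTF-8)
# before being compressed and written.
with bz2.open(path, "wt", compresslevel=5, encoding="utf-8") as f:
    f.writelines(lines)

# "rt" decompresses the file and decodes the bytes back to str.
with bz2.open(path, "rt", encoding="utf-8") as f:
    text = f.read()

print(text == "".join(lines))
```

The with statement closes the file automatically, which also ensures that the last block of compressed data is written out.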

The following code writes compressed data to a bzip2 file.

>>> import bz2
>>> f = bz2.open("test.bz2", "wb")
>>> data = b'Welcome to TutorialsPoint'
>>> f.write(data)
>>> f.close()

This creates the file test.bz2 in the current directory. Any unzipping tool will show a ‘test’ file in it. To read the uncompressed data back from test.bz2, use the following code.

>>> f = bz2.open("test.bz2", "rb")
>>> data = f.read()
>>> data
b'Welcome to TutorialsPoint'
>>> f.close()

The bz2 module also defines the BZ2File class. Its object acts as a compressor or a decompressor, depending on the mode parameter passed to the constructor.

BZ2File()

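This is the constructor. As with the open() function, it takes a file name (or an existing file object) and a mode; mode defaults to ‘r’, and BZ2File always works in binary mode. The compresslevel argument is 9 by default and can be any integer from 1 to 9.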
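A short sketch of using BZ2File directly is shown below. The file name sample.bz2, the temporary directory, the repeated payload, and the compresslevel value of 1 are illustrative assumptions, not values from the original text.

```python
import bz2
import os
import tempfile

# Hypothetical file in a temporary directory, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "sample.bz2")

# Highly repetitive data, so the compressed file is clearly smaller.
payload = b"Hello bzip2! " * 1000

# Mode "w" makes the object act as a compressor.
with bz2.BZ2File(path, "w", compresslevel=1) as f:
    f.write(payload)

# The default mode "r" makes the object act as a decompressor.
with bz2.BZ2File(path) as f:
    restored = f.read()

print(restored == payload)
print(os.path.getsize(path) < len(payload))
```

Here compresslevel is passed by keyword; lower levels use smaller blocks and typically less memory, at the cost of a lower compression ratio on large inputs.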

BZ2Compressor()

This constructor returns an incremental compressor object. Each call to its compress() method returns a chunk of compressed data, which may be empty while the input is still being buffered. The chunks can be concatenated and finally written to a bzip2-compressed file.

flush()

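This method finishes the compression process and returns the compressed data left in the internal buffer, which must be appended to the chunks already returned by compress(). The compressor object cannot be used after flush() has been called.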

BZ2Decompressor()

This constructor returns an incremental decompressor object. Each call to its decompress() method returns a chunk of uncompressed data; the chunks, concatenated together, form the original data.

The following example first compresses each item in the list object and writes the concatenated bytes object to the file. The data is then retrieved with a BZ2Decompressor object.

>>> import bz2
>>> data = [b'Hello World', b'How are you?', b'welcome to Python']
>>> obj = bz2.BZ2Compressor()
>>> # Use the built-in open(): the chunks are already compressed,
>>> # so bz2.open() would compress them a second time.
>>> f = open("test.bz2", "wb")
>>> d1 = obj.compress(data[0])
>>> d2 = obj.compress(data[1])
>>> d3 = obj.compress(data[2])
>>> d4 = obj.flush()
>>> d1, d2, d3, d4
(b'', b'', b'', b'BZh91AY&SYS\x9a~\x99\x00\x00\x03\x1f\x80@\x00\x00\x00\x80@@\x80.G\x96\xa0 \x00!\xa8\xd0\x06\x9a6\x90\xa6LL\x83#\x18\x1d\x83\xee^]\x1e|\xa9\xddgu\x15G/\x1a\x8c\xd1\x90\x14\x8f\x8b\xb9"\x9c(H)\xcd?L\x80')
>>> compressedobj = d1 + d2 + d3 + d4
>>> f.write(compressedobj)
>>> f.close()
>>> obj = bz2.BZ2Decompressor()
>>> # Read the raw compressed bytes and let the decompressor object decode them.
>>> f = open("test.bz2", "rb")
>>> data = f.read()
>>> f.close()
>>> obj.decompress(data)
b'Hello WorldHow are you?welcome to Python'
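The incremental classes are most useful when the data is too large to hold in memory at once. The sketch below streams data through BZ2Compressor and BZ2Decompressor in fixed-size chunks. The helper names compress_stream and decompress_stream, the CHUNK_SIZE value, and the use of in-memory io.BytesIO objects in place of real files are all assumptions made for illustration.

```python
import bz2
import io

CHUNK_SIZE = 1024  # arbitrary chunk size chosen for this sketch


def compress_stream(src, dst):
    """Read src in chunks and write the bzip2-compressed bytes to dst."""
    compressor = bz2.BZ2Compressor()
    for chunk in iter(lambda: src.read(CHUNK_SIZE), b""):
        # compress() may return b"" while data is still being buffered.
        dst.write(compressor.compress(chunk))
    # flush() returns whatever is left in the internal buffer.
    dst.write(compressor.flush())


def decompress_stream(src, dst):
    """Read compressed bytes from src in chunks and write the original data to dst."""
    decompressor = bz2.BZ2Decompressor()
    while not decompressor.eof:
        chunk = src.read(CHUNK_SIZE)
        if not chunk:
            raise EOFError("compressed stream ended before the end-of-stream marker")
        dst.write(decompressor.decompress(chunk))


# Usage with in-memory streams standing in for binary files.
original = b"Hello World" * 500
packed = io.BytesIO()
compress_stream(io.BytesIO(original), packed)

packed.seek(0)
unpacked = io.BytesIO()
decompress_stream(packed, unpacked)

print(unpacked.getvalue() == original)
```

The eof attribute of the decompressor becomes true once the end-of-stream marker has been reached, which is why the loop in decompress_stream can stop without knowing the size of the data in advance.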

Updated on: 26-Jun-2020
