Python object serialization (Pickle)

PythonProgrammingServer Side Programming

The term object serialization refers to process of converting state of an object into byte stream. Once created, this byte stream can further be stored in a file or transmitted via sockets etc. On the other hand reconstructing the object from the byte stream is called deserialization.

Python’s terminology for serialization and deserialization is pickling and unpickling respectively. The pickle module available in Python’s standard library provides functions for serialization (dump() and dumps()) and deserialization (load() and loads()).

The pickle module uses very Python specific data format. Hence, programs not written in Python may not be able to deserialize the encoded (pickled) data properly. Also it is not considered to be secure to unpickle data from un-authenticated source.

pickle protocols

Protocols are the conventions used in constructing and deconstructing Python objects to/from binary data. Currently pickle module defines 5 different protocols as listed below −

Protocol version 0Original “human-readable” protocol backwards compatible with earlier versions.
Protocol version 1Old binary format also compatible with earlier versions of Python.
Protocol version 2Introduced in Python 2.3 provides efficient pickling of new-style classes.
Protocol version 3Added in Python 3.0. recommended when compatibility with other Python 3 versions is required.
Protocol version 4was added in Python 3.4. It adds support for very large objects

To know highest and default protocol version of your Python installation use following constants defined in pickle module

>>> import pickle
>>> pickle.HIGHEST_PROTOCOL
4
>>> pickle.DEFAULT_PROTOCOL
3

As mentioned earlier dump() and load() functions of pickle module perform pickling and unpickling Python data. The dump() function writes pickled object to a file and load() function unpickles data from file to Python object.

Following program pickle a dictionary object into a binary file.

import pickle
f = open("data.txt","wb")
dct = {"name":"Ravi", "age":23, "Gender":"M","marks":75}
pickle.dump(dct,f)
f.close()

When above code is executed, the dictionary object’s byte representation will be stored in data.txt file.

To unpickle or deserialize data from binary file back to dictionary, run following program

import pickle
f = open("data.txt","rb")
d = pickle.load(f)
print (d)
f.close()

Python console shows the dictionary object read from file

{'age': 23, 'Gender': 'M', 'name': 'Ravi', 'marks': 75}

The pickle module also consists of dumps() function that returns a string representation of pickled data.

>>> from pickle import dump
>>> dct = {"name":"Ravi", "age":23, "Gender":"M","marks":75}
>>> dctstring = dumps(dct)
>>> dctstring
b'\x80\x03}q\x00(X\x04\x00\x00\x00nameq\x01X\x04\x00\x00\x00Raviq\x02X\x03\x00\x00\x00ageq\x03K\x17X\x06\x00\x00\x00Genderq\x04X\x01\x00\x00\x00Mq\x05X\x05\x00\x00\x00marksq\x06KKu.'

Use loads() function to unpickle the string and obtain original dictionary object.

>>> from pickle import load
>>> dct = loads(dctstring)
>>> dct
{'name': 'Ravi', 'age': 23, 'Gender': 'M', 'marks': 75}

The pickle module also defines Pickler and Unpickler classes. Pickler class writes pickle data to file. Unpickler class reads binary data from file and constructs Python object

To write Python object’s pickled data

from pickle import pickler
f = open("data.txt","wb")
dct = {'name': 'Ravi', 'age': 23, 'Gender': 'M', 'marks': 75}
Pickler(f).dump(dct)
f.close()

To read back data by unpickling binary file

from pickle import Unpickler
f = open("data.txt","rb")
dct = Unpickler(f).load()
print (dct)
f.close()

Objects of all Python standard data types are picklable. Moreover, objects of custom class can also be pickled and unpickled.

from pickle import *
class person:
def __init__(self):
self.name = "XYZ"
self.age = 22
def show(self):
print ("name:", self.name, "age:", self.age)
p1 = person()
f = open("data.txt","wb")
dump(p1,f)
f.close()
print ("unpickled")
f = open("data.txt","rb")
p1 = load(f)
p1.show()

Python library also has marshal module that is used for internal serialization of Python objects.

raja
Published on 27-Dec-2018 08:54:53
Advertisements