Internal Python object serialization (marshal)

Even though marshal module in Python’s standard library provides object serialization features (similar to pickle module), it is not really useful for general purpose data persistence or transmission of Python objects through sockets etc. This module is mostly used by Python itself to support read/write operations on compiled versions of Python modules (.pyc files). The data format used by the marshal module is not compatible across Python versions (not even subversions). That’s why a compiled Python script (.pyc file) of one version most probably won’t execute on another. The marshal module is thus used for Python’s internal object serialization.

Just as pickle module, marshal module also defined load() and dump() functions for reading and writing marshaled objects from / to file. Also, loads() and dumps() function deal with a string representation of marshaled object.

dumps() − returns a byte like an object my marshalling a Python object. Only objects of standard data types are supported for marshalling. Unsupported types raise ValueError exception.

loads() − This function converts the byte like object to corresponding Python object. If the conversion doesn’t result in valid Python object, ValueError or TypeError may be raised.

Following code shows a Python dictionary object marshalled using dumps(). The byte representation is converted back to the dictionary by loads() function.

import marshal
person = {"name":"xyz", "age":22, "marks":[45,56,78]}
data = marshal.dumps(person)
obj = marshal.loads(data)
print (obj)

dump() − This function writes byte representation of supported Python object to a file. The file itself be a binary file with write permission

load() − This function reads the byte data from a binary file and converts it to Python object.

As mentioned above the marshal module is used to process .pyc files. Following example demonstrates the use of dump() and load() functions to handle code objects of Python, which are used to store precompiled Python modules.

The code uses built-in compile() function to build a code object out of a source string which embeds Python instructions.

compile(source, file, mode)

The file parameter should be the file from which the code was read. If it wasn’t read from a file pass any arbitrary string.

The mode parameter is ‘exec’ if the source contains a sequence of statements, ‘eval’ if there is a single expression or ‘single’ if it contains a single interactive statement.

The compile code object is then stored in a .pyc file using dump() function

import marshal
script = """
a = 10
b = 20
print ('addition = ',a+b)
code = compile(script, "script", "exec")
f = open("a.pyc","wb")
marshal.dump(code, f)

To deserialize the object from .pyc file use load() function. Since it returns a code object, it can be run using exec(), another built-in function.

import marshal
f = open("a.pyc","rb")
data = marshal.load(f)
exec (data)

The output will be a result of code block embedded in the source string

addition = 30