Memory-mapped file support in Python (mmap)


A file can be modified from a Python program in two ways. The first is ordinary file I/O: the program issues read, write and seek calls on the file object and copies data between the file and its own buffers. The second is memory-mapping: the operating system maps the file's contents into the program's address space, so the file can be read and modified as if it were a bytes-like object in memory (RAM). In this article we will see how to read, search and modify the content of a file using the mmap module available in Python. The file is still opened once with open, but after that, instead of making system calls such as read and lseek for every access, the mapped object is sliced and assigned to directly, and the operating system loads the needed pages on demand and writes changes back to the file.
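As a rough illustration of the difference between the two approaches, the sketch below reads the same five bytes once with seek and read and once by slicing a mapping. The temporary file and its contents are assumptions made for the demonstration, not part of the original examples.

```python
import mmap
import os
import tempfile

# Hypothetical sample data written to a temporary file for the demonstration
data = b"0123456789abcdefghij"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    path = tmp.name

# Ordinary file I/O: move the file position, then read
with open(path, "rb") as fobj:
    fobj.seek(10)
    via_read = fobj.read(5)

# Memory-mapping: slice the mapped object like a bytes sequence
with open(path, "rb") as fobj:
    with mmap.mmap(fobj.fileno(), length=0, access=mmap.ACCESS_READ) as mm:
        via_mmap = mm[10:15]

print(via_read, via_mmap)  # both hold the bytes at offsets 10..14
os.remove(path)
```

Both variables end up holding the same bytes; the mapping simply replaces the explicit position bookkeeping with ordinary slicing.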

Read Memory Mapped File

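In the below example we map a complete file into memory (length=0 maps the whole file) and open the mapping in read-only mode with ACCESS_READ. The operating system loads the data on demand rather than copying the whole file at once. The mapped object behaves like a sequence of bytes, so we slice certain positions to get the required text. Because the mapping works on raw bytes, the slice is returned as a bytes object, not a str.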

Example

import mmap

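def read_mmap(fname):
   # Binary mode: mmap works on raw bytes, so no text decoding is involved
   with open(fname, mode="rb") as fobj:
      # length=0 maps the whole file; ACCESS_READ makes the mapping read-only
      with mmap.mmap(fobj.fileno(), length=0, access=mmap.ACCESS_READ) as mmap_obj:
         # Slicing returns a bytes object
         print(mmap_obj[4:26])

# Raw string so that \t in the path is not read as a tab character
read_mmap(r'E:\test.txt')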

Output

Running the above code gives us the following result −

b'emissions from gaseous'
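Since a slice of the mapping is a bytes object, it has to be decoded to obtain a str, and the mapping also offers file-like methods such as readline and seek. The following is a minimal sketch under the assumption of a small UTF-8 sample file created on the fly; the sample text is hypothetical.

```python
import mmap
import os
import tempfile

# Hypothetical two-line sample containing a non-ASCII character
text = "café au lait\nsecond line\n"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(text.encode("utf-8"))
    path = tmp.name

with open(path, "rb") as fobj:
    with mmap.mmap(fobj.fileno(), length=0, access=mmap.ACCESS_READ) as mm:
        # 'café' occupies 5 bytes in UTF-8, so slice by bytes, then decode
        first_word = mm[0:5].decode("utf-8")
        # readline reads from the current position of the mapping
        first_line = mm.readline()
        # Rewind the position, as with a regular file object
        mm.seek(0)
        # len() gives the size of the mapping in bytes
        total_size = len(mm)

print(first_word, first_line, total_size)
os.remove(path)
```

Note that slice positions count bytes, not characters, so offsets into text with non-ASCII characters need care.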

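Find using mmap

In the below example we search a file for a word in two ways and time each one. The regular version reads the whole file into a string and calls find on it. The mmap version calls find on the mapped object, which searches the bytes of the file directly, so the search term is given as a bytes literal and the result is a byte offset.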

Example

import mmap
import time

def regular_io_find(fname):
   with open(fname, mode="r", encoding="utf-8") as fobj:
      # Read the whole file into a str, then search it
      text = fobj.read()
      # Print the character index of the first match (-1 if not found)
      print(text.find("Death "))

def mmap_io_find(fname):
   with open(fname, mode="rb") as fobj:
      with mmap.mmap(fobj.fileno(), length=0, access=mmap.ACCESS_READ) as mmap_obj:
         # mmap searches raw bytes, so the pattern is a bytes literal
         mmap_obj.find(b"Death ")

# Raw strings keep the backslash in the Windows path literal
start_time_r = time.time()
regular_io_find(r'E:\emissions.txt')
end_time_r = time.time()
print("Regular read start time :", start_time_r)
print("Regular read end time :", end_time_r)
print('Regular read time to find: {0}'.format(end_time_r - start_time_r))

start_time_m = time.time()
mmap_io_find(r'E:\emissions.txt')
end_time_m = time.time()
print("mmap read start time :", start_time_m)
print("mmap read end time :", end_time_m)
print('mmap read time to find : {0}'.format(end_time_m - start_time_m))

Output

Running the above code gives us the following result −

2013
Regular read start time : 1609812463.2718163
Regular read end time : 1609812463.2783241
Regular read time to find: 0.00650787353515625
mmap read start time : 1609812463.2783241
mmap read end time : 1609812463.2783241
mmap read time to find : 0.0
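The 0.0 reported for mmap most likely means the search finished faster than time.time() could resolve on that system, not that it took no time at all. A hedged sketch of a finer measurement is shown below, using time.perf_counter and taking the best of several runs. The generated sample file and the best_of helper are assumptions introduced for illustration, and actual timings will vary by machine and file size.

```python
import mmap
import os
import tempfile
import time

# Hypothetical sample file: filler text with the search term at the end
payload = b"lorem ipsum dolor sit amet\n" * 200_000 + b"Death and taxes\n"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    path = tmp.name

def regular_io_find(fname, needle):
    # Read the whole file into a str and search it
    with open(fname, mode="r", encoding="utf-8") as fobj:
        return fobj.read().find(needle)

def mmap_io_find(fname, needle):
    # Search the mapped bytes directly
    with open(fname, mode="rb") as fobj:
        with mmap.mmap(fobj.fileno(), length=0, access=mmap.ACCESS_READ) as mm:
            return mm.find(needle)

def best_of(func, *args, repeats=5):
    """Return the function result and the fastest of several timed runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args)
        best = min(best, time.perf_counter() - start)
    return result, best

pos_r, t_r = best_of(regular_io_find, path, "Death ")
pos_m, t_m = best_of(mmap_io_find, path, b"Death ")
print("regular: offset {0}, {1:.6f} s".format(pos_r, t_r))
print("mmap   : offset {0}, {1:.6f} s".format(pos_m, t_m))
os.remove(path)
```

Because the sample is pure ASCII, the character index from the regular search and the byte offset from the mmap search coincide; with non-ASCII text they can differ.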

Writing to file

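In the below example we open a file with mode r+, which allows both reading and writing, and map it with the mmap module using ACCESS_WRITE so that changes to the mapping are written through to the file. After creating the mapped object we choose a position by slicing and assign a bytes string to it. The assigned bytes must be exactly as long as the slice, because a slice assignment cannot change the size of the mapping.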

Example

import mmap

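def mmap_io_write(fname):
   # r+b: read and write in binary mode without truncating the file
   with open(fname, mode="r+b") as fobj:
      # ACCESS_WRITE writes changes through to the underlying file
      with mmap.mmap(fobj.fileno(), length=0, access=mmap.ACCESS_WRITE) as mmap_obj:
         # Replace the 6 bytes at offsets 20..25 with 6 new bytes
         mmap_obj[20:26] = b"Hello!"
         # Flush pending changes to the file on disk
         mmap_obj.flush()

# Raw string keeps the backslash in the Windows path literal
mmap_io_write(r'E:\emissions.txt')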

On running the above code we can open the file and see the string Hello! written into the file at byte positions 20 to 25 (the slice 20:26 excludes its end index).
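The same write can be wrapped in a small helper that checks the bounds before assigning and then reads the file back to confirm the change. This is only a sketch: the overwrite_at helper and the temporary sample file are hypothetical names introduced here. It also shows that assigning bytes of a different length than the slice is rejected.

```python
import mmap
import os
import tempfile

# Hypothetical sample content written to a temporary file
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"The quick brown fox jumps over the lazy dog\n")
    path = tmp.name

def overwrite_at(fname, offset, new_bytes):
    """Overwrite len(new_bytes) bytes starting at offset, keeping the file size."""
    with open(fname, mode="r+b") as fobj:
        with mmap.mmap(fobj.fileno(), length=0, access=mmap.ACCESS_WRITE) as mm:
            end = offset + len(new_bytes)
            if offset < 0 or end > len(mm):
                raise ValueError("write would fall outside the mapping")
            mm[offset:end] = new_bytes
            mm.flush()

overwrite_at(path, 4, b"QUICK")
with open(path, "rb") as fobj:
    result = fobj.read()
print(result)

# A replacement whose length differs from the slice length is rejected
with open(path, "r+b") as fobj:
    with mmap.mmap(fobj.fileno(), length=0, access=mmap.ACCESS_WRITE) as mm:
        try:
            mm[0:3] = b"Hello!"
            size_error = False
        except (ValueError, IndexError):
            size_error = True
print(size_error)
os.remove(path)
```

Deriving the slice end from the length of the new bytes keeps the two in step, which avoids the size mismatch by construction.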

Updated on: 12-Jan-2021
