How to improve file reading performance in Python with the mmap module?

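Memory mapping (MMAP) lets Python access file data through the operating system's virtual memory instead of explicit read() calls. Once a file is mapped, its pages are loaded on demand, which reduces the number of system calls and avoids copying data from the kernel's page cache into a separate user-space buffer. The gain is usually largest for large files with random access; for a single sequential pass, ordinary buffered reads can be just as fast.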

How MMAP Works

MMAP maps a file's contents into the process's address space. In Python 3, a mapped file behaves like both a bytearray and a file object: it supports methods like read(), write(), seek(), and readline(), as well as indexing and slice operations. All data is handled as bytes, not str.
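As a minimal sketch of the two interfaces described above, the following maps a small temporary file and uses the file-like methods and slicing side by side. The file contents are made up purely for illustration.

```python
import mmap
import os
import tempfile

# Hypothetical demo file; the contents are arbitrary.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"alpha beta gamma\n")

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as m:
        # File-like access: read() advances an internal position.
        first = m.read(5)
        m.seek(6)
        second = m.read(4)

        # Sequence-like access: slicing ignores the current position.
        third = m[11:16]

        # Slice assignment writes through to the file; lengths must match.
        m[0:5] = b"ALPHA"
        m.flush()
        size = len(m)

with open(path, "rb") as f:
    on_disk = f.read()

os.remove(path)
print(first, second, third, size, on_disk)
```

Note that read() and seek() share one position, while slicing such as m[11:16] works independently of it.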

Basic File Reading with MMAP

Let's create a sample file and demonstrate basic MMAP usage:

import mmap

# Create sample text file
sample_text = """Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
Ut enim ad minim veniam, quis nostrud exercitation ullamco.
Duis aute irure dolor in reprehenderit in voluptate velit esse."""

# Write to file
with open('sample.txt', 'w') as f:
    f.write(sample_text)

# Read using MMAP (binary mode, since a mapping always yields bytes)
with open('sample.txt', 'rb') as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        print("First 15 bytes:", m.read(15))
        print("Next 10 bytes:", m.read(10))

Output:

First 15 bytes: b'Lorem ipsum dol'
Next 10 bytes: b'or sit ame'
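Beyond fixed-size reads, a mapped file can also be scanned line by line. The sketch below is one possible way to do that; the helper name count_matching_lines and the demo data are hypothetical, not part of the mmap module.

```python
import mmap
import os
import tempfile

def count_matching_lines(path, needle):
    """Count lines containing `needle` (bytes) by scanning a read-only mapping."""
    count = 0
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # iter(callable, sentinel) keeps calling readline() until it returns b"".
            for line in iter(m.readline, b""):
                if needle in line:
                    count += 1
    return count

# Hypothetical demo data
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"dolor sit\namet\nmagna dolor\n")

matches = count_matching_lines(path, b"dolor")
os.remove(path)
print(matches)
```

An empty file cannot be mapped with length 0 (mmap raises ValueError), so real code would likely check the file size first.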

Modifying Files In-Place

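MMAP allows direct file modification without loading the entire file into memory. Because a mapping has a fixed size, the replacement must be the same length as the text it overwrites: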

import mmap
import shutil

# Create a copy for modification
shutil.copyfile('sample.txt', 'sample_copy.txt')

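# Find and replace text (both values must be the same length)
old_word = b'ipsum'
new_word = b'DOLOR'

# Open in binary read/write mode so changes can be written back
with open('sample_copy.txt', 'r+b') as f: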
    with mmap.mmap(f.fileno(), 0) as m:
        print("Before:", m.readline().decode().strip())
        
        # Reset position and modify
        m.seek(0)
        pos = m.find(old_word)
        if pos != -1:
            m[pos:pos + len(old_word)] = new_word
            m.flush()
        
        # Show result
        m.seek(0)
        print("After:", m.readline().decode().strip())

Output:

Before: Lorem ipsum dolor sit amet, consectetur adipiscing elit.
After: Lorem DOLOR dolor sit amet, consectetur adipiscing elit.
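The in-place replacement works because both byte strings have the same length; slice assignment on a mapping cannot grow or shrink the file. A minimal sketch of a helper that makes this constraint explicit follows. The name replace_in_place is hypothetical and the demo file is temporary.

```python
import mmap
import os
import tempfile

def replace_in_place(path, old, new):
    """Overwrite the first occurrence of `old` with `new`; return its offset or -1."""
    if len(old) != len(new):
        # A mapping has a fixed size, so the slice lengths must match.
        raise ValueError("old and new must have the same length")
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as m:
            pos = m.find(old)
            if pos != -1:
                m[pos:pos + len(old)] = new
                m.flush()
            return pos

# Hypothetical demo data
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"Lorem ipsum dolor")

offset = replace_in_place(path, b"ipsum", b"IPSUM")
with open(path, "rb") as f:
    content = f.read()

# A replacement of a different length is rejected up front.
try:
    replace_in_place(path, b"Lorem", b"Hi")
    rejected = False
except ValueError:
    rejected = True

os.remove(path)
print(offset, content, rejected)
```

If the replacement text really needs a different length, rewriting the file with ordinary I/O is usually the simpler route.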

Copy-on-Write Access

Use ACCESS_COPY to modify data in memory without affecting the original file:

import mmap

with open('sample.txt', 'rb') as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as m:
        # Modify in memory only
        old_word = b'Lorem'
        new_word = b'HELLO'
        
        pos = m.find(old_word)
        if pos != -1:
            m[pos:pos + len(old_word)] = new_word
        
        m.seek(0)
        print("In-memory:", m.readline().decode().strip())

# Check original file unchanged
with open('sample.txt', 'r') as f:
    print("Original file:", f.readline().strip())

Output:

In-memory: HELLO ipsum dolor sit amet, consectetur adipiscing elit.
Original file: Lorem ipsum dolor sit amet, consectetur adipiscing elit.
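To put ACCESS_COPY in context, the following sketch applies the same write under each of the three access modes and checks what ends up in the file. The helper try_write and the temporary file are illustrative assumptions.

```python
import mmap
import os
import tempfile

# Hypothetical demo file
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"Lorem ipsum")

def try_write(access):
    """Try to overwrite the first 5 bytes; return the file contents afterwards."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0, access=access) as m:
            try:
                m[0:5] = b"HELLO"
            except TypeError:
                pass  # a read-only mapping rejects writes
            if access == mmap.ACCESS_WRITE:
                m.flush()
    with open(path, "rb") as f:
        return f.read()

after_read = try_write(mmap.ACCESS_READ)    # write rejected
after_copy = try_write(mmap.ACCESS_COPY)    # write stays in memory only
after_write = try_write(mmap.ACCESS_WRITE)  # write reaches the file
os.remove(path)

print(after_read, after_copy, after_write)
```

Only ACCESS_WRITE changes the file on disk. ACCESS_READ refuses the assignment, and ACCESS_COPY accepts it but keeps the change private to the mapping.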

Performance Benefits

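Feature          Traditional I/O                       MMAP
System calls     One or more per read or seek          Mostly at map time, then page faults
Memory copying   Kernel cache copied to a user buffer  Avoids the extra kernel-to-user copy
Random access    seek() plus read() per lookup         Direct indexing and slicing
Large files      Read in chunks or loaded in full      Pages loaded on demand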
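How much these differences matter depends on the workload, so it is worth measuring. The sketch below is one rough way to compare random lookups using seek()/read() against slicing a mapping. The record size, record count, and lookup count are arbitrary assumptions, and the timings will vary by machine and operating system.

```python
import mmap
import os
import random
import tempfile
import time

RECORD = 64          # bytes per hypothetical fixed-size record
N_RECORDS = 20_000
N_LOOKUPS = 50_000

# Build a temporary file of random bytes to read from.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(os.urandom(RECORD * N_RECORDS))

random.seed(0)
offsets = [random.randrange(N_RECORDS) * RECORD for _ in range(N_LOOKUPS)]

def with_seek_read():
    """Fetch each record with an explicit seek() and read()."""
    out = []
    with open(path, "rb") as f:
        for off in offsets:
            f.seek(off)
            out.append(f.read(RECORD))
    return out

def with_mmap():
    """Fetch each record by slicing a read-only mapping."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return [m[off:off + RECORD] for off in offsets]

def timed(fn):
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start

plain, t_plain = timed(with_seek_read)
mapped, t_mapped = timed(with_mmap)
os.remove(path)

print(f"seek/read: {t_plain:.4f}s  mmap: {t_mapped:.4f}s")
```

Both functions return the same records, so only the elapsed times differ. For a fair comparison, run each variant several times, since the first pass also pays for loading the file into the page cache.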

Conclusion

MMAP can improve file I/O performance by avoiding extra buffer copies and reducing system calls. It is best suited to large files that need random access or in-place, same-length modifications, since only the pages actually touched are loaded into memory.

Updated on: 2026-03-25T11:58:10+05:30
