Difference between Shallow Copy vs Deep Copy in Pandas Dataframe

One of the most useful data structures in Pandas is the DataFrame ? a 2-dimensional table-like structure containing rows and columns to store data. Understanding the difference between shallow and deep copies is crucial when manipulating DataFrames, as it affects how changes propagate between original and copied objects.

What is Shallow Copy?

A shallow copy creates a new DataFrame object that references the original data. The copied DataFrame points to the same memory location as the original DataFrame. Any modifications to the underlying data will affect both the original and shallow copy.

Syntax

df_shallow = df.copy(deep=False)

When deep=False, only the DataFrame structure is copied, but the underlying data remains shared between both objects.

What is Deep Copy?

A deep copy creates a completely independent copy of a DataFrame, including all its data and metadata. The new DataFrame has its own memory space and is entirely separate from the original.

Syntax

df_deep = df.copy(deep=True)  # deep=True is default

The deep parameter defaults to True, creating an independent DataFrame with copied data and index labels.

Example: Comparing Shallow vs Deep Copy

Let's demonstrate the key differences with a practical example ?

import pandas as pd

# Create original DataFrame
data = {'Name': ['Rahul', 'Priya', 'Amit'],
        'Age': [25, 28, 22],
        'City': ['Mumbai', 'Delhi', 'Kolkata']}
df_original = pd.DataFrame(data)

# Create copies
df_shallow = df_original.copy(deep=False)
df_deep = df_original.copy(deep=True)

# Modify original DataFrame
df_original.loc[0, 'Age'] = 30

print("Original DataFrame:")
print(df_original)
print("\nShallow Copy:")
print(df_shallow)  
print("\nDeep Copy:")
print(df_deep)
Original DataFrame:
   Name  Age     City
0  Rahul   30   Mumbai
1  Priya   28    Delhi
2   Amit   22  Kolkata

Shallow Copy:
   Name  Age     City
0  Rahul   30   Mumbai
1  Priya   28    Delhi
2   Amit   22  Kolkata

Deep Copy:
   Name  Age     City
0  Rahul   25   Mumbai
1  Priya   28    Delhi
2   Amit   22  Kolkata

Notice how modifying the original DataFrame affected the shallow copy but not the deep copy. This demonstrates that shallow copies share the underlying data.

Memory and Independence Example

Here's another example showing structural changes ?

import pandas as pd

# Create DataFrame
data = {'Country': ['USA', 'Germany', 'Japan'],
        'Population': [328, 83, 126]}
df_original = pd.DataFrame(data)

# Create copies
df_shallow = df_original.copy(deep=False)
df_deep = df_original.copy(deep=True)

# Modify deep copy independently
df_deep.loc[2, 'Country'] = 'India'
df_deep.loc[2, 'Population'] = 1400

# Add new row to original (structural change)
new_row = pd.DataFrame({'Country': ['Canada'], 'Population': [38]})
df_original = pd.concat([df_original, new_row], ignore_index=True)

print("Original DataFrame:")
print(df_original)
print("\nShallow Copy:")
print(df_shallow)
print("\nDeep Copy:")
print(df_deep)

# Check memory addresses
print(f"\nMemory addresses:")
print(f"Original: {id(df_original)}")
print(f"Shallow:  {id(df_shallow)}")
print(f"Deep:     {id(df_deep)}")
Original DataFrame:
   Country  Population
0      USA         328
1  Germany          83
2    Japan         126
3   Canada          38

Shallow Copy:
   Country  Population
0      USA         328
1  Germany          83
2    Japan         126

Deep Copy:
   Country  Population
0     USA         328
1  Germany          83
2   India        1400

Memory addresses:
Original: 140234567890123
Shallow:  140234567890456  
Deep:     140234567890789

Comparison Table

Aspect Shallow Copy Deep Copy
Data Sharing Shares underlying data Independent data copy
Memory Usage Low (shared memory) High (duplicated data)
Performance Faster creation Slower creation
Independence Data changes affect both Completely independent
Use Case Temporary views, large data Independent manipulation

When to Use Each Type

Use Shallow Copy when:

  • Working with large datasets where memory is a concern
  • Creating temporary views for analysis
  • You want changes to reflect across both objects

Use Deep Copy when:

  • You need completely independent DataFrames
  • Performing different transformations on each copy
  • Want to preserve the original data unchanged

Conclusion

Shallow copies share underlying data and are memory-efficient but changes affect both objects. Deep copies create independent DataFrames with separate memory, ideal when you need complete isolation between original and copied data.

Updated on: 2026-03-27T11:53:19+05:30

492 Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements