Write a program in Python to set the diagonal of a given DataFrame to 1
In data analysis, we often need to modify specific elements of a DataFrame. One common task is modifying the diagonal elements, that is, the elements where the row position equals the column position ([0,0], [1,1], [2,2], and so on).
Understanding the Diagonal
The diagonal of a square DataFrame consists of elements where row index equals column index. For a 3x3 DataFrame, the diagonal elements are at positions (0,0), (1,1), and (2,2).
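As a quick illustration of the definition above, the following sketch lists the diagonal positions of a 3x3 DataFrame and reads the values stored there. The sample data matches the examples in this article; the variable names `positions` and `diagonal` are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[10, 20, 30], [40, 50, 60], [70, 80, 90]])

# Diagonal positions: the row position equals the column position.
# min(df.shape) keeps this valid even if the frame is not square.
n = min(df.shape)
positions = [(i, i) for i in range(n)]

# Read the current diagonal values without modifying the DataFrame.
diagonal = np.diag(df.to_numpy())

print(positions)          # [(0, 0), (1, 1), (2, 2)]
print(diagonal.tolist())  # [10, 50, 90]
```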
Method 1: Using Nested Loops
We can iterate through all positions and check whether the row index equals the column index:
import pandas as pd
data = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
# Set diagonal elements to 1
for i in range(len(df)):
    for j in range(len(df.columns)):
        if i == j:
            df.iloc[i, j] = 1
print("\nModified DataFrame:")
print(df)
Original DataFrame:
    0   1   2
0  10  20  30
1  40  50  60
2  70  80  90

Modified DataFrame:
    0   1   2
0   1  20  30
1  40   1  60
2  70  80   1
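Since only the positions (i, i) ever change, the inner loop and the `if` check are not strictly needed. A minimal sketch of a single-loop variant, assuming the same sample data, could look like this:

```python
import pandas as pd

df = pd.DataFrame([[10, 20, 30], [40, 50, 60], [70, 80, 90]])

# Visit only the diagonal positions (i, i) instead of every cell.
# min(df.shape) stops at the shorter side if the frame is not square.
for i in range(min(df.shape)):
    df.iloc[i, i] = 1

result = df.to_numpy().tolist()
print(df)
```

This does one assignment per diagonal element, so the work grows with the number of rows rather than with rows times columns.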
Method 2: Using numpy.fill_diagonal()
NumPy's fill_diagonal() function sets the diagonal of an array in place, without an explicit Python loop:
import pandas as pd
import numpy as np
data = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
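# df.values may be a copy or a read-only view depending on the pandas
# version and dtypes, so fill a writable copy and rebuild the DataFrame
values = df.to_numpy(copy=True)
np.fill_diagonal(values, 1)
df = pd.DataFrame(values, index=df.index, columns=df.columns)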
print("\nModified DataFrame:")
print(df)
Original DataFrame:
    0   1   2
0  10  20  30
1  40  50  60
2  70  80  90

Modified DataFrame:
    0   1   2
0   1  20  30
1  40   1  60
2  70  80   1
Comparison
| Method | Performance | Readability | Best For |
|---|---|---|---|
| Nested Loops | Slower | More verbose | Learning purposes |
| fill_diagonal() | Faster | Concise | Production code |
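The performance column can be checked with a rough timing sketch like the one below. The function names `diagonal_with_loops` and `diagonal_with_numpy` and the size `n = 100` are hypothetical choices for this illustration; actual timings depend on the machine and library versions, so none are quoted here.

```python
import timeit

import numpy as np
import pandas as pd


def diagonal_with_loops(df, value=1):
    """Nested-loop approach, applied to a copy of df."""
    out = df.copy()
    for i in range(len(out)):
        for j in range(len(out.columns)):
            if i == j:
                out.iloc[i, j] = value
    return out


def diagonal_with_numpy(df, value=1):
    """fill_diagonal approach on a writable copy of the underlying array."""
    values = df.to_numpy(copy=True)
    np.fill_diagonal(values, value)
    return pd.DataFrame(values, index=df.index, columns=df.columns)


n = 100  # hypothetical size, large enough to show a difference
big = pd.DataFrame(np.arange(n * n).reshape(n, n))

loop_result = diagonal_with_loops(big)
numpy_result = diagonal_with_numpy(big)
same = loop_result.equals(numpy_result)  # both approaches should agree

loop_time = timeit.timeit(lambda: diagonal_with_loops(big), number=3)
numpy_time = timeit.timeit(lambda: diagonal_with_numpy(big), number=3)

print(f"same result: {same}")
print(f"nested loops: {loop_time:.4f}s, fill_diagonal: {numpy_time:.4f}s")
```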
Conclusion
Prefer numpy.fill_diagonal() when performance matters: it sets the diagonal without a Python-level loop. The nested loop approach makes the idea explicit, but it visits every cell, so its cost grows with the number of rows times the number of columns.
