Drop Columns in a DataFrame by Label Names or by Index Positions
A pandas DataFrame is a 2D data structure for storing tabular data. When working with DataFrames, you often need to remove unwanted columns. This can be done by specifying column names or their index positions using the drop() method.
In this tutorial, we'll explore different methods to drop columns from a pandas DataFrame including dropping by names, index positions, and ranges.
Creating the Sample DataFrame
Let's start by creating a sample DataFrame to work with:
import pandas as pd
dataset = {
    "Employee ID": ["CIR45", "CIR12", "CIR18", "CIR50", "CIR28"],
    "Age": [25, 28, 27, 26, 25],
    "Salary": [200000, 250000, 180000, 300000, 280000],
    "Role": ["Junior Developer", "Analyst", "Programmer", "Senior Developer", "HR"]
}
df = pd.DataFrame(dataset, index=["Nimesh", "Arjun", "Mohan", "Ritesh", "Raghav"])
print(df)
        Employee ID  Age  Salary              Role
Nimesh        CIR45   25  200000  Junior Developer
Arjun         CIR12   28  250000           Analyst
Mohan         CIR18   27  180000        Programmer
Ritesh        CIR50   26  300000  Senior Developer
Raghav        CIR28   25  280000                HR
Method 1: Drop Columns by Names
The most straightforward way is to pass the column names directly to drop(). Set axis=1 so that drop() looks for the labels among the columns rather than the rows:
import pandas as pd
dataset = {
    "Employee ID": ["CIR45", "CIR12", "CIR18", "CIR50", "CIR28"],
    "Age": [25, 28, 27, 26, 25],
    "Salary": [200000, 250000, 180000, 300000, 280000],
    "Role": ["Junior Developer", "Analyst", "Programmer", "Senior Developer", "HR"]
}
df = pd.DataFrame(dataset, index=["Nimesh", "Arjun", "Mohan", "Ritesh", "Raghav"])
# Drop columns by name
result = df.drop(["Role", "Salary"], axis=1)
print("After dropping Role and Salary columns:")
print(result)
After dropping Role and Salary columns:
        Employee ID  Age
Nimesh        CIR45   25
Arjun         CIR12   28
Mohan         CIR18   27
Ritesh        CIR50   26
Raghav        CIR28   25
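As a small aside on the name-based approach, the sketch below shows two drop() options that are not used in the example above: the columns= keyword, which can replace axis=1, and errors="ignore", which skips labels that are not present. It uses a shortened two-row version of the sample data, and "Bonus" is a hypothetical column name chosen only to show a missing label.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Employee ID": ["CIR45", "CIR12"],
        "Age": [25, 28],
        "Salary": [200000, 250000],
        "Role": ["Junior Developer", "Analyst"],
    },
    index=["Nimesh", "Arjun"],
)

# columns= is equivalent to passing the labels together with axis=1
by_keyword = df.drop(columns=["Role", "Salary"])
by_axis = df.drop(["Role", "Salary"], axis=1)
print(by_keyword.equals(by_axis))  # True

# drop() returns a new DataFrame; the original still has all four columns
print(list(df.columns))

# A missing label raises KeyError unless errors="ignore" is passed.
# "Bonus" is a hypothetical column that does not exist in df.
lenient = df.drop(columns=["Role", "Bonus"], errors="ignore")
print(list(lenient.columns))  # ['Employee ID', 'Age', 'Salary']
```

Because drop() does not modify df, assign the result to a variable (or back to df) to keep the change.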
Method 2: Drop Columns by Index Positions
You can also drop columns by their zero-based index positions. Indexing df.columns with a list of positions, as in df.columns[[2, 3]], returns the labels at those positions, which are then passed to drop():
import pandas as pd
dataset = {
    "Employee ID": ["CIR45", "CIR12", "CIR18", "CIR50", "CIR28"],
    "Age": [25, 28, 27, 26, 25],
    "Salary": [200000, 250000, 180000, 300000, 280000],
    "Role": ["Junior Developer", "Analyst", "Programmer", "Senior Developer", "HR"]
}
df = pd.DataFrame(dataset, index=["Nimesh", "Arjun", "Mohan", "Ritesh", "Raghav"])
# Drop columns by index (Salary=2, Role=3)
result = df.drop(df.columns[[2, 3]], axis=1)
print("After dropping columns at positions 2 and 3:")
print(result)
After dropping columns at positions 2 and 3:
        Employee ID  Age
Nimesh        CIR45   25
Arjun         CIR12   28
Mohan         CIR18   27
Ritesh        CIR50   26
Raghav        CIR28   25
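The position-based call still drops by label in the end, which matters when column names repeat. The following sketch, on a shortened version of the sample data, shows what df.columns[[...]] returns, how negative positions behave, and one possible way to remove a single position when labels are duplicated. The frame dup and its column names are hypothetical and exist only for this illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Employee ID": ["CIR45", "CIR12"],
        "Age": [25, 28],
        "Salary": [200000, 250000],
        "Role": ["Junior Developer", "Analyst"],
    },
    index=["Nimesh", "Arjun"],
)

# Indexing df.columns with a list of positions returns the labels stored there
labels = df.columns[[2, 3]]
print(list(labels))  # ['Salary', 'Role']

# Negative positions count from the end, so -1 is the last column
without_last = df.drop(df.columns[[-1]], axis=1)
print(list(without_last.columns))  # ['Employee ID', 'Age', 'Salary']

# Caveat: drop() works on labels, so a repeated label removes every match.
# "dup" is a hypothetical frame with two columns named "a".
dup = pd.DataFrame([[1, 2, 3]], columns=["a", "b", "a"])
by_label = dup.drop(dup.columns[[0]], axis=1)
print(list(by_label.columns))  # ['b']

# Selecting the positions to keep with iloc removes only position 0
keep = [i for i in range(dup.shape[1]) if i != 0]
by_position = dup.iloc[:, keep]
print(list(by_position.columns))  # ['b', 'a']
```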
Method 3: Drop Range of Columns Using iloc
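To drop a contiguous range of columns by position, slice by position. The example below slices df.columns, which selects the same columns as df.iloc[:, 1:4].columns. As with iloc, the end index is excluded, so 1:4 covers positions 1, 2, and 3: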
import pandas as pd
dataset = {
    "Employee ID": ["CIR45", "CIR12", "CIR18", "CIR50", "CIR28"],
    "Age": [25, 28, 27, 26, 25],
    "Salary": [200000, 250000, 180000, 300000, 280000],
    "Role": ["Junior Developer", "Analyst", "Programmer", "Senior Developer", "HR"]
}
df = pd.DataFrame(dataset, index=["Nimesh", "Arjun", "Mohan", "Ritesh", "Raghav"])
# Drop columns at positions 1 to 3 (Age, Salary, Role); the slice end, 4, is excluded
result = df.drop(df.columns[1:4], axis=1)
print("After dropping columns from position 1 to 3:")
print(result)
After dropping columns from position 1 to 3:
        Employee ID
Nimesh        CIR45
Arjun         CIR12
Mohan         CIR18
Ritesh        CIR50
Raghav        CIR28
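To make the link to iloc explicit, here is a minimal sketch on a shortened version of the sample data. It checks that slicing df.columns and slicing through iloc give the same labels, and it adds two slice variants (open-ended and stepped) that do not appear in the example above.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Employee ID": ["CIR45", "CIR12"],
        "Age": [25, 28],
        "Salary": [200000, 250000],
        "Role": ["Junior Developer", "Analyst"],
    },
    index=["Nimesh", "Arjun"],
)

# Both expressions select the labels at positions 1, 2 and 3 (end index 4 excluded)
via_columns = df.columns[1:4]
via_iloc = df.iloc[:, 1:4].columns
print(via_columns.equals(via_iloc))  # True

result = df.drop(via_iloc, axis=1)
print(list(result.columns))  # ['Employee ID']

# Open-ended slice: drop everything from position 2 onward
head_only = df.drop(df.columns[2:], axis=1)
print(list(head_only.columns))  # ['Employee ID', 'Age']

# Stepped slice: drop every second column, starting at position 0
alternate = df.drop(df.columns[::2], axis=1)
print(list(alternate.columns))  # ['Age', 'Role']
```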
Method 4: Drop Range of Columns Using loc
Use loc with column names to drop a range by label. Unlike iloc, a loc slice includes both the start and the end column:
import pandas as pd
dataset = {
    "Employee ID": ["CIR45", "CIR12", "CIR18", "CIR50", "CIR28"],
    "Age": [25, 28, 27, 26, 25],
    "Salary": [200000, 250000, 180000, 300000, 280000],
    "Role": ["Junior Developer", "Analyst", "Programmer", "Senior Developer", "HR"]
}
df = pd.DataFrame(dataset, index=["Nimesh", "Arjun", "Mohan", "Ritesh", "Raghav"])
# Drop columns from Age to Role (inclusive)
result = df.drop(df.loc[:, "Age":"Role"].columns, axis=1)
print("After dropping columns from Age to Role:")
print(result)
After dropping columns from Age to Role:
        Employee ID
Nimesh        CIR45
Arjun         CIR12
Mohan         CIR18
Ritesh        CIR50
Raghav        CIR28
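The inclusive end of a loc slice is easiest to see on a range that stops before the last column. The sketch below, again on a shortened version of the sample data, drops "Age" through "Salary" and then reproduces the same selection with iloc, where the stop position has to be increased by one. Looking positions up with df.columns.get_loc is one possible approach and assumes the column labels are unique.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Employee ID": ["CIR45", "CIR12"],
        "Age": [25, 28],
        "Salary": [200000, 250000],
        "Role": ["Junior Developer", "Analyst"],
    },
    index=["Nimesh", "Arjun"],
)

# A label slice includes both endpoints: "Age" and "Salary"
labels = df.loc[:, "Age":"Salary"].columns
print(list(labels))  # ['Age', 'Salary']

result = df.drop(labels, axis=1)
print(list(result.columns))  # ['Employee ID', 'Role']

# Open-ended label slice: everything from "Salary" to the last column
tail = df.loc[:, "Salary":].columns
print(list(tail))  # ['Salary', 'Role']

# The same range with iloc needs stop + 1, because iloc excludes the end
start = df.columns.get_loc("Age")    # 1 (labels are unique here)
stop = df.columns.get_loc("Salary")  # 2
same_labels = df.iloc[:, start:stop + 1].columns
print(labels.equals(same_labels))  # True
```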
Comparison of Methods
| Method | Syntax | Best For |
|---|---|---|
| By Names | df.drop(['col1', 'col2'], axis=1) | Known column names |
| By Index | df.drop(df.columns[[0, 1]], axis=1) | Specific positions |
| Range with iloc | df.drop(df.columns[1:4], axis=1) | Position-based ranges |
| Range with loc | df.drop(df.loc[:, 'col1':'col3'].columns, axis=1) | Label-based ranges |
Conclusion
Pandas provides flexible methods to drop columns from DataFrames. Use column names for clarity, index positions for programmatic operations, and loc or iloc slices for range-based dropping. When passing labels positionally, specify axis=1; the default, axis=0, makes drop() look for the labels among the rows.
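To illustrate why the axis matters, the short sketch below calls drop() on a shortened version of the sample data without axis=1 and records what happens, then lists three spellings that target columns. The variable outcome is only a helper for this illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Employee ID": ["CIR45", "CIR12"],
        "Age": [25, 28],
        "Salary": [200000, 250000],
        "Role": ["Junior Developer", "Analyst"],
    },
    index=["Nimesh", "Arjun"],
)

# Without axis=1, drop() searches the row labels ("Nimesh", "Arjun") for "Role"
try:
    df.drop(["Role"])
    outcome = "dropped"
except KeyError:
    outcome = "KeyError"
print(outcome)  # KeyError

# Three equivalent ways to target columns
a = df.drop(["Role"], axis=1)
b = df.drop(["Role"], axis="columns")
c = df.drop(columns=["Role"])
print(a.equals(b) and b.equals(c))  # True
```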
