Python - Select multiple columns from a Pandas dataframe
Selecting multiple columns from a Pandas DataFrame is a common operation in data analysis. Pass a list of column names inside the indexing brackets to get a new DataFrame that contains only those columns.
Basic Syntax
To select multiple columns, use double square brackets. The outer pair indexes the DataFrame, and the inner pair is a list of column names:
# Syntax: df[['column1', 'column2', 'column3']]
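The difference between single and double brackets is easiest to see by checking the type of the result. The following is a minimal sketch; the frame `df` and its values are made up purely for illustration, reusing the placeholder column names from the syntax line above.

```python
import pandas as pd

# Hypothetical frame, used only to illustrate the return types
df = pd.DataFrame({'column1': [1, 2], 'column2': [3, 4], 'column3': [5, 6]})

single = df['column1']                # one label         -> Series
subset = df[['column1', 'column2']]   # list of labels    -> DataFrame
one_col_frame = df[['column1']]       # one-element list  -> still a DataFrame

print(type(single).__name__)          # Series
print(type(subset).__name__)          # DataFrame
print(type(one_col_frame).__name__)   # DataFrame
```

A one-element list is a handy way to keep a single column as a DataFrame when later code expects two-dimensional data.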
Creating Sample Data
Let's create a sample DataFrame to demonstrate column selection:
import pandas as pd
# Create sample sales data
data = {
    'Car': ['BMW', 'Lexus', 'Audi', 'Jaguar', 'Mustang'],
    'Reg_Price': [2500, 3500, 2500, 2000, 2500],
    'Units': [100, 80, 120, 70, 110],
    'Discount': [5, 10, 8, 12, 6]
}
dataFrame = pd.DataFrame(data)
print("Complete DataFrame:")
print(dataFrame)
Complete DataFrame:
Car Reg_Price Units Discount
0 BMW 2500 100 5
1 Lexus 3500 80 10
2 Audi 2500 120 8
3 Jaguar 2000 70 12
4 Mustang 2500 110 6
Selecting Two Columns
Select specific columns by passing their names in a list:
import pandas as pd
data = {
    'Car': ['BMW', 'Lexus', 'Audi', 'Jaguar', 'Mustang'],
    'Reg_Price': [2500, 3500, 2500, 2000, 2500],
    'Units': [100, 80, 120, 70, 110],
    'Discount': [5, 10, 8, 12, 6]
}
dataFrame = pd.DataFrame(data)
# Select Reg_Price and Units columns
selected_columns = dataFrame[['Reg_Price', 'Units']]
print("Selected columns (Reg_Price and Units):")
print(selected_columns)
Selected columns (Reg_Price and Units):
Reg_Price Units
0 2500 100
1 3500 80
2 2500 120
3 2000 70
4 2500 110
Selecting Multiple Columns
You can select any number of columns by including them in the list:
import pandas as pd
data = {
    'Car': ['BMW', 'Lexus', 'Audi', 'Jaguar', 'Mustang'],
    'Reg_Price': [2500, 3500, 2500, 2000, 2500],
    'Units': [100, 80, 120, 70, 110],
    'Discount': [5, 10, 8, 12, 6]
}
dataFrame = pd.DataFrame(data)
# Select three columns
selected_columns = dataFrame[['Car', 'Reg_Price', 'Discount']]
print("Selected columns (Car, Reg_Price, and Discount):")
print(selected_columns)
Selected columns (Car, Reg_Price, and Discount):
Car Reg_Price Discount
0 BMW 2500 5
1 Lexus 3500 10
2 Audi 2500 8
3 Jaguar 2000 12
4 Mustang 2500 6
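The columns in the result follow the order of the list, not the order in the original DataFrame, so the same syntax can also rearrange columns. A short sketch using the same sample data (the variable name `reordered` is only illustrative):

```python
import pandas as pd

data = {
    'Car': ['BMW', 'Lexus', 'Audi', 'Jaguar', 'Mustang'],
    'Reg_Price': [2500, 3500, 2500, 2000, 2500],
    'Units': [100, 80, 120, 70, 110],
    'Discount': [5, 10, 8, 12, 6]
}
dataFrame = pd.DataFrame(data)

# 'Discount' comes after 'Car' in the original frame, but first in the list
reordered = dataFrame[['Discount', 'Car']]

print(list(dataFrame.columns))   # original column order
print(list(reordered.columns))   # order taken from the list
```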
Using Variables for Column Names
Store column names in a variable for reusability and cleaner code:
import pandas as pd
data = {
    'Car': ['BMW', 'Lexus', 'Audi', 'Jaguar', 'Mustang'],
    'Reg_Price': [2500, 3500, 2500, 2000, 2500],
    'Units': [100, 80, 120, 70, 110],
    'Discount': [5, 10, 8, 12, 6]
}
dataFrame = pd.DataFrame(data)
# Define columns to select
columns_to_select = ['Car', 'Units']
result = dataFrame[columns_to_select]
print("Using variable for column selection:")
print(result)
Using variable for column selection:
Car Units
0 BMW 100
1 Lexus 80
2 Audi 120
3 Jaguar 70
4 Mustang 110
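When the list of names comes from a variable, it may contain a name that is not in the DataFrame, and the selection then raises a KeyError. One possible way to guard against this is to filter the list first. The sketch below assumes a hypothetical `'Colour'` column that does not exist in the sample data; the names `wanted` and `available` are illustrative.

```python
import pandas as pd

data = {
    'Car': ['BMW', 'Lexus', 'Audi', 'Jaguar', 'Mustang'],
    'Reg_Price': [2500, 3500, 2500, 2000, 2500],
    'Units': [100, 80, 120, 70, 110],
    'Discount': [5, 10, 8, 12, 6]
}
dataFrame = pd.DataFrame(data)

# 'Colour' is a hypothetical name that is not present in the frame
wanted = ['Car', 'Units', 'Colour']

# Selecting with a missing label raises KeyError
raised = False
try:
    dataFrame[wanted]
except KeyError as err:
    raised = True
    print("Missing column:", err)

# Keep only the names that actually exist, preserving the requested order
available = [name for name in wanted if name in dataFrame.columns]
result = dataFrame[available]
print(result)
```

Silently dropping missing names is a design choice; in a pipeline where every column is required, letting the KeyError surface is usually the safer behaviour.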
Conclusion
Use double square brackets df[['col1', 'col2']] to select multiple columns from a Pandas DataFrame. This creates a new DataFrame containing only the specified columns, which is useful for data analysis and visualization tasks.
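Because the selection is a new DataFrame, it can be changed without affecting the original. The sketch below adds an explicit `.copy()`, which is an assumption on top of the examples above: it makes the intent clear and avoids the SettingWithCopyWarning that some pandas versions emit when a selection is modified.

```python
import pandas as pd

data = {
    'Car': ['BMW', 'Lexus', 'Audi', 'Jaguar', 'Mustang'],
    'Reg_Price': [2500, 3500, 2500, 2000, 2500],
    'Units': [100, 80, 120, 70, 110],
    'Discount': [5, 10, 8, 12, 6]
}
dataFrame = pd.DataFrame(data)

# Explicit copy so that later edits clearly apply to the subset only
subset = dataFrame[['Reg_Price', 'Units']].copy()
subset['Units'] = subset['Units'] * 2

print(subset['Units'].tolist())      # doubled values in the subset
print(dataFrame['Units'].tolist())   # original values are unchanged
```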
