Python - How to select a subset of a Pandas DataFrame

A Pandas DataFrame is a two-dimensional, labeled data structure made up of rows and columns. You can select a single column, multiple columns, or rows by position, by label, or by condition.

Creating Sample Data

Let's create a sample DataFrame to demonstrate subset selection:

import pandas as pd

# Create sample data
data = {
    'Car': ['BMW', 'Lexus', 'Audi', 'Jaguar', 'Mustang'],
    'Reg_Price': [2500, 3500, 2500, 2000, 2500],
    'Units': [100, 80, 120, 70, 110]
}

dataFrame = pd.DataFrame(data)
print("Original DataFrame:")
print(dataFrame)

Output:

Original DataFrame:
       Car  Reg_Price  Units
0      BMW       2500    100
1    Lexus       3500     80
2     Audi       2500    120
3   Jaguar       2000     70
4  Mustang       2500    110

Selecting a Single Column

Use square brackets with the column name to select one column. The result is a Series:

import pandas as pd

data = {
    'Car': ['BMW', 'Lexus', 'Audi', 'Jaguar', 'Mustang'],
    'Reg_Price': [2500, 3500, 2500, 2000, 2500],
    'Units': [100, 80, 120, 70, 110]
}

dataFrame = pd.DataFrame(data)

# Select single column
car_column = dataFrame['Car']
print("Single column (Car):")
print(car_column)

Output:

Single column (Car):
0        BMW
1      Lexus
2       Audi
3     Jaguar
4    Mustang
Name: Car, dtype: object
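
The difference between single and double brackets is easy to miss, so here is a minimal sketch of it. It uses a shortened version of the sample data; the variable names as_series and as_frame are illustrative and not part of the original example.

```python
import pandas as pd

dataFrame = pd.DataFrame({
    'Car': ['BMW', 'Lexus', 'Audi'],
    'Units': [100, 80, 120]
})

# Single brackets with one name return a one-dimensional Series
as_series = dataFrame['Car']

# A list containing one name returns a one-column DataFrame
as_frame = dataFrame[['Car']]

print(type(as_series).__name__)          # Series
print(type(as_frame).__name__)           # DataFrame
print(as_series.shape, as_frame.shape)   # (3,) (3, 1)
```

Choose the list form when later code expects a DataFrame, for example when it needs a columns attribute.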

Selecting Multiple Columns

Pass a list of column names inside the square brackets to select multiple columns. The result is a DataFrame:

import pandas as pd

data = {
    'Car': ['BMW', 'Lexus', 'Audi', 'Jaguar', 'Mustang'],
    'Reg_Price': [2500, 3500, 2500, 2000, 2500],
    'Units': [100, 80, 120, 70, 110]
}

dataFrame = pd.DataFrame(data)

# Select multiple columns
subset = dataFrame[['Car', 'Units']]
print("Multiple columns (Car and Units):")
print(subset)

Output:

Multiple columns (Car and Units):
       Car  Units
0      BMW    100
1    Lexus     80
2     Audi    120
3   Jaguar     70
4  Mustang    110
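
Two details of list-based selection are worth a quick sketch: the result follows the order of the list, and a name that is not a column raises a KeyError. The column 'Colour' below is a hypothetical name chosen because it does not exist in the sample data.

```python
import pandas as pd

dataFrame = pd.DataFrame({
    'Car': ['BMW', 'Lexus', 'Audi'],
    'Reg_Price': [2500, 3500, 2500],
    'Units': [100, 80, 120]
})

# Columns come back in the order given in the list
reordered = dataFrame[['Units', 'Car']]
print(list(reordered.columns))   # ['Units', 'Car']

# Asking for a column that does not exist raises KeyError
missing_raised = False
try:
    dataFrame[['Car', 'Colour']]   # 'Colour' is a hypothetical, missing column
except KeyError:
    missing_raised = True
print(missing_raised)            # True
```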

Selecting Rows by Index

Use iloc[] for position-based selection or loc[] for label-based selection. An iloc[] slice excludes its stop position, while a loc[] slice includes its stop label:

import pandas as pd

data = {
    'Car': ['BMW', 'Lexus', 'Audi', 'Jaguar', 'Mustang'],
    'Reg_Price': [2500, 3500, 2500, 2000, 2500],
    'Units': [100, 80, 120, 70, 110]
}

dataFrame = pd.DataFrame(data)

# Select first 3 rows
first_three = dataFrame.iloc[0:3]
print("First 3 rows:")
print(first_three)

# Select rows labeled 1 through 3 (end label included) and two columns
subset = dataFrame.loc[1:3, ['Car', 'Reg_Price']]
print("\nRows 1-3, specific columns:")
print(subset)

Output:

First 3 rows:
     Car  Reg_Price  Units
0    BMW       2500    100
1  Lexus       3500     80
2   Audi       2500    120

Rows 1-3, specific columns:
      Car  Reg_Price
1   Lexus       3500
2    Audi       2500
3  Jaguar       2000
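
With the default integer index, positions and labels look the same, which hides how iloc[] and loc[] differ. The following sketch assumes a DataFrame with string labels ('a' to 'd', chosen only for illustration) to make the difference visible.

```python
import pandas as pd

dataFrame = pd.DataFrame(
    {
        'Car': ['BMW', 'Lexus', 'Audi', 'Jaguar'],
        'Units': [100, 80, 120, 70]
    },
    index=['a', 'b', 'c', 'd']   # hypothetical string labels
)

# iloc slices by position; the stop position (3) is excluded
by_position = dataFrame.iloc[1:3]

# loc slices by label; the stop label ('d') is included
by_label = dataFrame.loc['b':'d']

print(list(by_position.index))   # ['b', 'c']
print(list(by_label.index))      # ['b', 'c', 'd']
```

This is why dataFrame.iloc[0:3] returns three rows in the example above, while dataFrame.loc[1:3] also returns three rows even though it starts at 1.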

Selection Methods Comparison

Method                 Use Case           Example
df['col']              Single column      df['Car']
df[['col1', 'col2']]   Multiple columns   df[['Car', 'Units']]
df.iloc[rows, cols]    Position-based     df.iloc[0:3, 1:3]
df.loc[rows, cols]     Label-based        df.loc[0:2, 'Car':'Units']
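
The introduction also mentions selecting rows based on conditions, which none of the examples cover. A minimal sketch using a boolean mask on the same sample data might look like this; the variable names over_100, mask, and cars are illustrative.

```python
import pandas as pd

dataFrame = pd.DataFrame({
    'Car': ['BMW', 'Lexus', 'Audi', 'Jaguar', 'Mustang'],
    'Reg_Price': [2500, 3500, 2500, 2000, 2500],
    'Units': [100, 80, 120, 70, 110]
})

# A comparison on a column yields a boolean Series; using it inside
# square brackets keeps only the rows where it is True
over_100 = dataFrame[dataFrame['Units'] > 100]
print(list(over_100['Car']))   # ['Audi', 'Mustang']

# Combine conditions with & (and) or | (or), wrapping each in parentheses;
# loc[] accepts the mask together with a column list
mask = (dataFrame['Reg_Price'] == 2500) & (dataFrame['Units'] < 115)
cars = dataFrame.loc[mask, ['Car', 'Units']]
print(list(cars['Car']))       # ['BMW', 'Mustang']
```

The parentheses around each condition are needed because & binds more tightly than the comparison operators in Python.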

Conclusion

Use square brackets [] for simple column selection, iloc[] for position-based indexing, and loc[] for label-based selection. Remember that iloc[] slices exclude the stop position, while loc[] slices include the stop label.

Updated on: 2026-03-26T13:33:40+05:30
