Selecting a Limited Number of Rows from Given Columns in Pandas
Pandas is the go-to library for data manipulation in Python. One common task is selecting a limited number of rows from specific columns in a DataFrame. This article demonstrates various methods to accomplish this with practical examples.
What is Row and Column Selection?
Row and column selection allows you to extract subsets of your DataFrame based on position, labels, or conditions. This is essential for data analysis, preprocessing, and creating focused views of your data.
Method 1: Using iloc for Position-Based Selection
The iloc method selects rows and columns by their integer positions. It's useful when you know exactly which positions you want:
import pandas as pd
# Create a sample DataFrame
data = {
    'Name': ['John', 'Anna', 'Peter', 'Linda', 'Mike'],
    'Age': [28, 24, 35, 32, 30],
    'City': ['New York', 'Paris', 'Berlin', 'London', 'Sydney'],
    'Salary': [50000, 45000, 60000, 55000, 52000]
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
Original DataFrame:
Name Age City Salary
0 John 28 New York 50000
1 Anna 24 Paris 45000
2 Peter 35 Berlin 60000
3 Linda 32 London 55000
4 Mike 30 Sydney 52000
Now let's select the first 3 rows from specific columns:
import pandas as pd
data = {
    'Name': ['John', 'Anna', 'Peter', 'Linda', 'Mike'],
    'Age': [28, 24, 35, 32, 30],
    'City': ['New York', 'Paris', 'Berlin', 'London', 'Sydney'],
    'Salary': [50000, 45000, 60000, 55000, 52000]
}
df = pd.DataFrame(data)
# Select first 3 rows, columns 0 and 1 (Name and Age)
subset = df.iloc[:3, [0, 1]]
print(subset)
Name Age
0 John 28
1 Anna 24
2 Peter 35
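When the column names are known but the rows should still be picked by position, one possible approach is to translate the names into positions first. The sketch below assumes the same sample data and uses `Index.get_indexer`. The variable names `col_positions` and `subset` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda', 'Mike'],
    'Age': [28, 24, 35, 32, 30],
    'City': ['New York', 'Paris', 'Berlin', 'London', 'Sydney'],
    'Salary': [50000, 45000, 60000, 55000, 52000]
})

# Translate column names into integer positions for iloc
col_positions = df.columns.get_indexer(['Name', 'Salary'])

# First 3 rows (stop position 3 is excluded), columns at those positions
subset = df.iloc[:3, col_positions]
print(subset)
```

This keeps the row slice purely positional while avoiding hard-coded column numbers, which can shift if columns are reordered.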
Method 2: Using loc for Label-Based Selection
The loc method selects rows and columns by their labels or names, making your code more readable:
import pandas as pd
data = {
    'Name': ['John', 'Anna', 'Peter', 'Linda', 'Mike'],
    'Age': [28, 24, 35, 32, 30],
    'City': ['New York', 'Paris', 'Berlin', 'London', 'Sydney'],
    'Salary': [50000, 45000, 60000, 55000, 52000]
}
df = pd.DataFrame(data)
# Select index labels 0 through 2 (loc slices include the end label) and specific columns by name
subset = df.loc[:2, ['Name', 'Age', 'City']]
print(subset)
Name Age City
0 John 28 New York
1 Anna 24 Paris
2 Peter 35 Berlin
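One detail worth checking when switching between the two accessors is how the slice end is treated: iloc excludes the stop position, while loc includes the stop label. The following is a minimal sketch on a reduced version of the sample data. The names `by_position`, `by_label`, and `relabeled` are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda', 'Mike'],
    'Age': [28, 24, 35, 32, 30],
})

# iloc slices by position and excludes the stop position
by_position = df.iloc[:2, [0, 1]]

# loc slices by label and includes the stop label
by_label = df.loc[:2, ['Name', 'Age']]

print(len(by_position))  # 2
print(len(by_label))     # 3

# With a non-default index, loc expects the actual labels
relabeled = df.set_index('Name')
print(relabeled.loc[:'Peter', ['Age']])
```

With the default integer index the two look interchangeable, so the off-by-one difference is easy to miss until the index is changed.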
Method 3: Using Boolean Indexing with Limited Results
Boolean indexing filters rows based on conditions. You can combine it with head() to limit the results:
import pandas as pd
data = {
    'Name': ['John', 'Anna', 'Peter', 'Linda', 'Mike'],
    'Age': [28, 24, 35, 32, 30],
    'City': ['New York', 'Paris', 'Berlin', 'London', 'Sydney'],
    'Salary': [50000, 45000, 60000, 55000, 52000]
}
df = pd.DataFrame(data)
# Select rows where Age > 30, show only Name and City, limit to 2 results
subset = df[df['Age'] > 30][['Name', 'City']].head(2)
print(subset)
Name City
2 Peter Berlin
3 Linda London
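The chained form above filters rows and then selects columns in two separate steps. An alternative, sketched here under the assumption of the same sample data, is to pass the boolean mask and the column list to loc in a single call. The name `mask` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda', 'Mike'],
    'Age': [28, 24, 35, 32, 30],
    'City': ['New York', 'Paris', 'Berlin', 'London', 'Sydney'],
    'Salary': [50000, 45000, 60000, 55000, 52000]
})

# Boolean mask: True for rows where Age is greater than 30
mask = df['Age'] > 30

# Filter rows and pick columns in one loc call, then keep at most 2 rows
subset = df.loc[mask, ['Name', 'City']].head(2)
print(subset)
```

The single loc call tends to be the safer choice if the result will later be assigned to, since chained indexing can trigger pandas' SettingWithCopyWarning.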
Method 4: Using head() and tail() with Column Selection
Combine head() or tail() with column selection for simple cases:
import pandas as pd
data = {
    'Name': ['John', 'Anna', 'Peter', 'Linda', 'Mike'],
    'Age': [28, 24, 35, 32, 30],
    'City': ['New York', 'Paris', 'Berlin', 'London', 'Sydney'],
    'Salary': [50000, 45000, 60000, 55000, 52000]
}
df = pd.DataFrame(data)
# Get first 3 rows with selected columns
first_three = df[['Name', 'Salary']].head(3)
print("First 3 rows:")
print(first_three)
print("\nLast 2 rows:")
last_two = df[['Name', 'Age']].tail(2)
print(last_two)
First 3 rows:
Name Salary
0 John 50000
1 Anna 45000
2 Peter 60000
Last 2 rows:
Name Age
3 Linda 32
4 Mike 30
Comparison of Methods
| Method | Use Case | Syntax | Best For |
|---|---|---|---|
| iloc | Position-based selection | df.iloc[:n, [col_indices]] | When you know exact positions |
| loc | Label-based selection | df.loc[:end_label, ['col_names']] | More readable, uses column names; the end label is included |
| Boolean + head() | Conditional selection | df[condition][cols].head(n) | Filtering with row limits |
| head()/tail() | Simple top/bottom selection | df[cols].head(n) | Quick first/last N rows |
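As a quick sanity check, the position-based, label-based, and head() approaches can be compared on the same sample data. This is only a sketch that assumes the default integer index. The names `via_iloc`, `via_loc`, and `via_head` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda', 'Mike'],
    'Age': [28, 24, 35, 32, 30],
    'City': ['New York', 'Paris', 'Berlin', 'London', 'Sydney'],
    'Salary': [50000, 45000, 60000, 55000, 52000]
})

# Three ways to get the first 3 rows of Name and Age
via_iloc = df.iloc[:3, [0, 1]]          # stop position 3 is excluded
via_loc = df.loc[:2, ['Name', 'Age']]   # end label 2 is included
via_head = df[['Name', 'Age']].head(3)

# DataFrame.equals compares values, index, and dtypes
print(via_iloc.equals(via_loc))
print(via_loc.equals(via_head))
```

With a custom or non-integer index the loc variant would need different labels, so the equivalence holds only under the stated assumption.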
Conclusion
Pandas offers multiple methods for selecting limited rows with specific columns. Use iloc for position-based selection, loc for label-based selection, and combine boolean indexing with head() for conditional filtering. Choose the method that best fits your data analysis needs.
