Get first n records of a Pandas DataFrame
When working with large datasets in Pandas, it is often useful to look at just the first few records rather than the whole dataset. In this article, we will explore several ways to get the first n records of a Pandas DataFrame.
Installation and Setup
We must make sure that Pandas is installed on our system before moving further with the implementation:
pip install pandas
Once installed, we can create a DataFrame or load a CSV and then retrieve the first n records.
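When the data comes from a CSV file, the first n records can also be requested at load time. The following is a minimal sketch, assuming a small in-memory CSV (the `csv_text` string is hypothetical and stands in for a real file) and using the `nrows` parameter of `pd.read_csv`:

```python
import io
import pandas as pd

# Hypothetical CSV content, used here in place of a real file on disk
csv_text = """Name,Age,Country
John,25,USA
Mary,32,Canada
Peter,18,UK
Jane,45,Australia
"""

# nrows limits how many data rows read_csv parses
first_two = pd.read_csv(io.StringIO(csv_text), nrows=2)
print(first_two)
```

With a large file, this avoids parsing rows that will never be used.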
Methods to Get First n Records
A Pandas DataFrame's first n entries can be obtained in several ways. Here are the most commonly used techniques:
df.head(n): Retrieve the first n rows of the DataFrame. The default value of n is 5 if not specified.
df.iloc[:n]: Get the first n rows of the DataFrame using integer position-based indexing.
df.loc[:n-1]: Fetch the first n rows of the DataFrame using label-based indexing. This returns the first n rows only when the DataFrame has the default integer index (0, 1, 2, ...), because loc slices by label and includes the end label.
df[:n]: Get the first n rows by applying Python slice syntax directly to the DataFrame.
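The difference between position-based and label-based selection shows up as soon as the index is no longer 0, 1, 2, and so on. A small sketch, assuming a four-row frame that has been sorted by Age (the variable name `by_age` is illustrative only):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Mary', 'Peter', 'Jane'],
    'Age': [25, 32, 18, 45]
})

# Sorting keeps the original labels, so the index is now 2, 0, 1, 3
by_age = df.sort_values('Age')

n = 2
print(by_age.iloc[:n])      # positional: the first 2 rows (Peter, John)
print(by_age.loc[:n - 1])   # label-based: every row up to label 1 (3 rows)
```

Here `loc[:n - 1]` returns three rows instead of two, so `head(n)` or `iloc[:n]` is the safer choice whenever the index may have been reordered or replaced.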
Example Implementation
Let's first create a sample DataFrame and print it in full:
import pandas as pd
# Create a sample DataFrame
df = pd.DataFrame({
'Name': ['John', 'Mary', 'Peter', 'Jane', 'Mike', 'Alex', 'Sandy', 'Ben', 'Alice', 'Cooper'],
'Age': [25, 32, 18, 45, 27, 39, 32, 19, 29, 18],
'Country': ['USA', 'Canada', 'UK', 'Australia', 'USA', 'Canada', 'UK', 'Australia', 'USA', 'Canada']
})
print("Original DataFrame:")
print(df)
Original DataFrame:
Name Age Country
0 John 25 USA
1 Mary 32 Canada
2 Peter 18 UK
3 Jane 45 Australia
4 Mike 27 USA
5 Alex 39 Canada
6 Sandy 32 UK
7 Ben 19 Australia
8 Alice 29 USA
9 Cooper 18 Canada
Using head() Method
The head() method is the most common and recommended way to get the first n records:
import pandas as pd
df = pd.DataFrame({
'Name': ['John', 'Mary', 'Peter', 'Jane', 'Mike', 'Alex', 'Sandy', 'Ben'],
'Age': [25, 32, 18, 45, 27, 39, 32, 19],
'Country': ['USA', 'Canada', 'UK', 'Australia', 'USA', 'Canada', 'UK', 'Australia']
})
# Get first 5 records using head()
print("First 5 records using head():")
print(df.head(5))
First 5 records using head():
Name Age Country
0 John 25 USA
1 Mary 32 Canada
2 Peter 18 UK
3 Jane 45 Australia
4 Mike 27 USA
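A few edge cases of head() are worth knowing. The sketch below, assuming an eight-row frame with a single Age column, shows the default value of n, an n larger than the frame, and a negative n:

```python
import pandas as pd

df = pd.DataFrame({'Age': [25, 32, 18, 45, 27, 39, 32, 19]})

print(len(df.head()))     # default n=5, so 5 rows
print(len(df.head(20)))   # n larger than the frame: all 8 rows, no error
print(len(df.head(-3)))   # negative n: all rows except the last 3, so 5 rows
```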
Using iloc Indexing
The iloc indexer uses integer position-based indexing to select rows:
import pandas as pd
df = pd.DataFrame({
'Name': ['John', 'Mary', 'Peter', 'Jane', 'Mike', 'Alex', 'Sandy', 'Ben'],
'Age': [25, 32, 18, 45, 27, 39, 32, 19],
'Country': ['USA', 'Canada', 'UK', 'Australia', 'USA', 'Canada', 'UK', 'Australia']
})
# Get first 3 records using iloc
print("First 3 records using iloc:")
print(df.iloc[:3])
First 3 records using iloc:
Name Age Country
0 John 25 USA
1 Mary 32 Canada
2 Peter 18 UK
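Because iloc is positional, it can limit rows and columns in one step. A minimal sketch, assuming the same eight-row sample frame:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Mary', 'Peter', 'Jane', 'Mike', 'Alex', 'Sandy', 'Ben'],
    'Age': [25, 32, 18, 45, 27, 39, 32, 19],
    'Country': ['USA', 'Canada', 'UK', 'Australia', 'USA', 'Canada', 'UK', 'Australia']
})

# First 3 rows and first 2 columns (Name and Age)
subset = df.iloc[:3, :2]
print(subset)

# A slice that runs past the end is truncated rather than raising an error
print(len(df.iloc[:100]))
```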
Using Slice Operator
The slice operator provides a simple way to get the first n records:
import pandas as pd
df = pd.DataFrame({
'Name': ['John', 'Mary', 'Peter', 'Jane', 'Mike', 'Alex', 'Sandy', 'Ben'],
'Age': [25, 32, 18, 45, 27, 39, 32, 19],
'Country': ['USA', 'Canada', 'UK', 'Australia', 'USA', 'Canada', 'UK', 'Australia']
})
# Get first 4 records using slice operator
print("First 4 records using slice operator:")
print(df[:4])
First 4 records using slice operator:
Name Age Country
0 John 25 USA
1 Mary 32 Canada
2 Peter 18 UK
3 Jane 45 Australia
Comparison of Methods
| Method | Syntax | Best For | Performance |
|---|---|---|---|
| head() | df.head(n) | General purpose, most readable | Fast |
| iloc | df.iloc[:n] | Integer-based selection | Fast |
| Slice operator | df[:n] | Quick selection, Pythonic | Fast |
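On a DataFrame with the default integer index, these approaches (and df.loc[:n-1]) return the same rows. A quick check, assuming the eight-row sample frame and n = 4:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Mary', 'Peter', 'Jane', 'Mike', 'Alex', 'Sandy', 'Ben'],
    'Age': [25, 32, 18, 45, 27, 39, 32, 19]
})

n = 4
results = [df.head(n), df.iloc[:n], df[:n], df.loc[:n - 1]]

# DataFrame.equals compares values, labels and dtypes
all_equal = all(r.equals(results[0]) for r in results)
print(all_equal)
```

The loc variant matches only because the index here is 0 through 7; with any other index it selects by label instead.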
Common Use Cases
Getting the first n records has several practical applications:
Exploratory data analysis: Quick way to understand the structure and content of the data.
Data sampling: Extract a small subset of data for quick testing and prototyping, for example in machine learning workflows. Keep in mind that the first n rows are not a random sample.
Data visualization: Plot a subset of data for better performance and clarity.
Data validation: Check data quality and format before processing large datasets.
Conclusion
The head() method is the most commonly used and recommended approach for getting the first n records of a DataFrame. Use iloc for integer-based indexing needs and the slice operator for quick, Pythonic selection. All methods provide fast performance for data exploration and analysis.
