Get first n records of a Pandas DataFrame


Retrieving the first few records is often the first step when working with a large dataset in Pandas, because it gives a quick look at the data without printing all of it. In this article, we will explore several ways to get the first n records of a Pandas DataFrame.

Installation and Syntax

Before moving on to the implementation, make sure that Pandas is installed on your system. If it is not, run the following pip command in your terminal −

pip install pandas

Once installed, we can create a DataFrame or load a CSV and then retrieve the first N records.
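When the data comes from a CSV file, the row limit can also be applied at load time. The following is a minimal sketch that uses the nrows parameter of pd.read_csv; the CSV content and the variable names csv_text and first_rows are made up for illustration, and an in-memory string stands in for a real file path.

```python
import io
import pandas as pd

# Hypothetical CSV content, held in memory so the example is self-contained
csv_text = "Name,Age\nJohn,25\nMary,32\nPeter,18\nJane,45\n"

# nrows limits how many data rows are parsed (the header line is not counted)
first_rows = pd.read_csv(io.StringIO(csv_text), nrows=2)
print(first_rows)
```

With a file on disk, the same call would take the file path in place of the io.StringIO object.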

Algorithm

The first n rows of a Pandas DataFrame can be obtained in several ways. We will stick to the most commonly used techniques −

  • df.head(n) − Retrieve the first n rows of the DataFrame. The default value of n is 5 if not specified.

  • df.iloc[:n] − Get the first n rows of the DataFrame using integer-based indexing.

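  • df.loc[:n] − Fetch rows using label-based indexing. Unlike the other options, the end label is included, so with the default integer index df.loc[:n] returns n + 1 rows (labels 0 through n). Use df.loc[:n-1] to get exactly n rows.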

  • df[:n] − Apply Python's slice operator, familiar from lists and strings, directly to the DataFrame. The slice selects rows by position, so it returns the first n rows.
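The difference between label-based and position-based selection is easiest to see when the index is not the default 0, 1, 2, and so on. The sketch below assumes a small made-up DataFrame named scores with string labels; the names by_position, by_label, by_slice and by_head are purely illustrative.

```python
import pandas as pd

# Hypothetical data with string labels instead of the default integer index
scores = pd.DataFrame(
    {"score": [90, 85, 70, 60, 55]},
    index=["a", "b", "c", "d", "e"],
)

by_position = scores.iloc[:3]  # positions 0, 1, 2 (end position excluded)
by_label = scores.loc[:"c"]    # labels "a" through "c" (end label included)
by_slice = scores[:3]          # plain slice, selects rows by position
by_head = scores.head(3)       # first 3 rows

print(by_label)
```

All four selections contain the same three rows here, but only df.loc needs to know the actual label at which to stop.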

Example

Suppose we have a dataset containing information about several individuals, and we want to explore different ways to retrieve the first 5 rows of this data. To do so, we can load the data into a Pandas DataFrame and compare the available methods.

import pandas as pd

# Make a placeholder dataframe
df = pd.DataFrame(
   {
      'Name': ['John', 'Mary', 'Peter', 'Jane', 'Mike', 'Alex', 'Sandy', 'Ben', 'Alice', 'Mary', 'Cooper', 'Darth', 'Vader'],
        
      'Age': [25, 32, 18, 45, 27, 39, 32, 19, 29, 32, 18, 45, 27],
        
      'Country': ['USA', 'Canada', 'UK', 'Australia', 'USA', 'Canada', 'UK', 'Australia', 'USA', 'Canada', 'UK', 'Australia', 'USA']
   }
)


# Retrieve first 5 records using df.head(n)
print(df.head(5), end="\n-------------------\n")

# Retrieve first 5 records using df.iloc[:n]
print(df.iloc[:5], end="\n-------------------\n")

# Retrieve records using df.loc[:n]; the end label is included, so this prints 6 rows (labels 0 to 5)
print(df.loc[:5], end="\n-------------------\n")

# Retrieve first 5 records using df[:n]
print(df[:5], end="\n-------------------\n")

Output

    Name  Age    Country
0   John   25        USA
1   Mary   32     Canada
2  Peter   18         UK
3   Jane   45  Australia
4   Mike   27        USA
-------------------
    Name  Age    Country
0   John   25        USA
1   Mary   32     Canada
2  Peter   18         UK
3   Jane   45  Australia
4   Mike   27        USA
-------------------
    Name  Age    Country
0   John   25        USA
1   Mary   32     Canada
2  Peter   18         UK
3   Jane   45  Australia
4   Mike   27        USA
5   Alex   39     Canada
-------------------
    Name  Age    Country
0   John   25        USA
1   Mary   32     Canada
2  Peter   18         UK
3   Jane   45  Australia
4   Mike   27        USA
-------------------

Explanation

  • df.head(5) returns the first 5 rows of the DataFrame.

  • df.iloc[:5] returns the first 5 rows of the DataFrame using integer-based (positional) indexing.

  • df.loc[:5] returns 6 rows, because label-based slicing includes the end label 5. To get exactly 5 rows with the default index, use df.loc[:4].

  • df[:5] returns the first 5 rows of the DataFrame using the slice operator.
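Two edge cases of df.head(n) are worth knowing: a value of n larger than the number of rows, and a negative n. The following is a small sketch on a made-up four-row DataFrame; the names df_small, all_rows and all_but_last are illustrative only.

```python
import pandas as pd

# Hypothetical four-row DataFrame
df_small = pd.DataFrame({"value": [1, 2, 3, 4]})

# n exceeds the row count: no error, all 4 rows are returned
all_rows = df_small.head(10)

# Negative n: every row except the last |n| rows
all_but_last = df_small.head(-1)

print(len(all_rows))
print(all_but_last)
```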

Applications

Quick access to the first n rows of a DataFrame is useful in many data analysis tasks, including −

  • Exploratory data analysis − A quick way to understand the structure and content of the data.

  • Machine learning − Extracting a subset of the data for training and testing purposes.

  • Data visualization − Plotting a subset of the data can keep charts quicker to draw and easier to read.

Conclusion

In this article, we looked at four ways to get the first n rows of a Pandas DataFrame: df.head(n), df.iloc[:n], df.loc[:n], and df[:n]. Keep in mind that df.loc[:n] is label-based and includes the end label. Being familiar with these methods helps you inspect large datasets quickly, understand their structure and data types, and analyze them effectively.

Updated on: 18-Jul-2023
