Get last n records of a Pandas DataFrame


Data analysis often involves large datasets, and you rarely need to inspect every row at once. Extracting the last n records of a Pandas DataFrame is a quick way to check the most recent entries. This article explains how to do that with the tail() method.

Installation and Syntax

pip install pandas

Once Pandas is installed, you can create a DataFrame from a variety of data sources, such as a CSV file, the results of a database query, or a Python dictionary, as in the example below.

import pandas as pd
data = {'Name': ['John', 'Mark', 'Alice', 'Julie', 'Lisa', 'David'],
        'Age': [23, 34, 45, 19, 28, 31],
        'Gender': ['M', 'M', 'F', 'F', 'F', 'M']}
df = pd.DataFrame(data)
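
The DataFrame above is built from a dictionary. For the CSV case mentioned earlier, a minimal sketch might look like the following. The CSV text and the name df_from_csv are made up for illustration, and io.StringIO stands in for a real file path so the snippet runs on its own.

```python
import io
import pandas as pd

# Hypothetical CSV content; in practice this would usually be a file on disk.
csv_text = """Name,Age,Gender
John,23,M
Mark,34,M
Alice,45,F
"""

# read_csv accepts a file path or any file-like object such as StringIO.
df_from_csv = pd.read_csv(io.StringIO(csv_text))
print(df_from_csv)
```

With a real file, the call would be pd.read_csv() with the file's path in place of the StringIO object.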

Algorithm

The tail() method of a Pandas DataFrame returns its last n rows. Retrieving the last n records involves the following steps:

  • Optionally, use the shape attribute to check how many rows the DataFrame contains.

  • Call tail(n) to get the DataFrame's last n rows.

  • Optionally, reset the index of the resulting DataFrame so that it starts from 0.
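
The steps above can be sketched roughly as follows, using the same sample data as the rest of this article. The value n = 2 and the variable names row_count and last_rows are arbitrary choices for illustration.

```python
import pandas as pd

# Same sample data as used elsewhere in this article.
data = {'Name': ['John', 'Mark', 'Alice', 'Julie', 'Lisa', 'David'],
        'Age': [23, 34, 45, 19, 28, 31],
        'Gender': ['M', 'M', 'F', 'F', 'F', 'M']}
df = pd.DataFrame(data)

n = 2  # hypothetical choice of how many trailing rows to keep

# Step 1: shape is a (rows, columns) tuple, so shape[0] is the row count.
row_count = df.shape[0]

# Step 2: tail(n) returns the last n rows with their original index labels.
last_rows = df.tail(n)

# Step 3 (optional): renumber the index from 0 and discard the old labels.
last_rows = last_rows.reset_index(drop=True)

print(row_count)
print(last_rows)
```

Checking the row count first is not required by tail(), but it helps confirm that n is sensible for the data at hand.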

Example

Let's now look at an example that demonstrates how to retrieve the last 3 records of a DataFrame. We will use the DataFrame created in the previous section.

import pandas as pd

data = {'Name': ['John', 'Mark', 'Alice', 'Julie', 'Lisa', 'David'],
        'Age': [23, 34, 45, 19, 28, 31],
        'Gender': ['M', 'M', 'F', 'F', 'F', 'M']}

df = pd.DataFrame(data)

last_3_records = df.tail(3)
print(last_3_records)

Output

    Name  Age Gender
3  Julie   19      F
4   Lisa   28      F
5  David   31      M

The resulting last_3_records DataFrame contains the last three rows of df, which keep their original index labels (3, 4, and 5). Optionally, we can reset the index to start from 0 using the reset_index() method. Passing drop=True discards the old index instead of adding it as a new column.

last_3_records = last_3_records.reset_index(drop=True)

The resulting last_3_records DataFrame will have an index that starts from 0.

    Name  Age Gender
0  Julie   19      F
1   Lisa   28      F
2  David   31      M
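
Beyond the basic call, it can help to know how tail() behaves for other values of n. The sketch below shows the default, an n larger than the DataFrame, a negative n, and an equivalent positional slice with iloc. The variable names are illustrative only.

```python
import pandas as pd

data = {'Name': ['John', 'Mark', 'Alice', 'Julie', 'Lisa', 'David'],
        'Age': [23, 34, 45, 19, 28, 31],
        'Gender': ['M', 'M', 'F', 'F', 'F', 'M']}
df = pd.DataFrame(data)

# With no argument, tail() uses its default of n=5.
default_tail = df.tail()

# An n larger than the number of rows simply returns every row.
oversized = df.tail(10)

# A negative n returns all rows except the first |n| rows.
skip_first_two = df.tail(-2)

# Positional slicing with iloc selects the same rows as df.tail(3).
by_position = df.iloc[-3:]

print(skip_first_two)
print(by_position)
```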

Applications

Retrieving the last n records of a Pandas DataFrame can prove useful in various scenarios, such as:

  • Analyzing the most recent data entries in a large dataset.

  • Testing and validating data insertion into a DataFrame.

  • Building machine learning models using recent data entries.
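
Note that tail() selects rows by position, so "most recent" only holds if the rows are already in chronological order. A minimal sketch of the first scenario, assuming a hypothetical events table with a timestamp column, sorts by time before taking the tail:

```python
import pandas as pd

# Hypothetical event log, deliberately out of chronological order.
events = pd.DataFrame({
    'timestamp': pd.to_datetime(['2023-07-03', '2023-07-01',
                                 '2023-07-04', '2023-07-02']),
    'value': [30, 10, 40, 20],
})

# Sort by time first so that the last rows are the most recent ones.
latest_two = events.sort_values('timestamp').tail(2).reset_index(drop=True)
print(latest_two)
```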

Conclusion

In conclusion, retrieving the last n records of a Pandas DataFrame is straightforward with the tail() method. We can optionally reset the index of the resulting DataFrame to start from 0 with reset_index(). This technique is useful in scenarios such as analyzing recent data entries or validating data insertion into a DataFrame.

Updated on: 18-Jul-2023
