Drop a list of rows from a Pandas DataFrame


The pandas library in Python is widely used for representing data in tabular form. A dataset is arranged as a two-dimensional table of rows and columns, and pandas offers many functions for inspecting, cleaning, and analysing it.

This tabular data structure is called a data frame and is created with the pandas DataFrame() constructor. In this article we will perform a simple operation: removing (dropping) multiple rows from a pandas data frame.

First, we prepare a dataset and build a data frame from it with the pandas DataFrame() constructor. Let's begin with this:

Preparing the Dataset

The data from the passed dataset will be arranged in the form of rows and columns.

  • Here, we imported the pandas library as “pd”. We created the dataset with the help of a dictionary of lists.

  • Each key is a student's name, and its value is a list of the marks that student obtained in different subjects.

  • After this, we generated a data frame with the DataFrame() constructor. We did not pass column names explicitly, so the dictionary keys (the students' names) became the column labels. The most important step is labelling the rows: we passed a list of subject names as the index argument, and these become the row labels.

Example

import pandas as pd
dataset = {"Aman":[98, 92, 88, 90, 91], "Raj":[78, 62, 90, 71, 45], "Saloni":[82, 52, 95, 98, 80],}
dataframe = pd.DataFrame(dataset,index=["Physics", "Chemistry", "Maths", "English", "Biology"])
print(dataframe)

Output

           Aman  Raj  Saloni
Physics      98   78      82
Chemistry    92   62      52
Maths        88   90      95
English      90   71      98
Biology      91   45      80

Dropping Rows through Index Values

To drop rows we use the pandas drop() method. By default it returns a new data frame without the specified rows and leaves the original untouched. Following is the syntax of this method:

dataframe.drop(labels=None, *, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')

We don't need every parameter to perform a drop operation; the default values are enough for most of them. There are two ways of specifying the rows to drop: by index position or by row label.

In the first technique, we specify the index position of each row that needs to be dropped.

Example

Following is the implementation of this method. Here,

  • After creating the data frame, we used the drop() method to remove the 3rd and 4th rows from it.

  • We took the original data frame stored in the “dataframe” variable and looked up the index labels of the rows we wanted to remove with “dataframe.index[[2, 3]]”. Positions start at 0, so positions 2 and 3 correspond to the 3rd and 4th rows.

  • A new data frame is created consisting of the remaining rows.

import pandas as pd
dataset = {"Aman":[98, 92, 88, 90, 91], "Raj":[78, 62, 90, 71, 45], "Saloni":[82, 52, 95, 98, 80],}
dataframe = pd.DataFrame(dataset,index=["Physics", "Chemistry", "Maths", "English", "Biology"])
print(dataframe)
# dataframe.index[[2, 3]] returns the labels at positions 2 and 3 ("Maths", "English")
drop_dataframe = dataframe.drop(dataframe.index[[2, 3]])
print("After dropping 3rd and 4th row")
print(drop_dataframe)

Output

           Aman  Raj  Saloni
Physics      98   78      82
Chemistry    92   62      52
Maths        88   90      95
English      90   71      98
Biology      91   45      80
After dropping 3rd and 4th row
           Aman  Raj  Saloni
Physics      98   78      82
Chemistry    92   62      52
Biology      91   45      80
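To make the role of “dataframe.index[[2, 3]]” more concrete, the following minimal sketch prints the labels it returns and checks that passing them through the index keyword gives the same result. The variable names labels_to_drop, by_position and by_keyword are introduced here only for illustration.

```python
import pandas as pd

dataset = {"Aman": [98, 92, 88, 90, 91], "Raj": [78, 62, 90, 71, 45], "Saloni": [82, 52, 95, 98, 80]}
dataframe = pd.DataFrame(dataset, index=["Physics", "Chemistry", "Maths", "English", "Biology"])

# Positions 2 and 3 are translated into the labels stored at those positions.
labels_to_drop = dataframe.index[[2, 3]]
print(list(labels_to_drop))

# drop() always receives labels, so the index= keyword form is equivalent.
by_position = dataframe.drop(dataframe.index[[2, 3]])
by_keyword = dataframe.drop(index=labels_to_drop)
print(by_position.equals(by_keyword))
```

Because drop() works on labels, a data frame whose index contains duplicate labels would lose every row that shares a dropped label, not only the row at the chosen position.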

Dropping Rows Through Labels or Row Names

In this technique, we use the exact names (labels) of the rows that we want to drop from the data frame. We again use the drop() method. Here,

  • We used the same drop() method to remove the 3rd and 4th rows from the data frame, but this time we passed the row names that we assigned while constructing the data frame.

  • A new data frame is created and the original data frame remains unchanged.

Example

import pandas as pd
dataset = {"Aman":[98, 92, 88, 90, 91], "Raj":[78, 62, 90, 71, 45], "Saloni":[82, 52, 95, 98, 80],}
dataframe = pd.DataFrame(dataset,index=["Physics", "Chemistry", "Maths", "English", "Biology"])
print(dataframe)
# Pass the row labels directly as a list
drop_dataframe = dataframe.drop(["Maths", "English"])
print("After dropping 3rd and 4th row")
print(drop_dataframe)

Output

           Aman  Raj  Saloni
Physics      98   78      82
Chemistry    92   62      52
Maths        88   90      95
English      90   71      98
Biology      91   45      80
After dropping 3rd and 4th row
           Aman  Raj  Saloni
Physics      98   78      82
Chemistry    92   62      52
Biology      91   45      80
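When rows are dropped by label, a label that is missing from the index raises a KeyError, because the errors parameter shown in the syntax defaults to 'raise'. The sketch below illustrates this and the errors="ignore" alternative. The label "History" is a hypothetical one chosen because it does not exist in this data frame, and the name lenient is used only for illustration.

```python
import pandas as pd

dataset = {"Aman": [98, 92, 88, 90, 91], "Raj": [78, 62, 90, 71, 45], "Saloni": [82, 52, 95, 98, 80]}
dataframe = pd.DataFrame(dataset, index=["Physics", "Chemistry", "Maths", "English", "Biology"])

# "History" is a hypothetical label that is not present in the index.
try:
    dataframe.drop(["Maths", "History"])
except KeyError as exc:
    print("KeyError:", exc)

# errors="ignore" drops the labels that exist and silently skips the rest.
lenient = dataframe.drop(["Maths", "History"], errors="ignore")
print(lenient)
```

Using errors="ignore" is convenient when the list of labels comes from another source, but it can also hide typos in label names, so the default is usually the safer choice.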

We can also pass the “inplace” argument if we don't want to create another data frame. Its default value is “False”, which makes drop() return a new data frame. When it is set to “True”, drop() modifies the current data frame directly and returns None.
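The effect of the inplace argument can be sketched as follows. The variable name result is used only to show what drop() returns in this mode.

```python
import pandas as pd

dataset = {"Aman": [98, 92, 88, 90, 91], "Raj": [78, 62, 90, 71, 45], "Saloni": [82, 52, 95, 98, 80]}
dataframe = pd.DataFrame(dataset, index=["Physics", "Chemistry", "Maths", "English", "Biology"])

# With inplace=True, drop() changes dataframe itself and returns None.
result = dataframe.drop(["Maths", "English"], inplace=True)
print(result)
print(dataframe)
```

Reassigning the result, as in dataframe = dataframe.drop(["Maths", "English"]), leaves the variable holding the same rows and is a common alternative to inplace=True.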

Using Index Slicing

We can also drop a list of rows using index slicing. Following is an example,

  • Here, we sliced the index to select a range of rows to drop.

  • We printed the original data frame and then used the slice “dataframe.index[2:4]” to select positions 2 and 3 (the stop position 4 is excluded). We passed the result to the “dataframe.drop()” method to drop these rows.

  • Finally, a new data frame is created consisting of the remaining rows.

Example

import pandas as pd
dataset = {"Aman":[98, 92, 88, 90, 91], "Raj":[78, 62, 90, 71, 45], "Saloni":[82, 52, 95, 98, 80],}
dataframe = pd.DataFrame(dataset,index=["Physics", "Chemistry", "Maths", "English", "Biology"])
print(dataframe)
drop_dataframe = dataframe.drop(dataframe.index[2:4])
print("After dropping 3rd and 4th row")
print(drop_dataframe)

Output

           Aman  Raj  Saloni
Physics      98   78      82
Chemistry    92   62      52
Maths        88   90      95
English      90   71      98
Biology      91   45      80
After dropping 3rd and 4th row
           Aman  Raj  Saloni
Physics      98   78      82
Chemistry    92   62      52
Biology      91   45      80
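Index slices follow the usual Python slicing rules, so other ranges can be expressed the same way. The sketch below is a small illustration of an exclusive stop position, a negative start, and a step. The variable names without_last_two and every_other_dropped are introduced here only for illustration.

```python
import pandas as pd

dataset = {"Aman": [98, 92, 88, 90, 91], "Raj": [78, 62, 90, 71, 45], "Saloni": [82, 52, 95, 98, 80]}
dataframe = pd.DataFrame(dataset, index=["Physics", "Chemistry", "Maths", "English", "Biology"])

# The stop position is exclusive: [2:4] selects positions 2 and 3 only.
print(list(dataframe.index[2:4]))

# A negative start counts from the end, so [-2:] selects the last two rows.
without_last_two = dataframe.drop(dataframe.index[-2:])
print(without_last_two)

# A step is also allowed: [::2] selects every second row, starting at position 0.
every_other_dropped = dataframe.drop(dataframe.index[::2])
print(every_other_dropped)
```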

Conclusion

In this article, we created a pandas data frame and looked at different ways to drop multiple rows from it. We specified the rows to remove in two ways: by index position and by row label. Finally, we used index slicing to drop a contiguous range of rows.

Updated on: 05-May-2023
