Python Pandas - Filtering columns from a DataFrame on the basis of sum

To filter columns on the basis of their sum, we use the loc indexer with a boolean mask. In our example, each column holds the marks of one student. We sum each column and keep only the students whose total is above 400, i.e. above 80% of a 500-mark total.

At first, create a DataFrame with student records. We have the marks of 3 students, i.e. 3 columns with 5 marks each −

dataFrame = pd.DataFrame({
   'Jacob_Marks': [95, 90, 75, 85, 88],
   'Ted_Marks': [60, 50, 65, 85, 70],
   'Jamie_Marks': [77, 76, 65, 45, 50]
})

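Filter on the basis of the column sums. The comparison yields one True or False per column, and loc keeps all rows but only the columns marked True, i.e. the students with total marks above 400 −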

dataFrame = dataFrame.loc[:, dataFrame.sum(axis=0) > 400]
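
To see why this expression selects columns, it can help to inspect the intermediate values. The following is a minimal sketch using the same data; the names totals, mask and filtered are introduced here only for illustration:

```python
import pandas as pd

dataFrame = pd.DataFrame({
    'Jacob_Marks': [95, 90, 75, 85, 88],
    'Ted_Marks': [60, 50, 65, 85, 70],
    'Jamie_Marks': [77, 76, 65, 45, 50]
})

# sum(axis=0) adds the values down each column, giving one total per student
totals = dataFrame.sum(axis=0)
print(totals)

# comparing the totals with 400 gives a boolean Series indexed by column name
mask = totals > 400
print(mask)

# loc[:, mask] keeps all rows and only the columns where mask is True
filtered = dataFrame.loc[:, mask]
print(list(filtered.columns))
```

The totals are 433 for Jacob, 330 for Ted and 313 for Jamie, so only Jacob_Marks passes the mask.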

Example

Following is the complete code −

import pandas as pd

# create a DataFrame with 3 columns, one per student
dataFrame = pd.DataFrame({
   'Jacob_Marks': [95, 90, 75, 85, 88],
   'Ted_Marks': [60, 50, 65, 85, 70],
   'Jamie_Marks': [77, 76, 65, 45, 50]
})

print("Dataframe...")
print(dataFrame)

# filter on the basis of column sums:
# keep only the students whose total marks are above 400
dataFrame = dataFrame.loc[:, dataFrame.sum(axis=0) > 400]

# display the filtered DataFrame
print("Updated Dataframe...")
print(dataFrame)

Output

This will produce the following output −

Dataframe...
   Jacob_Marks  Ted_Marks  Jamie_Marks
0           95         60           77
1           90         50           76
2           75         65           65
3           85         85           45
4           88         70           50
Updated Dataframe...
   Jacob_Marks
0           95
1           90
2           75
3           85
4           88
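
The fixed value 400 can also be derived from the 80% figure instead of being hard-coded. The sketch below assumes that each of the 5 marks is out of 100, which is consistent with 400 being 80% of the total but is not stated explicitly above. The helper name filter_columns_by_percentage and its parameters are hypothetical and not part of pandas:

```python
import pandas as pd

def filter_columns_by_percentage(df, max_per_row=100, percentage=80):
    """Keep the columns whose sum is above `percentage` of the maximum total.

    Hypothetical helper; assumes every value is scored out of `max_per_row`.
    """
    # maximum possible total for a column = number of rows * maximum per row
    threshold = len(df) * max_per_row * percentage / 100
    return df.loc[:, df.sum(axis=0) > threshold]

dataFrame = pd.DataFrame({
    'Jacob_Marks': [95, 90, 75, 85, 88],
    'Ted_Marks': [60, 50, 65, 85, 70],
    'Jamie_Marks': [77, 76, 65, 45, 50]
})

# with the defaults, the threshold is 5 * 100 * 80 / 100 = 400
result = filter_columns_by_percentage(dataFrame)
print(result.columns.tolist())

# a lower cut-off of 65% (threshold 325) also lets Ted_Marks (total 330) through
relaxed = filter_columns_by_percentage(dataFrame, percentage=65)
print(relaxed.columns.tolist())
```

Computing the threshold from the number of rows keeps the filter valid if more marks are added per student, provided the per-mark maximum assumption still holds.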

AmitDiwan

Updated on: 2021-09-16T06:40:49+05:30
