Highlight the maximum value in each column in Pandas


In data analysis and exploration tasks, it is often useful to see at a glance which value is the largest in each column of a Pandas DataFrame. Pandas can apply visual styles to a DataFrame, so the maximum values can be made to stand out when the table is displayed, for example in a Jupyter notebook. This lets analysts spot the highest values quickly instead of scanning every row.

This article covers two methods for highlighting the maximum value in each column: the built-in Styler.highlight_max() method and a custom function passed to Styler.apply().

How to highlight the maximum value in each column in Pandas?

Pandas exposes styling through the DataFrame.style property, which returns a Styler object. The Styler does not change the underlying data; it only controls how the DataFrame is rendered. Both methods below work on this object.

Method 1: Using Styler.highlight_max()

The Pandas Styler offers a convenient built-in method, highlight_max(), that highlights the maximum value along a chosen axis. With axis=0 (the default), the maximum is found separately for each column.

The method takes care of finding the maximum and applying a background color to it, so no custom function is needed. The color parameter sets the highlight color, and the subset parameter restricts the highlighting to selected columns.

Example

# Import the required library
import pandas as pd

# Create a dictionary for the DataFrame
diction = {'Name': ['Sai', 'Prema', 'Akrit', 'Suchitra', 'Abhimanu'],
           'Age': [20, 23, 41, 29, 32],
           'Marks': [92, 84, 35, 88, 83]}

# Convert the dictionary to a Pandas DataFrame
dfd = pd.DataFrame(diction)

# Display the DataFrame (as the last expression in a notebook cell)
dfd

# Highlight the maximum values of the
# last 2 columns
dfd.style.highlight_max(subset = ['Age', 'Marks'], color = 'pink', axis = 0)

Output

With axis=0, highlight_max() evaluates each column separately and gives the largest value in each a pink background; if color is omitted, the default is yellow. In the output, 41 is highlighted in the Age column and 92 in the Marks column.
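As a quick cross-check of which cells should be highlighted, the same maxima can be computed without any styling. The sketch below assumes the sample DataFrame from the example above and uses only max() and idxmax(); the variable names numeric_cols, col_max, max_rows, and max_names are illustrative.

```python
import pandas as pd

dfd = pd.DataFrame({
    'Name': ['Sai', 'Prema', 'Akrit', 'Suchitra', 'Abhimanu'],
    'Age': [20, 23, 41, 29, 32],
    'Marks': [92, 84, 35, 88, 83],
})

# Columns whose maxima we want to inspect (an assumption for this sketch)
numeric_cols = ['Age', 'Marks']

# Largest value in each selected column
col_max = dfd[numeric_cols].max()

# Row label where each maximum occurs (first occurrence if there are ties)
max_rows = dfd[numeric_cols].idxmax()

# Names found in the rows that hold each maximum
max_names = dfd.loc[max_rows.tolist(), 'Name'].tolist()

print(col_max)
print(max_names)
```

Note that idxmax() reports only the first occurrence of a tied maximum, whereas highlight_max() highlights every cell equal to the maximum.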

Method 2: Using apply() with Styler

Another approach is to use the apply() method along with the Styler object to customize the highlighting based on specific criteria.

Styler.apply() takes a function and calls it column by column (axis=0, the default), row by row (axis=1), or once on the whole table (axis=None). For column-wise use, the function receives each column as a Series and must return a list of CSS strings of the same length, one per cell, such as 'color: red' or 'background-color: yellow'. An empty string leaves a cell unstyled. Because the function is ordinary Python, the rule that decides which cells are styled, and the style they receive, is entirely up to you.

Example

# Import the required library
import pandas as pd

# Create a dictionary for the DataFrame
diction = {'Name': ['Sai', 'Prema', 'Akrit', 'Suchitra', 'Abhimanu'],
           'Age': [20, 23, 41, 29, 32],
           'Marks': [92, 84, 35, 88, 83]}

# Convert the dictionary to a Pandas DataFrame
dfd = pd.DataFrame(diction)

# Display the DataFrame (as the last expression in a notebook cell)
dfd

# Return one CSS string per cell: red text for the
# column maximum, no styling for the other cells
def h_max(s):
    is_max = s == s.max()
    return ['color: red' if cell else '' for cell in is_max]

# Apply h_max to each column (axis=0 is the default)
dfd.style.apply(h_max)

Output

In this approach, we define a custom function h_max() that compares each value in a column to the column maximum using s == s.max(). It returns a list of CSS strings, one per cell: 'color: red' for the maximum and an empty string for every other cell. We then pass this function to dfd.style.apply(), which calls it once per column by default (axis=0), so the maximum value of each column is rendered in red text.
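One possible variation, sketched here under the assumption that only the numeric columns should be styled, takes the CSS string as a parameter and limits the styling with the subset argument of Styler.apply(). The names highlight_col_max and style_numeric_max are hypothetical helpers introduced for illustration.

```python
import pandas as pd

def highlight_col_max(s, css='background-color: yellow'):
    """Return one CSS string per cell: `css` for the column maximum, '' otherwise."""
    is_max = s == s.max()
    return [css if cell else '' for cell in is_max]

def style_numeric_max(df, columns):
    """Build a Styler that highlights the maximum of each listed column."""
    # subset limits the styling to `columns`; axis=0 processes one column at a time
    return df.style.apply(highlight_col_max, axis=0, subset=columns)

dfd = pd.DataFrame({
    'Name': ['Sai', 'Prema', 'Akrit', 'Suchitra', 'Abhimanu'],
    'Age': [20, 23, 41, 29, 32],
    'Marks': [92, 84, 35, 88, 83],
})

# The style function can be checked on a single column without rendering anything
print(highlight_col_max(dfd['Marks']))
```

In a notebook, evaluating style_numeric_max(dfd, ['Age', 'Marks']) as the last expression in a cell should render the table with a yellow background on 41 and 92. Keeping the style function separate from the Styler call makes it easy to test the rule on plain Series first.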

Conclusion

In this article, we explored two methods for highlighting the maximum value in each column of a Pandas DataFrame: the built-in Styler.highlight_max() and Styler.apply() with a custom function. Both make the largest values easy to identify when the DataFrame is displayed.

Styler.highlight_max() is the simpler choice when a background color is all you need, while apply() gives full control over both the condition and the CSS that is applied.

Updated on: 24-Jul-2023
