How to find numeric columns in Pandas?
To find numeric columns in Pandas, we can use the select_dtypes() method to filter columns based on their data types. It returns a new DataFrame containing only the columns whose dtypes match the include argument, or do not match the exclude argument.
Basic Example
Let's start by creating a DataFrame with mixed column types and inspecting its dtypes, which are what select_dtypes() filters on:
import pandas as pd
# Create a DataFrame with mixed data types
df = pd.DataFrame({
'name': ['John', 'Jacob', 'Tom', 'Tim', 'Ally'],
'marks': [89, 23, 100, 56, 90],
'subjects': ["Math", "Physics", "Chemistry", "Biology", "English"]
})
print("Input DataFrame:")
print(df)
print("\nData types:")
print(df.dtypes)
Input DataFrame:
name marks subjects
0 John 89 Math
1 Jacob 23 Physics
2 Tom 100 Chemistry
3 Tim 56 Biology
4 Ally 90 English
Data types:
name object
marks int64
subjects object
dtype: object
Method 1: Using select_dtypes() with Specific Types
Select columns by listing the exact numeric data types to include:
import pandas as pd
df = pd.DataFrame({
'name': ['John', 'Jacob', 'Tom', 'Tim', 'Ally'],
'marks': [89, 23, 100, 56, 90],
'subjects': ["Math", "Physics", "Chemistry", "Biology", "English"]
})
# Specify the exact integer types to match (float columns are not included)
numeric_types = ['int16', 'int32', 'int64']
numeric_df = df.select_dtypes(include=numeric_types)
print("Numeric columns only:")
print(numeric_df)
Numeric columns only:
marks
0 89
1 23
2 100
3 56
4 90
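One caveat with an explicit list is that any dtype left out of it is silently skipped. The following is a minimal sketch, using a hypothetical scores frame with an extra average column that is not part of the example above, showing that an integer-only list misses a float column until 'float64' is added:

```python
import pandas as pd

# Hypothetical frame for illustration: one integer and one float column
scores = pd.DataFrame({
    'name': ['John', 'Jacob', 'Tom'],
    'marks': [89, 23, 100],
    'average': [88.5, 40.25, 97.0],
})

# An integer-only list matches 'marks' but not the float column
int_only = scores.select_dtypes(include=['int16', 'int32', 'int64'])
print(int_only.columns.tolist())

# Adding 'float64' to the list picks up 'average' as well
int_and_float = scores.select_dtypes(
    include=['int16', 'int32', 'int64', 'float64']
)
print(int_and_float.columns.tolist())
```

If the list of types is meant to cover every numeric column, the generic 'number' selector is usually the simpler choice.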
Method 2: Using select_dtypes() with 'number'
Use the generic 'number' selector to include all integer and float columns. Boolean columns are not treated as numeric here, as the output below shows:
import pandas as pd
import numpy as np
# Create DataFrame with various numeric types
df = pd.DataFrame({
'integers': [1, 2, 3, 4],
'floats': [1.5, 2.7, 3.2, 4.1],
'names': ['A', 'B', 'C', 'D'],
'booleans': [True, False, True, False]
})
print("All columns:")
print(df.dtypes)
# Select all numeric columns
numeric_df = df.select_dtypes(include='number')
print("\nNumeric columns:")
print(numeric_df)
All columns:
integers int64
floats float64
names object
booleans bool
dtype: object
Numeric columns:
integers floats
0 1 1.5
1 2 2.7
2 3 3.2
3 4 4.1
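If boolean columns should count as numeric for your purposes, one option is to list 'bool' alongside 'number'. This is a minimal sketch under that assumption; the flags frame is a hypothetical copy of the data above:

```python
import pandas as pd

# Hypothetical frame mirroring the example above
flags = pd.DataFrame({
    'integers': [1, 2, 3, 4],
    'floats': [1.5, 2.7, 3.2, 4.1],
    'names': ['A', 'B', 'C', 'D'],
    'booleans': [True, False, True, False],
})

# 'number' on its own skips the bool column
numeric_only = flags.select_dtypes(include='number')
print(numeric_only.columns.tolist())

# Listing 'bool' next to 'number' keeps it
numeric_and_bool = flags.select_dtypes(include=['number', 'bool'])
print(numeric_and_bool.columns.tolist())
```

The selected columns keep their original order in the DataFrame.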
Method 3: Getting Column Names Only
If you only need the column names, use the columns attribute of the filtered DataFrame:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'name': ['John', 'Jacob', 'Tom'],
'age': [25, 30, 28],
'salary': [50000.0, 60000.0, 55000.0],
'department': ['IT', 'HR', 'Finance']
})
# Get numeric column names
numeric_columns = df.select_dtypes(include='number').columns.tolist()
print("Numeric column names:", numeric_columns)
# Alternative: using numpy
numeric_cols_np = df.select_dtypes(include=[np.number]).columns.tolist()
print("Using np.number:", numeric_cols_np)
Numeric column names: ['age', 'salary']
Using np.number: ['age', 'salary']
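The same pattern works in reverse with the exclude parameter, which can be handy for listing the non-numeric columns. A minimal sketch, where the staff frame is a hypothetical copy of the data above:

```python
import pandas as pd

# Hypothetical frame mirroring the example above
staff = pd.DataFrame({
    'name': ['John', 'Jacob', 'Tom'],
    'age': [25, 30, 28],
    'salary': [50000.0, 60000.0, 55000.0],
    'department': ['IT', 'HR', 'Finance'],
})

# exclude='number' drops integer and float columns and keeps the rest
non_numeric_columns = staff.select_dtypes(exclude='number').columns.tolist()
print("Non-numeric column names:", non_numeric_columns)
```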
Comparison of Methods
| Method | Usage | Best For |
|---|---|---|
| Specific types list | include=['int64', 'float64'] | Exact type control |
| Generic 'number' | include='number' | All numeric types |
| NumPy number | include=[np.number] | NumPy compatibility |
Conclusion
Use select_dtypes(include='number') to find all numeric columns in a Pandas DataFrame. It matches every integer and float dtype without listing each one. Pass an explicit list of dtypes instead when you need finer control over which columns are selected.
