Python Pandas - Home
Python Pandas - Introduction
Python Pandas - Environment Setup
Python Pandas - Basics
Python Pandas - Introduction to Data Structures
Python Pandas - Index Objects
Python Pandas - Panel
Python Pandas - Basic Functionality
Python Pandas - Indexing & Selecting Data
Python Pandas - Series
Python Pandas - Series
Python Pandas - Slicing a Series Object
Python Pandas - Attributes of a Series Object
Python Pandas - Arithmetic Operations on Series Object
Python Pandas - Converting Series to Other Objects
Python Pandas - DataFrame
Python Pandas - DataFrame
Python Pandas - Accessing DataFrame
Python Pandas - Slicing a DataFrame Object
Python Pandas - Modifying DataFrame
Python Pandas - Removing Rows from a DataFrame
Python Pandas - Arithmetic Operations on DataFrame
Python Pandas - IO Tools
Python Pandas - IO Tools
Python Pandas - Working with CSV Format
Python Pandas - Reading & Writing JSON Files
Python Pandas - Reading Data from an Excel File
Python Pandas - Writing Data to Excel Files
Python Pandas - Working with HTML Data
Python Pandas - Clipboard
Python Pandas - Working with HDF5 Format
Python Pandas - Comparison with SQL
Python Pandas - Data Handling
Python Pandas - Sorting
Python Pandas - Reindexing
Python Pandas - Iteration
Python Pandas - Concatenation
Python Pandas - Statistical Functions
Python Pandas - Descriptive Statistics
Python Pandas - Working with Text Data
Python Pandas - Function Application
Python Pandas - Options & Customization
Python Pandas - Window Functions
Python Pandas - Aggregations
Python Pandas - Merging/Joining
Python Pandas - MultiIndex
Python Pandas - Basics of MultiIndex
Python Pandas - Indexing with MultiIndex
Python Pandas - Advanced Reindexing with MultiIndex
Python Pandas - Renaming MultiIndex Labels
Python Pandas - Sorting a MultiIndex
Python Pandas - Binary Operations
Python Pandas - Binary Comparison Operations
Python Pandas - Boolean Indexing
Python Pandas - Boolean Masking
Python Pandas - Data Reshaping & Pivoting
Python Pandas - Pivoting
Python Pandas - Stacking & Unstacking
Python Pandas - Melting
Python Pandas - Computing Dummy Variables
Python Pandas - Categorical Data
Python Pandas - Categorical Data
Python Pandas - Ordering & Sorting Categorical Data
Python Pandas - Comparing Categorical Data
Python Pandas - Handling Missing Data
Python Pandas - Missing Data
Python Pandas - Filling Missing Data
Python Pandas - Interpolation of Missing Values
Python Pandas - Dropping Missing Data
Python Pandas - Calculations with Missing Data
Python Pandas - Handling Duplicates
Python Pandas - Duplicated Data
Python Pandas - Counting & Retrieving Unique Elements
Python Pandas - Duplicated Labels
Python Pandas - Grouping & Aggregation
Python Pandas - GroupBy
Python Pandas - Time-series Data
Python Pandas - Date Functionality
Python Pandas - Timedelta
Python Pandas - Sparse Data Structures
Python Pandas - Sparse Data
Python Pandas - Visualization
Python Pandas - Visualization
Python Pandas - Additional Concepts
Python Pandas - Caveats & Gotchas

Python Pandas - Basic Functionality

Quiz

Pandas is a powerful data manipulation library in Python, providing essential tools to work with data in both Series and DataFrame formats. These two data structures are crucial for handling and analyzing large datasets.

Understanding the basic functionalities of Pandas, including its attributes and methods, is essential for effectively managing data, these attributes and methods provide valuable insights into your data, making it easier to understand and process. In this tutorial you will learn about the basic attributes and methods in Pandas that are crucial for working with these data structures.

Working with Attributes in Pandas

Attributes in Pandas allow you to access metadata about your Series and DataFrame objects. By using these attributes you can explore and easily understand the data.

Series and DataFrame Attributes

Following are the widely used attribute of the both Series and DataFrame objects −

Sr.No.	Attribute & Description
1	dtype Returns the data type of the elements in the Series or DataFrame.
2	index Provides the index (row labels) of the Series or DataFrame.
3	values Returns the data in the Series or DataFrame as a NumPy array.
4	shape Returns a tuple representing the dimensionality of the DataFrame (rows, columns).
5	ndim Returns the number of dimensions of the object. Series is always 1D, and DataFrame is 2D.
6	size Gives the total number of elements in the object.
7	empty Checks if the object is empty, and returns True if it is.
8	columns Provides the column labels of the DataFrame object.

Example

Let's create a Pandas Series and explore these attributes operation.

import pandas as pd
import numpy as np

# Create a Series with random numbers
s = pd.Series(np.random.randn(4))

# Exploring attributes
print("Data type of Series:", s.dtype)
print("Index of Series:", s.index)
print("Values of Series:", s.values)
print("Shape of Series:", s.shape)
print("Number of dimensions of Series:", s.ndim)
print("Size of Series:", s.size)
print("Is Series empty?:", s.empty)

Its output is as follows −

Data type of Series: float64
Index of Series: RangeIndex(start=0, stop=4, step=1)
Values of Series: [-1.02016329  1.40840089  1.36293022  1.33091391]
Shape of Series: (4,)
Number of dimensions of Series: 1
Size of Series: 4
Is Series empty?: False

Example

Let's look at below example and understand working of these attributes on a DataFrame object.

import pandas as pd
import numpy as np

# Create a DataFrame with random numbers
df = pd.DataFrame(np.random.randn(3, 4), columns=list('ABCD'))
print("DataFrame:")
print(df)

print("Results:")
print("Data types:", df.dtypes)
print("Index:", df.index)
print("Columns:", df.columns)
print("Values:")
print(df.values)
print("Shape:", df.shape)
print("Number of dimensions:", df.ndim)
print("Size:", df.size)
print("Is empty:", df.empty)

On executing the above code you will get the following output −

DataFrame:
          A         B         C         D
0  2.161209 -1.671807 -1.020421 -0.287065
1  0.308136 -0.592368 -0.183193  1.354921
2 -0.963498 -1.768054 -0.395023 -2.454112

Results:
Data types: 
A    float64
B    float64
C    float64
D    float64
dtype: object
Index: RangeIndex(start=0, stop=3, step=1)
Columns: Index(['A', 'B', 'C', 'D'], dtype='object')
Values:
[[ 2.16120893 -1.67180742 -1.02042138 -0.28706468]
 [ 0.30813618 -0.59236786 -0.18319262  1.35492058]
 [-0.96349817 -1.76805364 -0.3950226  -2.45411245]]
Shape: (3, 4)
Number of dimensions: 2
Size: 12
Is empty: False

Exploring Basic Methods in Pandas

Pandas offers several basic methods in both the data structures, that makes it easy to quickly look at and understand your data. These methods help you get a summary and explore the details without much effort.

Series and DataFrame Methods

Sr.No.	Method & Description
1	head(n) Returns the first n rows of the object. The default value of n is 5.
2	tail(n) Returns the last n rows of the object. The default value of n is 5.
3	info() Provides a concise summary of a DataFrame, including the index dtype and column dtypes, non-null values, and memory usage.
4	describe() Generates descriptive statistics of the DataFrame or Series, such as count, mean, std, min, and max.

Example

Let us now create a Series and see the working of the Series basic methods.

import pandas as pd
import numpy as np

# Create a Series with random numbers
s = pd.Series(np.random.randn(10))

print("Series:")
print(s)

# Using basic methods
print("First 5 elements of the Series:\n", s.head())
print("\nLast 3 elements of the Series:\n", s.tail(3))
print("\nDescriptive statistics of the Series:\n", s.describe())

Its output is as follows −

Series:
0   -0.295898
1   -0.786081
2   -1.189834
3   -0.410830
4   -0.997866
5    0.084868
6    0.736541
7    0.133949
8    1.023674
9    0.669520
dtype: float64
First 5 elements of the Series:
 0   -0.295898
1   -0.786081
2   -1.189834
3   -0.410830
4   -0.997866
dtype: float64

Last 3 elements of the Series:
 7    0.133949
8    1.023674
9    0.669520
dtype: float64

Descriptive statistics of the Series:
 count    10.000000
mean     -0.103196
std       0.763254
min      -1.189834
25%      -0.692268
50%      -0.105515
75%       0.535627
max       1.023674
dtype: float64

Example

Now look at below example and understand working of the basic methods on a DataFrame object.

import pandas as pd
import numpy as np

#Create a Dictionary of series
data = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
   'Age':pd.Series([25,26,25,23,30,29,23]), 
   'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
 
#Create a DataFrame
df = pd.DataFrame(data)
print("Our data frame is:\n")
print(df)

# Using basic methods
print("\nFirst 5 rows of the DataFrame:\n", df.head())
print("\nLast 3 rows of the DataFrame:\n", df.tail(3))
print("\nInfo of the DataFrame:")
df.info()
print("\nDescriptive statistics of the DataFrame:\n", df.describe())

On executing the above code you will get the following output −

Our data frame is:

    Name  Age  Rating
0    Tom   25    4.23
1  James   26    3.24
2  Ricky   25    3.98
3    Vin   23    2.56
4  Steve   30    3.20
5  Smith   29    4.60
6   Jack   23    3.80

First 5 rows of the DataFrame:
     Name  Age  Rating
0    Tom   25    4.23
1  James   26    3.24
2  Ricky   25    3.98
3    Vin   23    2.56
4  Steve   30    3.20

Last 3 rows of the DataFrame:
     Name  Age  Rating
4  Steve   30     3.2
5  Smith   29     4.6
6   Jack   23     3.8

Info of the DataFrame:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7 entries, 0 to 6
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   Name    7 non-null      object 
 1   Age     7 non-null      int64  
 2   Rating  7 non-null      float64
dtypes: float64(1), int64(1), object(1)
memory usage: 296.0+ bytes

Descriptive statistics of the DataFrame:
              Age    Rating
count   7.000000  7.000000
mean   25.857143  3.658571
std     2.734262  0.698628
min    23.000000  2.560000
25%    24.000000  3.220000
50%    25.000000  3.800000
75%    27.500000  4.105000
max    30.000000  4.600000

Print Page