In this tutorial, we are going to learn about data analysis and visualization using the **pandas** and **matplotlib** modules in *Python*. Python is an excellent fit for data analysis. Install **pandas** and **matplotlib** using the following commands.

    pip install pandas
    pip install matplotlib

You will get a success message after the installation completes. We will first learn about **pandas** and then look at **matplotlib**.
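One quick way to confirm that both packages can be imported is to print their versions. This is a minimal check that assumes the installation above succeeded; the version numbers printed depend on your environment.

```python
import pandas as pd
import matplotlib

# each package exposes its installed version as a string
pandas_version = pd.__version__
matplotlib_version = matplotlib.__version__

print("pandas:", pandas_version)
print("matplotlib:", matplotlib_version)
```

If either import fails with a ModuleNotFoundError, the corresponding package is not installed in the Python environment you are running.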

Pandas is an open-source Python library that provides data analysis tools. We are going to see some useful **pandas** methods for data analysis.

A **DataFrame** is a two-dimensional table of rows and columns. One way to create it is from a list of rows together with the column names. Let's see how to do it.

    # importing the pandas package
    import pandas as pd

    # creating rows
    hafeez = ['Hafeez', 19]
    aslan = ['Aslan', 21]
    kareem = ['Kareem', 18]

    # pass the rows (plain lists) to the DataFrame
    # passing the column names as well
    data_frame = pd.DataFrame([hafeez, aslan, kareem], columns=['Name', 'Age'])

    # displaying the DataFrame
    print(data_frame)

If you run the above program, you will get the following results.

         Name  Age
    0  Hafeez   19
    1   Aslan   21
    2  Kareem   18
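A DataFrame can also be built column by column from a dictionary. The sketch below uses the same three people as above; the variable names `columns`, `people`, and `oldest_age` are chosen only for this illustration.

```python
import pandas as pd

# same three people as above, organised by column instead of by row
columns = {
    "Name": ["Hafeez", "Aslan", "Kareem"],
    "Age": [19, 21, 18],
}
people = pd.DataFrame(columns)

# select one column (a Series) and compute a simple statistic
oldest_age = people["Age"].max()

print(people)
print("Oldest age:", oldest_age)
```

Both forms produce the same table. The dictionary form is often convenient when the data already arrives as one list per column.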

Go to the link and download the **CSV** file. A **CSV** file stores each record as one row, with the values separated by commas (,). Let's see how to import and use the data using **pandas**.

    # importing pandas package
    import pandas as pd

    # importing the data using pd.read_csv() method
    data = pd.read_csv('CountryData.IND.csv')

    # displaying the first 5 rows using data.head() method
    print(data.head())

If you run the above program, it prints the first five rows of the file.
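To try `read_csv()` and `head()` without downloading anything, the sketch below reads CSV text from an in-memory string. The column names and values here are made up for illustration and are not the contents of the tutorial's file.

```python
import io

import pandas as pd

# hypothetical CSV text standing in for a downloaded file;
# these column names and values are invented for this example
csv_text = """Country,Time period,Value
India,2001,10.5
India,2002,11.0
India,2003,
India,2004,12.4
India,2005,13.1
India,2006,13.9
"""

# read_csv accepts a file path or any file-like object
sample = pd.read_csv(io.StringIO(csv_text))

# head() returns the first 5 rows by default; pass n for a different count
first_rows = sample.head()
first_two = sample.head(2)

print(first_rows)
```

The empty field in the third data row is read as **NaN**, which is how pandas marks a missing value.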

Let's see how many rows and columns there are using the **shape** attribute.

    # importing pandas package
    import pandas as pd

    # importing the data using pd.read_csv() method
    data = pd.read_csv('CountryData.IND.csv')

    # no. of rows and columns
    print(data.shape)

If you run the above program, you will get the following results.

(29, 16)
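Because `shape` is a tuple of (rows, columns), it can be unpacked into two variables. The small DataFrame below is a made-up example, not the tutorial's data.

```python
import pandas as pd

# hypothetical table with 4 rows and 3 columns
frame = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [5, 6, 7, 8], "c": [9, 10, 11, 12]}
)

# shape is an attribute (no parentheses) holding (rows, columns)
n_rows, n_cols = frame.shape

print("rows:", n_rows)
print("columns:", n_cols)
```

The row count is also available as `len(frame)`.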

The **describe()** method computes summary statistics for the numeric columns, ignoring **NaN** (missing) values. Let's try it.

    # importing pandas package
    import pandas as pd

    # importing the data using pd.read_csv() method
    data = pd.read_csv('CountryData.IND.csv')

    # summary statistics of the numeric columns
    print(data.describe())

If you run the above program, it prints the count, mean, standard deviation, minimum, quartiles, and maximum of each numeric column.
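The effect of excluding **NaN** is easiest to see on a tiny example. In the sketch below, the `scores` data is hypothetical and contains one missing value.

```python
import numpy as np
import pandas as pd

# hypothetical scores; one value is missing (NaN)
scores = pd.DataFrame({"score": [10.0, 20.0, np.nan, 30.0]})

summary = scores.describe()
print(summary)

# count and mean are computed over the 3 non-missing values only
non_missing = summary.loc["count", "score"]
mean_score = summary.loc["mean", "score"]
```

Here `non_missing` is 3 even though the column has 4 rows, and `mean_score` is the average of 10, 20, and 30.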

The **matplotlib** package creates graphs from data. Let's see how to create graphs using **matplotlib**, starting with a histogram.

    # importing pandas to read the data
    import pandas as pd
    # importing the pyplot module to create graphs
    import matplotlib.pyplot as plot

    # importing the data using pd.read_csv() method
    data = pd.read_csv('CountryData.IND.csv')

    # creating a histogram of Time period
    data['Time period'].hist(bins=10)

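If you run the above program in a notebook, the histogram is drawn and the returned object is echoed as shown below. In a plain script, call plot.show() after the hist() call to display the figure.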

<matplotlib.axes._subplots.AxesSubplot at 0x25e363ea8d0>

We can create different types of graphs using the **matplotlib** package.
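As an illustration of two other graph types, the sketch below draws a bar chart and a line plot of the small Name/Age table from the beginning of the tutorial. It is only one possible approach: it uses the conventional `plt` alias for pyplot, selects the non-interactive Agg backend so that no window is needed, and writes the image into an in-memory buffer.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

# the small DataFrame from the beginning of the tutorial
people = pd.DataFrame(
    {"Name": ["Hafeez", "Aslan", "Kareem"], "Age": [19, 21, 18]}
)

# two plots side by side: a bar chart and a line plot of the same column
fig, (bar_ax, line_ax) = plt.subplots(1, 2, figsize=(8, 3))
bar_ax.bar(people["Name"], people["Age"])
bar_ax.set_title("Bar chart")
line_ax.plot(people["Name"], people["Age"], marker="o")
line_ax.set_title("Line plot")

# write the figure as PNG into an in-memory buffer
# (pass a file name such as "ages.png" to save it to disk instead)
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
plt.close(fig)
```

In an interactive session, leaving out the `matplotlib.use("Agg")` line and calling `plt.show()` instead of `savefig()` displays the figure on screen.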

If you have any questions about the tutorial, mention them in the comment section.
