Analyze and Visualize Earthquake Data in Python with Matplotlib


Analyzing earthquake data with Python's Matplotlib library can provide valuable insights into the frequency, magnitude, and location of earthquakes, which can help in understanding and mitigating their impacts. In this article, we will explore how to analyze and visualize earthquake data using Python and Matplotlib, a popular data visualization library. We will show step by step how to load earthquake data into Python, inspect it, and create visualizations to better understand the patterns and trends in the data.

Introduction

People absorb a visual representation of data more readily than a verbal one. When something is pictured for us, we have an easier time understanding it. A dataset is a raw collection of information about a particular topic. In this article, we work with an earthquake dataset stored as a CSV file. We analyze and visualize it to find trends and patterns, such as how the intensity of earthquakes varied in previous years, which can then inform expectations about future earthquakes.

For visualizing and analyzing the dataset we use a Python library called Matplotlib. We will discuss in detail what matplotlib is and how it is used to analyze and visualize the dataset.

Data Visualization

Graphics are a good tool for examining data, which is crucial for presenting findings. The term data visualization is relatively new. It covers more than simply showing facts in graphical rather than textual form.

Visualization is particularly useful for exploring and understanding a dataset, and for spotting trends, faulty data, outliers, and more. With some subject expertise, plots and charts can be used to communicate and illustrate the essential relationships in the data.

Matplotlib

Matplotlib is a Python package for data visualization that is built on NumPy arrays. It can be used in graphical user interfaces, shell scripts, web applications, and more.
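As a minimal sketch of how Matplotlib works with NumPy arrays, the example below plots a synthetic sine curve. The data is made up for illustration, the aliases `npp` and `pltt` follow the ones used later in this article, and the non-interactive `Agg` backend is an assumption chosen so the code runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as pltt
import numpy as npp

# Synthetic data: 50 evenly spaced points and their sine values
x = npp.linspace(0, 2 * npp.pi, 50)
y = npp.sin(x)

fig, ax = pltt.subplots(figsize=(6, 3))
line, = ax.plot(x, y, label="sin(x)")  # NumPy arrays are passed in directly
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
ax.legend()

# Render the figure to an in-memory PNG instead of opening a window
buf = io.BytesIO()
fig.savefig(buf, format="png")
pltt.close(fig)
```

In an interactive session, `pltt.show()` would normally replace the `savefig` call.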

John D. Hunter began developing Matplotlib in 2002, and the first version was released in 2003. It is distributed under a BSD-style license and has an active community of developers. Version 3.1.1 was released on July 1, 2019, and newer releases have followed since then.

Matplotlib 1.2 was the first release to support Python 3. Version 1.4 is the last release that is compatible with Python 2.6.

Dataset

The dataset used in this article is the earthquakes CSV file from the CORGIS dataset project.

We will be analyzing and preparing the dataset in the upcoming code.

Analyze and visualize earthquake data with matplotlib

Now we will see how to analyze and visualize earthquake data using Python and matplotlib.

Importing libraries and dataset

We will import all the required libraries first.

Pandas − Loads the dataset into a DataFrame, a two-dimensional labeled table, and provides tools for analyzing it.

NumPy − Provides fast numerical arrays. Pandas and Matplotlib are built on top of it.

Seaborn/Matplotlib − Both of these are used for visualizing the data.

import pandas as pdd
import numpy as npp
import matplotlib.pyplot as pltt
import seaborn as sbb

Now let’s load the dataset into a Pandas DataFrame so that it is easy to work with.

df1 = pdd.read_csv('C:/Users/Tutorialspoint/Downloads/earthquakes.csv')
df1.shape

Output

(8394, 19)

Next, we inspect the columns to see how many null values each one contains and what data type it holds.

df1.info()

Output

To get an idea of how the data is distributed, we will look at some summary statistics of the dataset.

df1.describe()

Output

From the summary statistics above, we can see that the largest magnitude in the dataset is 7.7 and the greatest depth is 622.
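The same checks can also be done directly rather than read off a printed table. The sketch below is only illustrative: `sample` is a tiny made-up stand-in for `df1` that reuses two of the article's column names, and its values are invented for the example.

```python
import numpy as npp
import pandas as pdd

# Hypothetical stand-in for df1; the values are invented for illustration
sample = pdd.DataFrame({
    "impact.magnitude": [1.2, 4.5, 7.7, npp.nan, 2.3],
    "location.depth": [10.0, 622.0, 35.5, 5.0, npp.nan],
})

# Number of missing values in each column
null_counts = sample.isna().sum()

# Column maxima; missing values are skipped by default
max_magnitude = sample["impact.magnitude"].max()
max_depth = sample["location.depth"].max()

print(null_counts)
print(max_magnitude, max_depth)
```

Applied to the real data, the same calls would be `df1.isna().sum()` and `df1['impact.magnitude'].max()`.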

Exploratory data analysis

Exploratory data analysis uses graphs and diagrams to explore the trends and patterns in the data.

pltt.figure(figsize=(10, 5))
# Select the depth column before averaging so that non-numeric
# columns are not passed to mean()
x1 = df1.groupby('time.year')['location.depth'].mean()
x1.plot.bar()
pltt.show()

Output

The bar plot shows the mean earthquake depth for each year, not the number of earthquakes. According to the plot, the mean depth rises after 2016, declines gradually for three years after 2017, rises again, and then drops after 2021.
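Because the bar plot is built from a grouped mean of depth, it helps to separate how many earthquakes occur per year from how deep they are on average. The sketch below computes both in one step. The `sample` frame is a small invented stand-in for `df1` that reuses the article's column names.

```python
import pandas as pdd

# Hypothetical stand-in for df1; the values are invented for illustration
sample = pdd.DataFrame({
    "time.year": [2016, 2016, 2016, 2017, 2017, 2018],
    "location.depth": [10.0, 20.0, 30.0, 5.0, 15.0, 40.0],
})

# One row per year: number of earthquakes and their mean depth
per_year = sample.groupby("time.year")["location.depth"].agg(["count", "mean"])
print(per_year)
```

Plotting `per_year['count']` instead of `per_year['mean']` would show how frequent earthquakes are in each year.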

# Plot the magnitude of every record in dataset order
fig1, ax1 = pltt.subplots(figsize=(12, 5))
ax1.plot(df1['impact.magnitude'])
pltt.show()

Output

pltt.figure(figsize=(10, 5))
sbb.lineplot(data=df1,
   x='time.month',
   y='impact.magnitude')
pltt.show()

Output

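The line plot shows the mean magnitude for each month, because Seaborn averages the records that share the same x value and draws a confidence band around the mean. In this plot, the mean magnitude declines from one month to the next.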
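A trend read from a line plot can be checked numerically by computing the monthly means and testing whether they never increase. This is only a sketch: `sample` is an invented stand-in for `df1` with the article's column names, so its result says nothing about the real dataset.

```python
import pandas as pdd

# Hypothetical stand-in for df1; the values are invented for illustration
sample = pdd.DataFrame({
    "time.month": [1, 1, 2, 2, 3],
    "impact.magnitude": [3.0, 2.0, 2.0, 1.0, 1.0],
})

# Mean magnitude per month, ordered by month
monthly_mean = sample.groupby("time.month")["impact.magnitude"].mean()

# True when each monthly mean is less than or equal to the previous one
declines_every_month = monthly_mean.is_monotonic_decreasing

print(monthly_mean)
print(declines_every_month)
```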

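pltt.figure(figsize=(15, 5))

# Left panel: histogram with a kernel density estimate
# (histplot replaces the deprecated distplot)
pltt.subplot(1, 2, 1)
sbb.histplot(df1['location.depth'], kde=True, stat='density')

# Right panel: box plot of the same column
pltt.subplot(1, 2, 2)
sbb.boxplot(x=df1['location.depth'])
pltt.show()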

Output

The distribution plot shows several outliers, which the box plot confirms. More importantly, the depth distribution is right-skewed: most earthquakes originate at shallow depths, and a long tail extends toward greater depths.
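The direction of the skew and the outliers seen in the plots can be backed up with numbers. The sketch below uses an invented depth series in place of `df1['location.depth']`. It computes the sample skewness, where a positive value means a longer right tail and a negative value a longer left tail, and flags outliers with the 1.5 × IQR rule that box plots conventionally use for their whiskers.

```python
import pandas as pdd

# Hypothetical depth values, invented for illustration
depth = pdd.Series([5, 8, 10, 10, 12, 15, 20, 35, 300, 600], dtype=float)

# Sample skewness: > 0 means a longer right tail, < 0 a longer left tail
skewness = depth.skew()

# Flag points that lie more than 1.5 * IQR outside the quartiles
q1, q3 = depth.quantile(0.25), depth.quantile(0.75)
iqr = q3 - q1
outliers = depth[(depth < q1 - 1.5 * iqr) | (depth > q3 + 1.5 * iqr)]

print(skewness)
print(outliers.tolist())
```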

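pltt.figure(figsize=(15, 5))

# Left panel: histogram with a kernel density estimate
# (histplot replaces the deprecated distplot)
pltt.subplot(1, 2, 1)
sbb.histplot(df1['impact.magnitude'], kde=True, stat='density')

# Right panel: box plot of the same column
pltt.subplot(1, 2, 2)
sbb.boxplot(x=df1['impact.magnitude'])

pltt.show()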

Output

pltt.figure(figsize=(20, 10))
# Longitude on the x-axis and latitude on the y-axis, as on a conventional map
sbb.scatterplot(data=df1,
   x='location.longitude',
   y='location.latitude',
   hue='impact.magnitude')
pltt.show()

Output

Now we will plot the same locations on a map of the USA using Plotly Express −

import plotly.express as pxx
 
fig_w = pxx.scatter_geo(df1, lat='location.latitude',
   lon='location.longitude',
   color="impact.magnitude",
   scope='usa')
fig_w.show()

Output

From the above, we can see which areas are more prone to earthquakes in the USA.
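The visual impression of which areas are more earthquake-prone can be roughly quantified by counting events per grid cell. The sketch below is one possible approach, not part of the original analysis. It rounds coordinates to the nearest whole degree, which is an arbitrary cell size, and uses an invented `sample` frame in place of `df1`.

```python
import pandas as pdd

# Hypothetical stand-in for df1; the coordinates are invented for illustration
sample = pdd.DataFrame({
    "location.latitude": [61.2, 61.4, 36.6, 36.7, 36.8, 19.4],
    "location.longitude": [-149.9, -150.1, -121.2, -121.3, -120.9, -155.3],
})

# Assign every event to a 1-degree by 1-degree cell
cells = sample.round(0).astype(int)

# Count events per cell, most active cell first
counts = (
    cells.groupby(["location.latitude", "location.longitude"])
    .size()
    .sort_values(ascending=False)
)
print(counts)
```

A finer or coarser grid changes the counts, so the cell size should be chosen to suit the region being studied.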

Conclusion

In this article, we learned about Python's Matplotlib library and how to use it. We also analyzed and visualized the earthquake dataset using Matplotlib, together with Seaborn and Plotly.

Updated on: 01-Jun-2023
