How to read a JSON file into a DataFrame using the Python pandas library?


JSON stands for JavaScript Object Notation. It is a human-readable text format that stores data as key/value pairs, and it is often used to exchange data on the web. A JSON object is enclosed in curly brackets ({}), and its key/value pairs are separated by commas.
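As a minimal sketch of what such key/value pairs look like, the snippet below parses a small JSON object with Python's standard json module. The field names and values are made up for illustration and are not taken from a real file.

```python
import json

# A small JSON document: one object with three key/value pairs.
# The keys and values here are hypothetical, chosen only for illustration.
json_text = '{"species": "setosa", "sepalLength": 5.1, "tags": ["iris", "flower"]}'

# json.loads parses JSON text into Python objects (dict, float, list, ...).
record = json.loads(json_text)

print(record["species"])
print(type(record))
```

The JSON text is a plain string until it is parsed; only after json.loads does it become a Python dictionary.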

JSON data looks very similar to a Python dictionary, but JSON is a data format whereas a dictionary is a data structure. To read JSON files into a pandas DataFrame, the pandas library provides the read_json function. The examples below give an overview of how to read JSON files into a pandas DataFrame.
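Before working with files, it may help to see read_json on a tiny in-memory document. This is only a sketch: the two records are invented, and the text is wrapped in io.StringIO because recent pandas versions deprecate passing a raw JSON string directly.

```python
from io import StringIO

import pandas as pd

# Two hypothetical records; each JSON object becomes one DataFrame row.
json_text = (
    '[{"species": "setosa", "petalWidth": 0.2},'
    ' {"species": "virginica", "petalWidth": 1.8}]'
)

# StringIO makes the string behave like an open file for read_json.
df = pd.read_json(StringIO(json_text))

print(df)
print(df.shape)
```

Each key in the JSON objects becomes a column, and each object becomes a row.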

Example

Reading a local JSON file into a pandas DataFrame

# importing the pandas package
import pandas as pd

# reading the JSON file (raw string, so the backslash is not treated as an escape)
df = pd.read_json(r'E:\iris.json')

# displaying 5 randomly sampled rows
print(df.sample(5))

Explanation

In the code above, we first import the pandas package as pd. We then read the local JSON file into the df variable with pd.read_json, passing the file location as a string; the function converts the JSON data into a DataFrame. In the last line, we display 5 randomly sampled rows of the DataFrame.

Output

     sepalLength  sepalWidth  petalLength  petalWidth     species
149          5.9         3.0          5.1         1.8   virginica
90           5.5         2.6          4.4         1.2  versicolor
56           6.3         3.3          4.7         1.6  versicolor
38           4.4         3.0          1.3         0.2      setosa
85           6.0         3.4          4.5         1.6  versicolor

The Iris JSON file is passed to read_json as input; this data set has 5 columns and 150 rows. The output shows only 5 of those rows, selected with the df.sample() method. This method returns rows chosen at random from the DataFrame, so the rows shown differ from run to run.
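The same steps can be tried without the Iris file. The sketch below writes a few made-up rows to a temporary JSON file (the file name iris_subset.json and the values are assumptions for illustration), reads it back with read_json, and passes random_state to sample so that the random selection is repeatable.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical rows standing in for a small part of a data set.
rows = [
    {"sepalLength": 5.9, "species": "virginica"},
    {"sepalLength": 5.5, "species": "versicolor"},
    {"sepalLength": 4.4, "species": "setosa"},
    {"sepalLength": 6.0, "species": "versicolor"},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write the rows to a JSON file, then read that file into a DataFrame.
    json_path = Path(tmp_dir) / "iris_subset.json"
    json_path.write_text(json.dumps(rows))
    df = pd.read_json(json_path)

# shape gives (number of rows, number of columns).
print(df.shape)

# With a fixed random_state, sample returns the same rows on every call.
sample_a = df.sample(2, random_state=0)
sample_b = df.sample(2, random_state=0)
print(sample_a)
```

Without random_state, each call to sample may return different rows.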

In the same way, we can read remote JSON data by passing the remote URL in place of the file path.

Example

import pandas as pd

data = pd.read_json(
'http://universities.hipolabs.com/search?country=United+Kingdom')
print(data)

Explanation

In this example, we use a public HTTP API that serves data in JSON format. The read_json function reads the JSON data from this remote URL just as it reads a local file.

Output


   domains                       web_pages             name                                           alpha_two_code  state-province  country
0  [abdn.ac.UK, Aberdeen.ac.uk]  [www.abdn.ac.uk/]     University of Aberdeen                         GB              NaN             United Kingdom
1  [aber.ac.uk]                  [www.aber.ac.uk/]     University of Wales, Aberystwyth               GB              NaN             United Kingdom
2  [abertay.ac.uk]               [www.abertay.ac.uk/]  University of Abertay Dundee                   GB              NaN             United Kingdom
3  [aiuniv.edu]                  [www.aiuniv.edu/]     American InterContinental University - London  GB              NaN             United Kingdom
4  [aku.edu]                     [www.aku.edu/]        Aga Khan University                            GB              NaN             United Kingdom

The output above shows the first 5 rows of the DataFrame returned by the read_json method. The data comes from a public URL and has 171 rows and 6 columns.
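The domains and web_pages columns in this output hold bracketed lists because the corresponding JSON values are arrays. The offline sketch below imitates that structure with two invented records (the names and domains are placeholders, not data from the API) to show that read_json keeps each array as a Python list inside the cell.

```python
from io import StringIO

import pandas as pd

# Hypothetical records shaped like the API response: "domains" is a JSON array.
json_text = """
[
  {"name": "Example University A", "domains": ["a.example", "uni-a.example"], "country": "United Kingdom"},
  {"name": "Example University B", "domains": ["b.example"], "country": "United Kingdom"}
]
"""

data = pd.read_json(StringIO(json_text))

# Each cell in the "domains" column is an ordinary Python list.
first_domains = data.loc[0, "domains"]
print(type(first_domains))

# head(n) returns the first n rows, which is handy for large results.
print(data.head(1))
```

For a large result such as the one above, data.head() is a convenient way to print only the first few rows.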

Updated on: 18-Nov-2021
