Python - Processing JSON Data



A JSON file stores data as human-readable text. JSON stands for JavaScript Object Notation. Pandas can read JSON files with the read_json function.

Input Data

Create a JSON file by copying the data below into a text editor such as Notepad. Save the file with a .json extension, choosing All Files (*.*) as the file type.

{
   "ID":["1","2","3","4","5","6","7","8"],
   "Name":["Rick","Dan","Tusar","Ryan","Gary","Rasmi","Pranab","Guru"],
   "Salary":["623.3","515.2","611","729","843.25","578","632.8","722.5"],
   "StartDate":["1/1/2012","9/23/2013","11/15/2014","5/11/2014","3/27/2015","5/21/2013",
      "7/30/2013","6/17/2014"],
   "Dept":["IT","Operations","IT","HR","Finance","IT","Operations","Finance"]
}
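
If a text editor is not convenient, the same file can be written from Python. The following is a minimal sketch using the standard json module. The temporary directory and the variable names (records, json_path) are assumptions chosen for illustration and are not part of the steps above.

```python
import json
import os
import tempfile

# Sample data for this chapter, keyed by column name.
records = {
    "ID": ["1", "2", "3", "4", "5", "6", "7", "8"],
    "Name": ["Rick", "Dan", "Tusar", "Ryan", "Gary", "Rasmi", "Pranab", "Guru"],
    "Salary": ["623.3", "515.2", "611", "729", "843.25", "578", "632.8", "722.5"],
    "StartDate": ["1/1/2012", "9/23/2013", "11/15/2014", "5/11/2014",
                  "3/27/2015", "5/21/2013", "7/30/2013", "6/17/2014"],
    "Dept": ["IT", "Operations", "IT", "HR", "Finance", "IT", "Operations", "Finance"],
}

# Hypothetical location: a fresh temporary directory (any writable path works).
out_dir = tempfile.mkdtemp()
json_path = os.path.join(out_dir, "input.json")

# json.dump always emits valid JSON, so no commas can go missing.
with open(json_path, "w") as fh:
    json.dump(records, fh, indent=3)

# Read the file back to confirm that it parses.
with open(json_path) as fh:
    loaded = json.load(fh)
print(sorted(loaded))
```

Writing the file through json.dump avoids the hand-editing mistakes (such as a missing comma between keys) that make read_json fail.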

Read the JSON File

The read_json function of the pandas library can be used to read the JSON file into a pandas DataFrame.

import pandas as pd

# Load the JSON file into a DataFrame
data = pd.read_json('path/input.json')
print(data)

When we execute the above code, it produces the following result.

         Dept  ID    Name  Salary   StartDate
0          IT   1    Rick  623.30    1/1/2012
1  Operations   2     Dan  515.20   9/23/2013
2          IT   3   Tusar  611.00  11/15/2014
3          HR   4    Ryan  729.00   5/11/2014
4     Finance   5    Gary  843.25   3/27/2015
5          IT   6   Rasmi  578.00   5/21/2013
6  Operations   7  Pranab  632.80   7/30/2013
7     Finance   8    Guru  722.50   6/17/2014
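
To try read_json without creating a file first, the JSON text can be passed through an in-memory buffer. This is a sketch only: json_text is a shortened, hypothetical copy of the sample data, and the column order and inferred types that pandas prints may differ between pandas versions.

```python
import io

import pandas as pd

# Shortened, illustrative copy of the sample data (not the full file).
json_text = """
{
   "ID": ["1", "2", "3"],
   "Name": ["Rick", "Dan", "Tusar"],
   "Salary": ["623.3", "515.2", "611"],
   "StartDate": ["1/1/2012", "9/23/2013", "11/15/2014"],
   "Dept": ["IT", "Operations", "IT"]
}
"""

# read_json accepts a file-like object as well as a path.
data = pd.read_json(io.StringIO(json_text))

print(data.shape)    # (number of rows, number of columns)
print(data.dtypes)   # the type pandas inferred for each column
```

Checking data.dtypes after loading is a quick way to see whether values stored as quoted strings in the file were converted to numbers.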

Reading Specific Columns and Rows

As with the CSV file in the previous chapter, once the JSON file has been read into a DataFrame we can select specific rows and columns from it. We use the multi-axes indexer .loc[] for this purpose. Here we display the Salary and Name columns for some of the rows.

import pandas as pd
data = pd.read_json('path/input.json')

# Use the multi-axes indexer: row labels first, then column names
print(data.loc[[1,3,5],['Salary','Name']])

When we execute the above code, it produces the following result.

   Salary   Name
1   515.2    Dan
3   729.0   Ryan
5   578.0  Rasmi
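
The row selector passed to .loc does not have to be a list of labels. The sketch below builds the DataFrame directly in memory (an assumption made so the example runs without the file) and shows both label-based selection and a boolean mask. The names subset and it_staff are illustrative.

```python
import pandas as pd

# In-memory copy of three of the sample columns, with numeric salaries.
data = pd.DataFrame({
    "Name": ["Rick", "Dan", "Tusar", "Ryan", "Gary", "Rasmi", "Pranab", "Guru"],
    "Salary": [623.3, 515.2, 611.0, 729.0, 843.25, 578.0, 632.8, 722.5],
    "Dept": ["IT", "Operations", "IT", "HR", "Finance", "IT", "Operations", "Finance"],
})

# Rows chosen by index label, columns chosen by name.
subset = data.loc[[1, 3, 5], ["Salary", "Name"]]

# A boolean mask as the row selector: only rows whose Dept is "IT".
it_staff = data.loc[data["Dept"] == "IT", ["Name", "Salary"]]

print(subset)
print(it_staff)
```

Column names are case-sensitive, so 'Salary' and 'salary' refer to different columns and the wrong spelling raises a KeyError.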

Reading JSON file as Records

We can also call the to_json function with the parameters orient='records' and lines=True to write the DataFrame content out as individual records, one JSON object per line.

import pandas as pd
data = pd.read_json('path/input.json')

print(data.to_json(orient='records', lines=True))

When we execute the above code, it produces the following result.

{"Dept":"IT","ID":1,"Name":"Rick","Salary":623.3,"StartDate":"1\/1\/2012"}
{"Dept":"Operations","ID":2,"Name":"Dan","Salary":515.2,"StartDate":"9\/23\/2013"}
{"Dept":"IT","ID":3,"Name":"Tusar","Salary":611.0,"StartDate":"11\/15\/2014"}
{"Dept":"HR","ID":4,"Name":"Ryan","Salary":729.0,"StartDate":"5\/11\/2014"}
{"Dept":"Finance","ID":5,"Name":"Gary","Salary":843.25,"StartDate":"3\/27\/2015"}
{"Dept":"IT","ID":6,"Name":"Rasmi","Salary":578.0,"StartDate":"5\/21\/2013"}
{"Dept":"Operations","ID":7,"Name":"Pranab","Salary":632.8,"StartDate":"7\/30\/2013"}
{"Dept":"Finance","ID":8,"Name":"Guru","Salary":722.5,"StartDate":"6\/17\/2014"}
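
Output in this one-record-per-line form can be read back with read_json by passing lines=True. The following round-trip sketch uses a small in-memory DataFrame rather than the file. The three rows and the names lines_text and restored are assumptions for illustration.

```python
import io

import pandas as pd

# Small illustrative frame holding the first three sample records.
data = pd.DataFrame({
    "ID": [1, 2, 3],
    "Name": ["Rick", "Dan", "Tusar"],
    "Salary": [623.3, 515.2, 611.0],
})

# Serialize: one JSON object per row, one row per line.
lines_text = data.to_json(orient="records", lines=True)

# Parse the line-delimited text back into a DataFrame.
restored = pd.read_json(io.StringIO(lines_text), lines=True)

print(lines_text)
print(restored)
```

The line-delimited layout is convenient for large files because each line is a complete JSON document that can be processed on its own.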
