Convert JSON to Pandas DataFrame in Python

[This article was first published on PyShark, and kindly contributed to python-bloggers.]

In this article we will discuss how to convert JSON to Pandas DataFrame in Python.

Table of Contents

  • Introduction
  • Create a sample JSON file
  • Convert simple JSON to Pandas DataFrame
  • Convert nested JSON to Pandas DataFrame
  • Conclusion

Introduction

Every data science project begins with accessing the data and reading it correctly. With so many APIs available for querying large volumes of data from a variety of sources, JSON has become a popular format for project data.

When working in Python, you will most likely want that data as a list or a DataFrame.

Let’s see how we can quickly convert JSON to Pandas DataFrame in Python.

To follow this tutorial we will need two Python libraries: json (part of the standard library) and pandas.

Make sure your Pandas version is >= 1.0.3. You can check it by running:

import pandas as pd

print(pd.__version__)

If your version is lower than 1.0.3, update Pandas by running the following command in your Command Prompt (or Terminal):

pip install --upgrade pandas
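As a complement to reading the version number by eye, the short sketch below checks whether the installed Pandas exposes json_normalize at the top level, which is the function used later in this tutorial. It assumes that this function is available as pd.json_normalize from Pandas 1.0.0 onward; the variable name has_json_normalize is only illustrative.

```python
import pandas as pd

# Assumption: pd.json_normalize exists as a top-level function in pandas >= 1.0.0.
has_json_normalize = hasattr(pd, "json_normalize")

print("pandas version:", pd.__version__)
print("pd.json_normalize available:", has_json_normalize)
```

If the second line prints False, upgrading Pandas as described above should resolve it.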

Create a sample JSON file

As the first step we will create two sample JSON files that we will later convert to Pandas DataFrames.

The first file will be a very simple one:

[
    {
        "userId": 1,
        "firstName": "Jake",
        "lastName": "Taylor",
        "phoneNumber": "123456",
        "emailAddress": "[email protected]"
    },
    {
        "userId": 2,
        "firstName": "Brandon",
        "lastName": "Glover",
        "phoneNumber": "123456",
        "emailAddress": "[email protected]"
    }
]

Let’s save it as sample.json in the same location as your Python code.

And the second file will be a nested JSON file:

[
    {
        "userId": 1,
        "firstName": "Jake",
        "lastName": "Taylor",
        "phoneNumber": "123456",
        "emailAddress": "[email protected]",
        "courses": {
            "course1": "mathematics",
            "course2": "physics",
            "course3": "engineering"
        }
    },
    {
        "userId": 2,
        "firstName": "Brandon",
        "lastName": "Glover",
        "phoneNumber": "123456",
        "emailAddress": "[email protected]",
        "courses": {
            "course1": "english",
            "course2": "french",
            "course3": "sociology"
        }
    }
]

Let’s save it as nested_sample.json in the same location as your Python code.
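If you would rather generate both files from Python than create them by hand, a minimal sketch could look like the one below. It writes sample.json and nested_sample.json to the current working directory, which is assumed to be the folder that holds your Python code. The variable names (simple_records, nested_records, courses_by_user) are illustrative, and the email value is copied as it appears in the listings above.

```python
import json
from pathlib import Path

# Records from the first listing (email value kept as shown in the article).
simple_records = [
    {"userId": 1, "firstName": "Jake", "lastName": "Taylor",
     "phoneNumber": "123456", "emailAddress": "[email protected]"},
    {"userId": 2, "firstName": "Brandon", "lastName": "Glover",
     "phoneNumber": "123456", "emailAddress": "[email protected]"},
]

# Courses from the second listing, keyed by userId.
courses_by_user = {
    1: {"course1": "mathematics", "course2": "physics", "course3": "engineering"},
    2: {"course1": "english", "course2": "french", "course3": "sociology"},
}

# The nested records are the simple records plus a "courses" object.
nested_records = [
    {**record, "courses": courses_by_user[record["userId"]]}
    for record in simple_records
]

# Write both files next to the script (assumed to be the working directory).
Path("sample.json").write_text(json.dumps(simple_records, indent=4), encoding="utf-8")
Path("nested_sample.json").write_text(json.dumps(nested_records, indent=4), encoding="utf-8")
```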


Convert simple JSON to Pandas DataFrame in Python

Reading a simple JSON file is straightforward with the pd.read_json() function. It reads JSON from a file path (or a file-like object) and converts it to a Pandas DataFrame:

import pandas as pd

df = pd.read_json("sample.json")

Let’s take a look at the JSON converted to DataFrame:

print(df)

We get the content of the JSON file converted to a DataFrame: each object in the JSON array becomes one row, and each key becomes a column.
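One detail worth checking after loading is the column types. By default pd.read_json() tries to infer them, so a numeric-looking string such as "123456" may come back as a number rather than text. The sketch below inspects the result and, assuming you want phone numbers kept as text, passes a dtype mapping for that column. It uses a trimmed, inlined copy of sample.json (without emailAddress) wrapped in io.StringIO so that it runs on its own.

```python
import io

import pandas as pd

# Trimmed copy of sample.json, inlined so the snippet is self-contained.
sample_json = """
[
    {"userId": 1, "firstName": "Jake", "lastName": "Taylor", "phoneNumber": "123456"},
    {"userId": 2, "firstName": "Brandon", "lastName": "Glover", "phoneNumber": "123456"}
]
"""

# Default behaviour: pandas infers a type for every column.
df = pd.read_json(io.StringIO(sample_json))
print(df.shape)   # (number of rows, number of columns)
print(df.dtypes)  # inspect what was inferred for phoneNumber

# Assumption: phone numbers should stay text, so request str for that column.
df_text = pd.read_json(io.StringIO(sample_json), dtype={"phoneNumber": str})
print(df_text.dtypes)
```

The same dtype argument works when reading from a file path such as "sample.json".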


Convert nested JSON to Pandas DataFrame in Python

Comparing nested_sample.json with sample.json, you can see that the structure is different: we added the courses field, which holds a nested object (course1, course2, course3) rather than a single value.

In this case, to convert it to a Pandas DataFrame we will use the pd.json_normalize() function. Unlike pd.read_json(), it takes already-parsed Python objects (dictionaries or a list of dictionaries) and normalizes semi-structured JSON into a flat table:

import pandas as pd
import json

# Parse the file into Python objects (a list of dictionaries)
with open('nested_sample.json', 'r') as f:
    data = json.load(f)

df = pd.json_normalize(data)

Let’s take a look at the JSON converted to DataFrame:

print(df)

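We get the content of the JSON file converted to a flat DataFrame. The nested courses object is expanded into separate columns named courses.course1, courses.course2 and courses.course3.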
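To make the flattening concrete, the sketch below runs pd.json_normalize() on a trimmed, inlined copy of the nested records and prints the resulting column names. It also shows the sep parameter, which changes the separator used in flattened column names. The last part uses a hypothetical variant of the data, not one of the sample files, in which courses is a list of objects; for that shape, the record_path and meta parameters produce one row per course.

```python
import pandas as pd

# Trimmed copy of nested_sample.json, inlined so the snippet is self-contained.
data = [
    {"userId": 1, "firstName": "Jake",
     "courses": {"course1": "mathematics", "course2": "physics", "course3": "engineering"}},
    {"userId": 2, "firstName": "Brandon",
     "courses": {"course1": "english", "course2": "french", "course3": "sociology"}},
]

# Default: nested keys are joined to the parent key with a dot.
flat = pd.json_normalize(data)
print(list(flat.columns))

# Same data, but with an underscore as the separator.
flat_underscore = pd.json_normalize(data, sep="_")
print(list(flat_underscore.columns))

# Hypothetical variant (not in the sample files): courses as a list of objects.
list_data = [
    {"userId": 1, "firstName": "Jake",
     "courses": [{"name": "mathematics"}, {"name": "physics"}]},
    {"userId": 2, "firstName": "Brandon",
     "courses": [{"name": "english"}]},
]

# One row per course, with userId and firstName repeated on each row.
long_df = pd.json_normalize(list_data, record_path="courses", meta=["userId", "firstName"])
print(long_df)
```

Which shape is preferable depends on the analysis: the wide form keeps one row per user, while the long form is usually easier to filter and group by course.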


Conclusion

In this article we discussed how to convert JSON to a Pandas DataFrame in Python using the json and pandas libraries: pd.read_json() for simple files and pd.json_normalize() for nested ones.

Feel free to leave comments below if you have any questions or have suggestions for some edits and check out more of my Python Programming articles.
