Handling deeply nested JSON with Python
Lee Hawthorn June 01, 2021 #Python
I first came across JSON back in 2015. As I predicted back then, it's taken off in a huge way due to the growth in Node-based back ends and in JavaScript.
I do miss the structure of SQL databases.
To process JSON with Python we can use a list or dict comprehension; however, custom code is typically needed for each different JSON structure.
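As a rough illustration of why comprehensions end up being custom code, here is a minimal sketch. The `response` dict is a hypothetical payload shaped loosely like a distance-style API response; its addresses and numbers are placeholders, not real API output. Note how each comprehension has to spell out this payload's exact nesting.

```python
# Hypothetical response: placeholder values, not real API output.
response = {
    "origin_addresses": ["Origin A"],
    "destination_addresses": ["Destination X", "Destination Y"],
    "rows": [
        {
            "elements": [
                {"distance": {"text": "1.2 km", "value": 1200}, "status": "OK"},
                {"distance": {"text": "3.4 km", "value": 3400}, "status": "OK"},
            ]
        }
    ],
}

# List comprehension: hard-codes the path rows -> elements -> distance -> value.
distances = [
    element["distance"]["value"]
    for row in response["rows"]
    for element in row["elements"]
    if element["status"] == "OK"
]

# Dict comprehension: pairs each destination with its distance.
# This only works here because the sample has a single origin row.
by_destination = {
    destination: element["distance"]["value"]
    for destination, element in zip(
        response["destination_addresses"], response["rows"][0]["elements"]
    )
}

print(distances)
print(by_destination)
```

A payload with a different shape would need both comprehensions rewritten, which is the drawback that flattening tries to avoid.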
Another option is to flatten the JSON into a key-value dictionary.
I use a flatten function to do this, thanks to a Stack Overflow post.
With the function below I've been able to process very deeply nested JSON.
Look at the JSON below - taken from the Google Maps Distance API.
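Here's the recursive function I use for different JSON sources.
    from collections.abc import MutableMapping
    def flatten(dictionary, parent_key="", separator=".", log=False):
        """
        Turn a nested dictionary into a flattened dictionary
        :param dictionary: The dictionary to flatten
        :param parent_key: The string to prepend to dictionary's keys
        :param separator: The string used to separate flattened keys
        :param log : Bool used to control logging to the terminal
        :return: A flattened dictionary
        """
        items = []  # body is a reconstructed sketch; defaults and list handling are assumptions
        for key, value in dictionary.items():
            new_key = str(parent_key) + separator + key if parent_key else key
            if log: print("checking:", new_key)
            if isinstance(value, MutableMapping) and value:
                items.extend(flatten(value, new_key, separator, log).items())
            elif isinstance(value, list) and value:
                for index, item in enumerate(value):
                    items.extend(flatten({str(index): item}, new_key, separator, log).items())
            else:
                items.append((new_key, value))
        return dict(items)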
Calling this function transforms the JSON into a flat dict:
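To make the shape of the result concrete, here is a small self-contained sketch. `flatten_sketch` is a compact stand-in written for this example (not necessarily the exact function used in this post), and the `nested` sample is hypothetical placeholder data. It assumes list positions are turned into key segments and that "." is the separator.

```python
from collections.abc import MutableMapping


def flatten_sketch(data, parent_key="", separator="."):
    """Compact stand-in: collapse nested dicts and lists into one flat dict."""
    flat = {}
    # Treat list positions as keys so dicts and lists are walked the same way.
    pairs = data.items() if isinstance(data, MutableMapping) else enumerate(data)
    for key, value in pairs:
        new_key = parent_key + separator + str(key) if parent_key else str(key)
        if isinstance(value, (MutableMapping, list)) and value:
            # Non-empty container: recurse and merge the flattened children.
            flat.update(flatten_sketch(value, new_key, separator))
        else:
            # Scalar (or empty container): store it under the joined key.
            flat[new_key] = value
    return flat


# Hypothetical nested sample with placeholder values.
nested = {
    "status": "OK",
    "rows": [
        {"elements": [{"distance": {"text": "1.2 km", "value": 1200}, "status": "OK"}]}
    ],
}

flat = flatten_sketch(nested)
for key, value in flat.items():
    print(key, "=", value)
# status = OK
# rows.0.elements.0.distance.text = 1.2 km
# rows.0.elements.0.distance.value = 1200
# rows.0.elements.0.status = OK
```

Every leaf value ends up under a single key that records its full path, so later filtering becomes a matter of matching on key names.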
Having done that, it's simple to make any further transforms you need using a pandas dataframe.
Here's some sample code that extracts the distance values. Python 3.3 or later is required to run the code below.
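    import json
    import pandas as pd
    # Assumption: response_text holds the raw JSON string returned by the API
    data = json.loads(response_text)
    # Flatten json to dict
    flat = flatten(data)
    # Load to dataframe
    df = pd.DataFrame(list(flat.items()), columns=["key", "value"])
    # Filter accordingly (assumed filter: keep rows whose key mentions "distance")
    distances = df[df["key"].str.contains("distance")]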