I suppose the same could be said of extracting values from nested JSON structures. Even the most skilled programmer can be brought to tears when working with a JSON object that consists of a mix of deeply nested data structures. The process of extracting the values can feel messy and disorganized at best. The more data there is, the bigger the mess.

In this tutorial, I’ll walk you through a step-by-step method to extract the values you need from any JSON. A word of warning: this tutorial is not meant for newbies to JSON, lists or dictionaries. If you’ve never heard of a list index or a dictionary key-value pair, I would suggest reviewing one of the many great tutorials available on the web or YouTube. Once you feel more comfortable with the subject, come back to continue learning and growing.

Housekeeping

JSON vs. Lists vs. Dictionaries

First things first: when it comes to the terms “JSON”, “list” and “dictionary”, we have to do some important housekeeping. JSON, or JavaScript Object Notation, is a broader format that encompasses dictionary and list structures, as shown in the image below.

JSON: List and Dictionary Structure, Image by Author.

The technical documentation says JSON is built on two structures: a collection of key-value pairs and an ordered list of values. In Python, a collection of key-value pairs is a dictionary object and an ordered list is a list object.

In practice, the extraction of nested data starts with either a dictionary or a list data structure. When extracting nested data, the questions to ask are:

Is the data nested in a dictionary or a list data structure?
What is the combination of data structures used?
Is the first data structure used a dictionary or a list?
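To make those questions concrete, here is a minimal sketch using Python's standard `json` module. The JSON strings and variable names are invented for illustration and are not taken from the tutorial's dataset.

```python
import json

# Hypothetical JSON documents, invented for illustration.
object_text = '{"name": "Ada", "languages": ["Python", "SQL"]}'
array_text = '[{"name": "Ada"}, {"name": "Grace"}]'

# json.loads turns a JSON object into a dict and a JSON array into a list.
as_dict = json.loads(object_text)
as_list = json.loads(array_text)

# Ask the questions above: what is the outer structure, and what is nested inside it?
print(type(as_dict).__name__)               # outer structure of the first document
print(type(as_dict["languages"]).__name__)  # structure nested under the "languages" key
print(type(as_list).__name__)               # outer structure of the second document
print(type(as_list[0]).__name__)            # structure nested at index 0
```

Checking `type()` at each level is a simple way to see which structure you are holding before deciding how to extract from it.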
If it seems like I’m making a big deal about the terminology, it is because I am. When it comes to extracting nested data, the details matter. Data structures change the deeper the data is nested in the JSON structure, and knowing those distinctions is important. The initial data structure may be a list but then change to a dictionary as the data is extracted.

The key to extracting data from a JSON object is recognizing the mix of data structures used to store the data. If you struggle to recognize the data structure in a JSON object, it’s likely that you’ll struggle to extract the values you want. In most cases, this results in applying the wrong extraction technique. The table below is a brief refresher on the techniques used to extract data from a JSON structure.

Data Types and Extraction Methods, Image by Author.

One final note before starting our example. In Python, the term “data structure” is rarely used when describing lists and dictionaries. The commonly used term is “data type”. I use the terms data type and data structure interchangeably throughout this tutorial. I use the term data structure because it conveys the idea that the data structures are the fundamental building blocks of the JSON object. The term data type is no less important in Python; however, it does not convey the same meaning as a key to understanding nested data extraction.

Real World Data

Let's Get Started

One of the best ways to learn is by working through real data with a mix of list and dictionary data structures. In this tutorial, we’ll use real data from the […]. This API returns about 250 records with a mix of dictionaries, lists and other data types. Our objective is to extract the […]

The Sample Code

Clicking this link will allow you to access the sample code in the following examples. The link will take you to a course I developed on learning to extract nested JSON data. The course has helped hundreds of students learn to extract nested data.
You don’t have to purchase the course to obtain the files. The filenames are single_json.py and multiple_json.py.

Extracting Single Items

In this example, we’ll start by extracting data using a combination of list and dictionary extraction techniques as shown in the preceding table. In the Python code below, I start by providing the logic to import the data, then the solution, and then the workflow to derive the solution. I recommend following all the steps as shown below. The workflow steps are explained below the Python code.

Python Code:

Workflow Steps:
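As a minimal sketch of this kind of workflow, the example below checks the data type at each level and then applies the matching extraction method. The `data` record and its field names are hypothetical stand-ins, not the API data or the contents of single_json.py.

```python
# Hypothetical data shaped like a parsed JSON response: a list of dictionaries.
data = [
    {
        "id": 1,
        "name": {"first": "Ada", "last": "Lovelace"},
        "emails": ["ada@example.com", "al@example.org"],
    }
]

# Step 1: determine the data type. Step 2: apply the extraction method. Repeat.
assert isinstance(data, list)    # a list, so extract with an index
record = data[0]

assert isinstance(record, dict)  # a dictionary, so extract with a key
name = record["name"]

assert isinstance(name, dict)    # still a dictionary, so use another key
first_name = name["first"]

assert isinstance(first_name, str)  # a plain value: this is what we wanted

# Once the path is known, the same steps collapse into one expression.
second_email = data[0]["emails"][1]

print(first_name)
print(second_email)
```

Working one level at a time makes it obvious where an index is needed and where a key is needed; the chained expression at the end is just those same steps written together.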
So, let’s summarize the two steps that are repeated until we arrive at the value we want to extract. The first step is to determine the data type and the second step is to apply the extraction method. If the data type is a list, use the index operator with square brackets. If the data type is a dictionary, use the dictionary key, also inside square brackets. Curly brackets are only used to write a dictionary literal, not to read a value from one.

Extracting Multiple Items

While extracting a single item from a JSON structure is an important first step, it is not common to extract only a single value. In real-world data, values in JSON objects are stored as collections. In the image below, the […]

Fortunately, we can extract these values by building on the workflow steps we used to extract a single value from the JSON structure. I won’t list those steps again. Since the list and dictionary data structures are iterable, we can use a […]

Python Code:

Workflow Steps:
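Because lists and dictionaries are iterable, one way to apply the same two steps to every record is a plain `for` loop. The sketch below is an assumed approach chosen for illustration, with hypothetical records and field names; it is not the contents of multiple_json.py.

```python
# Hypothetical records: a list of dictionaries, one of which lacks the "tags" key.
data = [
    {"name": {"first": "Ada"}, "tags": ["math", "engines"]},
    {"name": {"first": "Grace"}, "tags": ["compilers"]},
    {"name": {"first": "Alan"}},
]

# The outer structure is a list, so loop over it; each item is a dictionary,
# so the same key path used for a single record works inside the loop.
first_names = []
for record in data:
    first_names.append(record["name"]["first"])

# A nested list needs a nested loop. dict.get returns the given default
# (an empty list here) when the key is missing, so the loop does not fail.
all_tags = []
for record in data:
    for tag in record.get("tags", []):
        all_tags.append(tag)

print(first_names)
print(all_tags)
```

Using `record.get("tags", [])` instead of `record["tags"]` is a design choice for uneven real-world data: records that omit a field are skipped rather than raising a `KeyError`.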
The process of unearthing nested data can at times feel daunting, circuitous and exasperating. It does not lend itself easily to introductory techniques such as iterating through