
Python JSON Tips, Tricks, and Pitfalls

Working with JSON in Python #

JSON stands for JavaScript Object Notation. It is a common data format used for storing and exchanging data between systems. It uses a simple, human-readable format consisting of key-value pairs and arrays.

An example of JSON data looks like this:

{
    "name": "John Doe",
    "age": 30,
    "is_student": false,
    "courses": ["Math", "Science", "History"],
    "address": {
        "street": "123 Main St",
        "city": "Anytown",
        "zip": "12345"
    }
}

No need to install json

Python provides a built-in module called json that allows you to work with JSON data easily. So, there is no need for pip install json or similar commands.

Loading and Saving JSON #

The json module provides functions to load JSON data from a file or a string, and to save Python objects as JSON data.

Loading JSON Data #

Use the json.load() function to load JSON data from a file:

import json

with open('data.json', 'r', encoding='utf-8') as file:
    data = json.load(file)
print(data)

Saving JSON Data #

Use the json.dump() function to save Python objects as JSON data to a file:

import json

data = {"name": "Jane Doe", "age": 25, "is_student": True}
with open('data.json', 'w') as file:
    json.dump(data, file)
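
As a quick check that json.dump() and json.load() mirror each other, the following sketch writes a dictionary and reads it back. The temporary directory and the file name roundtrip.json are assumptions made only so the example cleans up after itself.

```python
import json
import tempfile
from pathlib import Path

data = {"name": "Jane Doe", "age": 25, "is_student": True}

# Hypothetical location: a throwaway directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "roundtrip.json"

    # Write the Python object as JSON text
    with open(path, "w", encoding="utf-8") as file:
        json.dump(data, file)

    # Read the JSON text back into a new Python object
    with open(path, "r", encoding="utf-8") as file:
        loaded = json.load(file)

print(loaded == data)  # True
```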

JSON dumps() and loads() #

Do not get confused between json.load()/json.dump() and json.loads()/json.dumps().

The json.dumps() and json.loads() functions are for working with JSON data as strings, not files.

Think of dumps and loads as "dump-s" and "load-s": the “s” stands for string.

These are used when you want to convert between JSON strings and Python objects.

import json

# Convert Python object to JSON string
data = {"name": "John Doe", "age": 25, "is_student": True}
json_string = json.dumps(data)
print(f"JSON is {json_string}")  

# Convert JSON string to Python object
json_string = '{"name": "John Doe", "age": 25, "is_student": true}'
data = json.loads(json_string)
print(f"Object is {data}")

outputs:

JSON is {"name": "John Doe", "age": 25, "is_student": true}
Object is {'name': 'John Doe', 'age': 25, 'is_student': True}
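
One pitfall worth knowing: converting to JSON and back does not always return an identical Python object, because JSON has fewer types than Python. The sketch below uses a made-up dictionary to show two common cases, tuples and non-string keys.

```python
import json

# Made-up data containing a tuple value and an integer key
original = {"point": (1, 2), 1: "one", "missing": None}

json_string = json.dumps(original)
restored = json.loads(json_string)

print(json_string)           # {"point": [1, 2], "1": "one", "missing": null}
print(restored)              # {'point': [1, 2], '1': 'one', 'missing': None}
print(restored == original)  # False: the tuple became a list, the key 1 became "1"
```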

Reading Nested JSON #

Often JSON data can be nested, containing objects within objects or arrays within arrays.

In Python, this translates to dictionaries and lists, containing more dictionaries and lists.

Consider the following JSON data:

{
    "name": "John Doe",
    "age": 30,
    "address": {
        "street": "123 Main St",
        "city": "Anytown",
        "zip": "12345"
    },
    "courses": [
        {"name": "Math", "grade": "A"},
        {"name": "Science", "grade": "B"}
    ]
}

As a Python object, this would be represented as:

dict containing:
- name: string
- age: integer
- address: dict containing:
    - street: string
    - city: string
    - zip: string
- courses: list of dicts containing:
    - name: string
    - grade: string

Accessing Values #

To access the city value from the address object and the name of the first course:

import json

json_data = '''
{
    "name": "John Doe",
    "age": 30,
    "address": {
        "street": "123 Main St",
        "city": "Anytown",
        "zip": "12345"
    },
    "courses": [
        {"name": "Math", "grade": "A"},
        {"name": "Science", "grade": "B"}
    ]
}
''' 

# load JSON data from a string containing the above JSON
data = json.loads(json_data)
# navigate nested dictionaries
city = data['address']['city']

# get first item in courses array, which is a dictionary, then get name key
first_course_name = data['courses'][0]['name']  
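
Square-bracket access raises KeyError or IndexError when a key or item is missing, which is common with real-world data. A minimal sketch of more forgiving access follows; the trimmed-down JSON string and the "phone" and "country" keys are assumptions chosen to show the missing-value cases.

```python
import json

# Trimmed-down example data: no "phone" key, no "country" key, empty "courses" list
data = json.loads('{"name": "John Doe", "address": {"city": "Anytown"}, "courses": []}')

# .get() returns None (or a supplied default) instead of raising KeyError
phone = data.get("phone")
country = data.get("address", {}).get("country", "unknown")

# Guard against an empty list before indexing into it
courses = data.get("courses", [])
first_course_name = courses[0]["name"] if courses else None

print(phone, country, first_course_name)  # None unknown None
```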

Iterating Over Arrays #

JSON arrays are converted to Python lists, so you can loop over them with a regular for loop:

for course in data['courses']:
    # each course is a dictionary
    print(f"Course: {course['name']}, Grade: {course['grade']}")

Using Web APIs with JSON #

The requests library is commonly used to interact with web APIs that return JSON data.

When accessing the JSON data from a response, use the .json() method provided by the requests library instead of manually parsing the response text with json.loads().

import requests

response = requests.get('https://api.example.com/data')
# Use the .json() method to parse JSON response
data = response.json()
print(data)

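The .json() method works out the character encoding of the response body for you. It does not hide errors, though: it raises an exception if the body is not valid JSON, and it does not check the HTTP status code.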
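
A response can carry an error status or a non-JSON body (an HTML error page, for instance), so it can help to wrap the call. The helper below is a hypothetical sketch, not part of requests: parse_json_response is a made-up name, and it assumes the object passed in behaves like a requests.Response. It relies on the JSON decoding errors raised by .json() being ValueError subclasses.

```python
def parse_json_response(response):
    """Hypothetical helper: return the decoded JSON body, or None if it is not valid JSON.

    `response` is assumed to behave like a requests.Response object.
    """
    # For a real requests.Response this raises requests.exceptions.HTTPError on 4xx/5xx
    response.raise_for_status()
    try:
        return response.json()
    except ValueError:
        # JSON decoding errors are ValueError subclasses
        return None


# Usage (requires the requests package and network access):
#   import requests
#   data = parse_json_response(requests.get("https://api.example.com/data", timeout=10))
```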

Pretty Printing JSON #

When working with JSON data, it is often helpful to format it in a more readable way.

With JSON Module #

The json module provides options for pretty-printing JSON data.

You can use the indent parameter in the json.dump() and json.dumps() functions to specify the number of spaces to use for indentation.

import json

# Pretty-print JSON with 2 spaces indentation
pretty_json = json.dumps(data, indent=2)
print(pretty_json)
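
Two other json.dumps() parameters are often useful alongside indent: sort_keys gives a stable key order, and ensure_ascii=False keeps non-ASCII characters readable instead of escaping them. The sample data below is made up for illustration.

```python
import json

# Made-up data with non-ASCII characters and unsorted keys
data = {"name": "Zoë", "city": "Kraków", "age": 30}

# Default: keys in insertion order, non-ASCII characters escaped (\u00eb, \u00f3)
default_output = json.dumps(data, indent=2)
print(default_output)

# Keys sorted alphabetically, non-ASCII characters written as-is
readable_output = json.dumps(data, indent=2, sort_keys=True, ensure_ascii=False)
print(readable_output)
```

When writing such output to a file with ensure_ascii=False, open the file with encoding="utf-8" so the characters are stored correctly.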

With pprint Module #

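The pprint module pretty-prints Python objects, including those parsed from JSON. Its output tends to be more compact than the json module’s pretty-printing, but it is a Python representation (single quotes, True/False/None) rather than valid JSON, so it suits inspection better than producing JSON.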

import json
from pprint import pprint

# Pretty-print using pprint with 2 spaces indentation
pprint(data, indent=2)
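
To see how the two approaches differ, the sketch below formats the same made-up dictionary both ways and then tries to parse the pprint-style text as JSON. It uses pformat, which returns the string that pprint would print.

```python
import json
from pprint import pformat

# Made-up data containing values that are spelled differently in Python and JSON
data = {"name": "John Doe", "is_student": False, "nickname": None}

as_python = pformat(data)             # Python repr: single quotes, False, None
as_json = json.dumps(data, indent=2)  # JSON text: double quotes, false, null
print(as_python)
print(as_json)

try:
    json.loads(as_python)
except json.JSONDecodeError:
    print("pprint output is not valid JSON")
```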

Common JSON Errors #

When working with JSON data, you may encounter errors. One common error is json.decoder.JSONDecodeError, which occurs when the JSON data is malformed or invalid.

Invalid JSON Syntax #

JSON has strict syntax rules, and invalid syntax will raise a JSONDecodeError. Validate your JSON data using online tools like JSONLint or JSON Formatter to ensure it is well-formed.

Some common syntax issues include:

Trailing Commas #

JSON does not allow a trailing comma after the last item in an object or array, even though Python dictionaries and lists do.

import json

invalid_json = '{"name": "John Doe", "age": 30,}'  # Trailing comma is invalid

try:
    data = json.loads(invalid_json)
except json.decoder.JSONDecodeError as e:
    print(f"JSONDecodeError: {e}")

outputs (on Python 3.12; the exact wording and position can differ between Python versions):

JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 32 (char 31)
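
For larger documents, the exception's attributes help locate the problem. The sketch below reads msg, lineno, colno, and doc from the JSONDecodeError and prints the offending line; the multi-line input is a made-up variant of the example above.

```python
import json

# Same trailing-comma mistake, spread over several lines
invalid_json = '{\n  "name": "John Doe",\n  "age": 30,\n}'

try:
    json.loads(invalid_json)
except json.JSONDecodeError as e:
    # lineno and colno are 1-based; doc holds the text that was being parsed
    location = (e.lineno, e.colno)
    bad_line = e.doc.splitlines()[e.lineno - 1]
    print(f"{e.msg} at line {e.lineno}, column {e.colno}")
    print(f"Offending line: {bad_line!r}")
```

The reported line may be the one holding the comma or the one holding the closing brace, depending on the Python version.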

Single vs Double Quotes #

Another common pitfall related to invalid JSON syntax is that JSON requires double quotes for strings. Using single quotes will result in a JSONDecodeError.

import json

invalid_json = "{'name': 'John Doe', 'age': 30}"  # Single quotes are invalid
try:
    data = json.loads(invalid_json)
except json.decoder.JSONDecodeError as e:
    print(f"JSONDecodeError: {e}")

outputs:

JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
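
Single-quoted "JSON" is often really the text of a Python dictionary, produced by str() or print() somewhere upstream. The best fix is to have the producer call json.dumps(). If that is not possible, and assuming the text is a plain Python literal, ast.literal_eval can read it, as in this sketch:

```python
import ast
import json

# Text that looks like JSON but is actually a Python dict literal
python_style = "{'name': 'John Doe', 'age': 30}"

# ast.literal_eval only accepts Python literals; it does not execute arbitrary code
data = ast.literal_eval(python_style)

# Re-serialize as real JSON, which uses double quotes
valid_json = json.dumps(data)
print(valid_json)  # {"name": "John Doe", "age": 30}
```

Avoid replacing single quotes with double quotes using str.replace(), since that corrupts values containing apostrophes.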

Reading an Empty File #

An empty file will also raise a JSONDecodeError when you try to load it. The error may be similar to:

Traceback (most recent call last):
  File "/programmerpulse/codesources/python/jsonfeatures.py", line 58, in <module>
    json_data = json.load(open("data/empty.json"))
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/json/__init__.py", line 293, in load
    return loads(fp.read(),
           ^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Program defensively: handle the case where the file does not exist, and check whether it is empty before parsing it.

import json

path = "data/empty.json"
try:
    with open(path, "r", encoding="utf-8") as f:
        raw = f.read()

    if not raw.strip():
        file_data = {}  # choose {} as a sensible default for an "empty" JSON document
        print(f"Note: {path} is empty; using an empty object {{}}.")
    else:
        file_data = json.loads(raw)

    pretty_json = json.dumps(file_data, indent=2)
    print(pretty_json)

except FileNotFoundError:
    print(f"FileNotFoundError: {path} does not exist.")
except json.JSONDecodeError as e:
    print(f"JSONDecodeError while parsing {path}: {e}")

outputs:

Note: data/empty.json is empty; using an empty object {}.
{}
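
If this pattern is needed in several places, it can be wrapped in a small function. The helper below is a hypothetical sketch (load_json_or_default is a made-up name, and the file path is assumed not to exist); it returns a caller-supplied default for missing or blank files and still lets malformed JSON raise an error.

```python
import json
from pathlib import Path


def load_json_or_default(path, default=None):
    """Hypothetical helper: parse JSON from path, or return default if the file is missing or blank."""
    try:
        raw = Path(path).read_text(encoding="utf-8")
    except FileNotFoundError:
        return default
    if not raw.strip():
        return default
    # Non-empty but malformed JSON still raises json.JSONDecodeError on purpose
    return json.loads(raw)


# Assumes this path does not exist, so the default is returned
settings = load_json_or_default("data/does_not_exist.json", default={})
print(settings)
```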

Wrong encoding #

When reading JSON files, ensure that the file is encoded in UTF-8. If the file uses a different encoding, you may encounter decoding errors such as UnicodeDecodeError or JSONDecodeError, even if the JSON looks valid in a text editor.

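You should specify the correct encoding when opening the file. Without the encoding parameter, open() falls back to a platform-dependent default that may not be UTF-8, so passing it explicitly makes the behavior predictable and encoding issues easier to diagnose.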

import json

path = "data/non_utf8.json"
try:
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
except UnicodeDecodeError as e: 
    print(f"Cannot read file due to UnicodeDecodeError: {e}")
except json.JSONDecodeError as e:
    print(f"Cannot parse JSON due to JSONDecodeError: {e}")
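
A related case is a UTF-8 file that starts with a BOM, which some editors add. The sketch below creates such a file in a temporary directory (the file name with_bom.json is an assumption) and shows that reading it as utf-8 fails in the json module, while the utf-8-sig encoding strips the BOM.

```python
import json
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "with_bom.json"
    # Writing with utf-8-sig puts a BOM at the start of the file
    path.write_text('{"name": "John Doe"}', encoding="utf-8-sig")

    try:
        # Plain utf-8 keeps the BOM as a character, which the json module rejects
        with open(path, "r", encoding="utf-8") as f:
            json.load(f)
    except json.JSONDecodeError as e:
        bom_error = str(e)
        print(f"Reading as utf-8 failed: {e}")

    # utf-8-sig strips a leading BOM if present, and also reads BOM-free UTF-8
    with open(path, "r", encoding="utf-8-sig") as f:
        data = json.load(f)

print(data)  # {'name': 'John Doe'}
```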

UTF-16 Encoding Issues #

Note that for UTF-16, you may encounter a BOM (byte order mark) error, even when specifying encoding="utf-16".

UnicodeError: UTF-16 stream does not start with BOM

This is because UTF-16 encoded files should start with a BOM to indicate endianness (little-endian or big-endian). If the BOM is missing, the UnicodeError above will be raised. Some text editors may not add the BOM when saving files in UTF-16.

To handle UTF-16 files without a BOM, name the byte order explicitly by using the utf-16-le (little-endian) or utf-16-be (big-endian) encoding instead.

with open(path, "r", encoding="utf-16-le") as f:
    data = json.load(f)
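
If the byte order is not known in advance, another option is to open the file in binary mode. json.load() and json.loads() accept bytes and detect UTF-8, UTF-16, or UTF-32 themselves. The sketch below writes a BOM-free UTF-16 file to a temporary directory (the file name is an assumption) and reads it back this way.

```python
import json
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "utf16_no_bom.json"
    # Encoding with utf-16-le produces little-endian bytes without a BOM
    path.write_bytes('{"name": "John Doe"}'.encode("utf-16-le"))

    # Binary mode: json.load receives bytes and works out the encoding itself
    with open(path, "rb") as f:
        data = json.load(f)

print(data)  # {'name': 'John Doe'}
```

The detection only covers the UTF-8, UTF-16, and UTF-32 families; files in other encodings still need an explicit encoding passed to open().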