Update tech_docs/python/json_python.md
@@ -2,6 +2,287 @@
## Introduction

JSON (JavaScript Object Notation) is a widely used data interchange format that is easy for humans to read and write and easy for machines to parse and generate. Python offers several ways to work with JSON, from the built-in `json` library to external libraries for specific use cases.

## Built-in `json` Library

### Basic Operations

#### Encoding (Serialization)

Serialization converts a Python object into a JSON string. `json.dumps()` returns the string, while `json.dump()` writes it directly to a file object.

```python
import json

data = {
    "name": "John Doe",
    "age": 30,
    "isEmployed": True,
    "skills": ["Python", "Machine Learning", "Web Development"]
}

# Convert Python object to JSON string
json_string = json.dumps(data, indent=4)
print(json_string)

# Write JSON data to a file
with open('data.json', 'w') as file:
    json.dump(data, file, indent=4)
```
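
Beyond `indent`, `json.dumps()` accepts a few other standard parameters that change the output. The sketch below uses a made-up `record` dict (not part of the example above) to show `sort_keys`, `ensure_ascii`, and `separators`.

```python
import json

# Hypothetical sample record, used only for illustration
record = {"name": "Zoë", "age": 30, "active": True, "manager": None}

# sort_keys=True gives a deterministic key order, handy for diffs and tests
sorted_json = json.dumps(record, sort_keys=True)

# ensure_ascii=False keeps non-ASCII characters instead of \uXXXX escapes
unicode_json = json.dumps(record, ensure_ascii=False)

# separators=(",", ":") drops the default whitespace for compact output
compact_json = json.dumps(record, separators=(",", ":"))

print(sorted_json)
print(unicode_json)
print(compact_json)
```

Sorted, compact output is a reasonable choice when JSON is compared or hashed; the indented form is better suited to files that people read.
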

#### Decoding (Deserialization)

Deserialization converts a JSON string back into a Python object. `json.loads()` parses a string, while `json.load()` reads from a file object.

```python
# Convert JSON string to Python object
json_string = '{"name": "Jane Doe", "age": 25, "isEmployed": false}'
python_object = json.loads(json_string)
print(python_object)

# Read JSON data from a file
with open('data.json', 'r') as file:
    python_object = json.load(file)
    print(python_object)
```
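
It can help to see which Python type each JSON value becomes after decoding. The following is a small sketch with a made-up `payload` string that covers every JSON value type, plus a round trip showing that tuples come back as lists.

```python
import json

# Hypothetical payload covering each JSON value type
payload = (
    '{"text": "hi", "count": 3, "ratio": 0.5, "flag": false, '
    '"nothing": null, "items": [1, 2], "nested": {"k": "v"}}'
)

decoded = json.loads(payload)

# Show the Python type that json.loads produced for each value
for key, value in decoded.items():
    print(f"{key}: {type(value).__name__}")

# Tuples have no JSON equivalent: they are written as arrays and read back as lists
round_tripped = json.loads(json.dumps({"point": (1, 2)}))
print(round_tripped)
```
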

### Custom Object Encoding and Decoding

The `json` library can be extended to encode and decode custom Python objects: pass a `default` function to `json.dumps()` for encoding and an `object_hook` function to `json.loads()` for decoding.

#### Encoding Custom Objects

```python
import json

class User:
    def __init__(self, name, age):
        self.name = name
        self.age = age

def encode_user(obj):
    if isinstance(obj, User):
        return {"name": obj.name, "age": obj.age, "__User__": True}
    # Raise TypeError for unsupported types, as the json module expects
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

user = User("John Doe", 30)
json_string = json.dumps(user, default=encode_user)
print(json_string)
```
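
An alternative to a `default` function is to subclass `json.JSONEncoder` and pass it through the `cls` parameter. The sketch below assumes a hypothetical `ExtendedEncoder` that handles `date` and `set` values; the class name and the sample `event` are illustrative only.

```python
import json
from datetime import date


class ExtendedEncoder(json.JSONEncoder):
    """Hypothetical encoder that adds support for date and set values."""

    def default(self, o):
        if isinstance(o, date):
            return o.isoformat()  # ISO 8601 text, e.g. "2024-01-31"
        if isinstance(o, set):
            return sorted(o)  # sets become sorted lists
        # Let the base class raise TypeError for anything else
        return super().default(o)


event = {"title": "Launch", "day": date(2024, 1, 31), "tags": {"b", "a"}}
encoded = json.dumps(event, cls=ExtendedEncoder)
print(encoded)
```

A subclass is convenient when the same conversions are needed in many places, since the encoder can be imported and reused instead of passing a function around.
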

#### Decoding JSON into Custom Python Objects

```python
def decode_user(dct):
    if "__User__" in dct:
        return User(dct["name"], dct["age"])
    return dct

user = json.loads(json_string, object_hook=decode_user)
print(user.name, user.age)
```
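
`object_hook` is called for every JSON object in the input, so custom objects nested inside dicts and lists are restored as well. Here is a self-contained round-trip sketch; the `team` structure and its names are made up for illustration, and `User`, `encode_user`, and `decode_user` are repeated so the block runs on its own.

```python
import json


class User:
    def __init__(self, name, age):
        self.name = name
        self.age = age


def encode_user(obj):
    if isinstance(obj, User):
        return {"name": obj.name, "age": obj.age, "__User__": True}
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


def decode_user(dct):
    # Called once per decoded JSON object, including the outermost one
    if "__User__" in dct:
        return User(dct["name"], dct["age"])
    return dct


# Hypothetical structure with User objects nested in a dict and a list
team = {"lead": User("Ada", 36), "members": [User("Bob", 28), User("Cy", 41)]}

text = json.dumps(team, default=encode_user)
restored = json.loads(text, object_hook=decode_user)

print(type(restored["lead"]).__name__, restored["lead"].name)
print([member.name for member in restored["members"]])
```
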

### Querying JSON

#### Accessing Nested Elements

```python
json_data = '''
{
    "person": {
        "name": "John Doe",
        "address": {
            "street": "123 Main St",
            "city": "Anytown",
            "zipcode": "12345"
        },
        "phone_numbers": [
            {"type": "home", "number": "555-1234"},
            {"type": "work", "number": "555-5678"}
        ]
    }
}
'''

data = json.loads(json_data)
print(data['person']['name'])  # Output: John Doe
print(data['person']['address']['city'])  # Output: Anytown
print(data['person']['phone_numbers'][0]['number'])  # Output: 555-1234
```
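
Chained indexing raises `KeyError` or `IndexError` as soon as one level is missing. One possible way around that is a small helper that walks a path and falls back to a default; `get_path` below is a hypothetical function written for this sketch, not part of the `json` module.

```python
import json


def get_path(obj, path, default=None):
    """Follow dict keys and list indexes in `path`; return `default` if any step is missing."""
    current = obj
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            return default
    return current


doc = json.loads(
    '{"person": {"name": "John Doe", '
    '"phone_numbers": [{"type": "home", "number": "555-1234"}]}}'
)

home_number = get_path(doc, ["person", "phone_numbers", 0, "number"])
city = get_path(doc, ["person", "address", "city"], default="unknown")
sixth_number = get_path(doc, ["person", "phone_numbers", 5, "number"])

print(home_number)
print(city)
print(sixth_number)
```
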

#### Iterating Over Lists

```python
for phone in data['person']['phone_numbers']:
    print(f"{phone['type'].capitalize()} phone: {phone['number']}")
```

#### Filtering Data

```python
json_data = '''
{
    "products": [
        {"name": "Apple", "price": 0.5, "category": "Fruit"},
        {"name": "Bread", "price": 2.5, "category": "Bakery"},
        {"name": "Cheese", "price": 5.0, "category": "Dairy"},
        {"name": "Milk", "price": 3.0, "category": "Dairy"}
    ]
}
'''

data = json.loads(json_data)
expensive_products = [product for product in data['products'] if product['price'] > 2]
print("Expensive products:", [product['name'] for product in expensive_products])
```

#### Transforming Data

```python
discounted_products = [
    {**product, 'discounted_price': product['price'] * 0.9}
    for product in data['products']
]
print(json.dumps(discounted_products, indent=2))
```

#### Aggregating Data

```python
total_value = sum(product['price'] for product in data['products'])
print(f"Total value of all products: ${total_value:.2f}")
```
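
Aggregation often needs to be broken down by a field rather than computed over the whole list. As a sketch, the block below groups prices by `category` with `collections.defaultdict`; the `products` list repeats the sample data from this section so the block runs on its own.

```python
from collections import defaultdict

# Same shape as the product list used in this section (sample data)
products = [
    {"name": "Apple", "price": 0.5, "category": "Fruit"},
    {"name": "Bread", "price": 2.5, "category": "Bakery"},
    {"name": "Cheese", "price": 5.0, "category": "Dairy"},
    {"name": "Milk", "price": 3.0, "category": "Dairy"},
]

# Accumulate the total price per category
totals = defaultdict(float)
for product in products:
    totals[product["category"]] += product["price"]

for category, total in sorted(totals.items()):
    print(f"{category}: ${total:.2f}")
```
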

#### Searching for Specific Items

```python
product_3_dollars = next((product for product in data['products'] if product['price'] == 3.0), None)
print("First $3 product:", product_3_dollars['name'] if product_3_dollars else "Not found")
```
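
When the position of a key inside the document is not known in advance, a recursive walk over the decoded structure is one option. `find_values` below is a hypothetical helper written for this sketch, and the `doc` sample is made up.

```python
import json


def find_values(node, key):
    """Yield every value stored under `key` at any depth of a decoded JSON structure."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, key)


doc = json.loads('''
{
    "store": {
        "name": "Corner Shop",
        "products": [
            {"name": "Apple", "price": 0.5},
            {"name": "Milk", "price": 3.0, "supplier": {"name": "Dairy Co"}}
        ]
    }
}
''')

names = list(find_values(doc, "name"))
prices = list(find_values(doc, "price"))
print(names)
print(prices)
```

Because the helper is a generator, `next(find_values(doc, "name"), None)` stops at the first match without walking the rest of the document.
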

#### Handling Missing Keys

```python
for product in data['products']:
    print(f"Product: {product.get('name', 'Unnamed')}, "
          f"Price: ${product.get('price', 0):.2f}, "
          f"Stock: {product.get('stock', 'Unknown')}")
```

## External Libraries

### `pandas`

`pandas` is a data manipulation library that can read JSON into a `DataFrame`, which makes tabular JSON easier to filter, reshape, and analyze.

#### Reading JSON into a DataFrame

```python
from io import StringIO

import pandas as pd

json_data = '''
[
    {"name": "Alice", "age": 30, "city": "New York"},
    {"name": "Bob", "age": 25, "city": "Los Angeles"},
    {"name": "Charlie", "age": 35, "city": "Chicago"}
]
'''

# Wrap the string in StringIO: recent pandas versions deprecate
# passing a literal JSON string to read_json
df = pd.read_json(StringIO(json_data))
print(df)
```

#### Writing DataFrame to JSON

```python
df.to_json('output.json', orient='records', indent=4)
```

### `jsonschema`

`jsonschema` is used for validating JSON data against a schema.

#### Validating JSON Data

```python
from jsonschema import validate
from jsonschema.exceptions import ValidationError

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"},
        "city": {"type": "string"}
    },
    "required": ["name", "age"]
}

data = {
    "name": "Alice",
    "age": 30,
    "city": "New York"
}

try:
    validate(instance=data, schema=schema)
    print("Valid JSON data")
except ValidationError as e:
    print(f"Invalid JSON data: {e}")
```

### `requests`

`requests` is a library for making HTTP requests, commonly used to fetch JSON data from APIs.

#### Fetching JSON Data from an API

```python
import requests

# Set a timeout so the call cannot hang indefinitely
response = requests.get('https://api.example.com/data', timeout=10)
if response.status_code == 200:
    data = response.json()
    print(data)
else:
    print(f"Failed to retrieve data: {response.status_code}")
```

## Use Cases

- **Configuration Files**: JSON is often used to store configuration settings for applications. Its human-readable format makes it easy to update and manage settings.
- **Data Interchange**: JSON is a common format for data exchange between servers and web applications, especially in RESTful APIs.
- **Storing and Retrieving Data**: JSON can be used to store data persistently in files, which can be later retrieved and processed.
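
The configuration-file use case above can be sketched as follows. `DEFAULTS`, `load_config`, and the file names are hypothetical; the idea is to overlay the keys found in a JSON file onto built-in defaults, so a missing file or a missing key still yields a usable configuration.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical default settings; a real application would define its own
DEFAULTS = {"host": "localhost", "port": 8080, "debug": False}


def load_config(path):
    """Return DEFAULTS overlaid with the keys provided by the JSON file at `path`."""
    config = dict(DEFAULTS)
    path = Path(path)
    if path.exists():
        with path.open(encoding="utf-8") as f:
            config.update(json.load(f))
    return config


# Use a temporary directory so the sketch leaves nothing behind
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "settings.json"
    config_path.write_text(json.dumps({"port": 9000, "debug": True}), encoding="utf-8")

    loaded = load_config(config_path)
    missing = load_config(Path(tmp) / "absent.json")

print(loaded)
print(missing)
```
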

## Best Practices

- **Handling Exceptions**: Always handle exceptions when parsing JSON to manage malformed data gracefully.

```python
malformed_json_string = '{"name": "John Doe", "age": }'  # invalid: missing value

try:
    data = json.loads(malformed_json_string)
except json.JSONDecodeError as e:
    print(f"Error decoding JSON: {e}")
```

- **Security Considerations**: Be cautious when deserializing JSON from untrusted sources. `json.loads()` does not execute code, but a very large or deeply nested payload can consume considerable CPU and memory, so limit the input size and validate the structure before using the data.

- **Pretty Printing**: Use the `indent` parameter in `json.dumps()` or `json.dump()` for pretty printing, making JSON data easier to read and debug.

```python
json_string = json.dumps(data, indent=4)
print(json_string)
```
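
The exception-handling and security points above can be combined in one wrapper. This is a minimal sketch, assuming a hypothetical `safe_loads` helper and an arbitrary `MAX_BYTES` limit: it rejects oversized input before parsing and returns a `(data, error)` pair instead of raising. The `RecursionError` branch is there because deeply nested input can exceed Python's recursion limit during decoding.

```python
import json

MAX_BYTES = 1_000_000  # hypothetical limit; tune it to the payloads you expect


def safe_loads(text, max_bytes=MAX_BYTES):
    """Parse JSON text, returning (data, error) instead of raising on bad input."""
    if len(text.encode("utf-8")) > max_bytes:
        return None, f"payload larger than {max_bytes} bytes"
    try:
        return json.loads(text), None
    except json.JSONDecodeError as e:
        # lineno, colno and msg are attributes of JSONDecodeError
        return None, f"line {e.lineno}, column {e.colno}: {e.msg}"
    except RecursionError:
        return None, "payload is nested too deeply"


good = safe_loads('{"ok": true}')
bad = safe_loads('{"ok": tru}')
too_big = safe_loads('"' + "x" * 50 + '"', max_bytes=10)

print(good)
print(bad)
print(too_big)
```
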

---

With the built-in `json` module and the external libraries above, you can handle JSON in Python for tasks ranging from basic serialization and deserialization to data manipulation and validation.

---

# Comprehensive Guide to Working with JSON in Python

## Introduction

JSON (JavaScript Object Notation) is a lightweight data interchange format that's easy for humans to read and write and easy for machines to parse and generate. Python's built-in `json` library provides tools to work with JSON data, allowing you to serialize and deserialize Python objects.

## Basic Operations