====== Chapter 20: Serialization and Data Formats ======

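===== Chapter Objectives =====

After completing this chapter, you will be able to:
  * Parse and generate JSON data
  * Work with YAML configuration files
  * Serialize Python objects with pickle
  * Process CSV and XML data
  * Choose an appropriate serialization format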

===== Working with JSON =====

==== The json Module ====

<code python>
import json

# Convert a Python object to a JSON string
data = {
    "name": "Alice",
    "age": 25,
    "is_student": False,
    "courses": ["Math", "Physics"],
    "address": {"city": "Beijing", "zip": "100000"}
}

json_str = json.dumps(data)
print(json_str)
# {"name": "Alice", "age": 25, "is_student": false, "courses": ["Math", "Physics"]...}

# Pretty-print; ensure_ascii=False keeps non-ASCII characters unescaped
print(json.dumps(data, indent=2, ensure_ascii=False))
</code>
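
The example above only covers generating JSON. The reverse direction, parsing a JSON string with ''json.loads'', can be sketched as follows. The payload and the ''nickname'' field are made up for illustration.

```python
import json

# A hypothetical JSON payload, e.g. the body of an HTTP response
json_text = (
    '{"name": "Alice", "age": 25, "is_student": false, '
    '"courses": ["Math", "Physics"], "nickname": null}'
)

parsed = json.loads(json_text)

print(type(parsed))           # <class 'dict'>
print(parsed["is_student"])   # False  (JSON false becomes Python False)
print(parsed["nickname"])     # None   (JSON null becomes Python None)
print(parsed["courses"][1])   # Physics

# Malformed input raises json.JSONDecodeError (a subclass of ValueError)
try:
    json.loads("{'name': 'Alice'}")   # single quotes are not valid JSON
except json.JSONDecodeError as exc:
    error_message = str(exc)
    print("Invalid JSON:", error_message)
```

Catching ''json.JSONDecodeError'' is usually worthwhile whenever the text comes from outside the program.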

==== Reading and Writing Files ====

<code python>
import json

# Write a JSON file
data = {"users": [{"name": "Alice", "age": 25}, {"name": "Bob", "age": 30}]}

with open('data.json', 'w', encoding='utf-8') as f:
    json.dump(data, f, indent=2, ensure_ascii=False)

# Read the JSON file back
with open('data.json', 'r', encoding='utf-8') as f:
    loaded_data = json.load(f)

print(loaded_data['users'][0]['name'])  # Alice
</code>

==== Custom Encoders ====

<code python>
import json
from datetime import datetime

class DateTimeEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)

data = {"name": "Event", "time": datetime.now()}
json_str = json.dumps(data, cls=DateTimeEncoder)
print(json_str)
</code>
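
A custom encoder only handles the outgoing direction: after ''json.loads'' the timestamp is still a plain string. One possible way to restore it is an ''object_hook'', sketched below. The helper name ''decode_datetime'' and the convention that the timestamp lives under the key ''"time"'' are assumptions made for this example.

```python
import json
from datetime import datetime

def decode_datetime(obj):
    """object_hook: try to turn the "time" field back into a datetime."""
    value = obj.get("time")
    if isinstance(value, str):
        try:
            obj["time"] = datetime.fromisoformat(value)
        except ValueError:
            pass  # not an ISO 8601 timestamp; leave the string untouched
    return obj

original = {"name": "Event", "time": datetime(2024, 1, 15, 9, 30)}

# default= is called only for objects json cannot serialize on its own
json_str = json.dumps(original, default=lambda o: o.isoformat())
restored = json.loads(json_str, object_hook=decode_datetime)

print(json_str)                                # {"name": "Event", "time": "2024-01-15T09:30:00"}
print(restored["time"] == original["time"])    # True
```

''object_hook'' is called for every decoded JSON object, so the hook must return dictionaries it does not recognize unchanged.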

===== Working with YAML =====

<code python>
# Requires installation: pip install PyYAML
import yaml

# Read YAML (safe_load builds only plain Python types)
data = '''
name: Alice
age: 25
address:
  city: Beijing
  street: Main Road
skills:
  - Python
  - JavaScript
'''

parsed = yaml.safe_load(data)
print(parsed['name'])  # Alice
print(parsed['skills'])  # ['Python', 'JavaScript']

# Write YAML
config = {
    'database': {
        'host': 'localhost',
        'port': 3306,
        'name': 'myapp'
    },
    'debug': True
}

with open('config.yaml', 'w', encoding='utf-8') as f:
    yaml.dump(config, f, default_flow_style=False)
</code>

===== Pickle Serialization =====

<code python>
import pickle

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
    
    def __repr__(self):
        return f"Person({self.name!r}, {self.age})"

# Serialize
person = Person("Alice", 25)
with open('person.pkl', 'wb') as f:
    pickle.dump(person, f)

# Deserialize (only unpickle data you trust: pickle.load can execute arbitrary code)
with open('person.pkl', 'rb') as f:
    loaded_person = pickle.load(f)

print(loaded_person)  # Person('Alice', 25)
</code>
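
Pickle can also work in memory with ''pickle.dumps'' and ''pickle.loads'', and a class can control what gets stored through ''%%__getstate__%%'' and ''%%__setstate__%%''. The following is a minimal sketch. The ''Session'' class and its ''cache'' attribute are hypothetical and only stand in for state that should not be persisted.

```python
import pickle

class Session:
    def __init__(self, user):
        self.user = user
        self.cache = {}   # hypothetical transient data we do not want to store

    def __getstate__(self):
        # Copy the instance dict and drop what should not be persisted
        state = self.__dict__.copy()
        del state["cache"]
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.cache = {}   # rebuild the transient part after loading

session = Session("Alice")
session.cache["token"] = "secret"

# dumps returns bytes; HIGHEST_PROTOCOL picks the newest format available
payload = pickle.dumps(session, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(payload)

print(type(payload))    # <class 'bytes'>
print(restored.user)    # Alice
print(restored.cache)   # {}
```

The pickled bytes store only a reference to the class by name, so the code that loads them must be able to import the same class definition.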

===== Working with CSV =====

<code python>
import csv

# Write CSV (newline='' lets the csv module control line endings)
with open('data.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    writer.writerow(['Name', 'Age', 'City'])
    writer.writerow(['Alice', '25', 'Beijing'])
    writer.writerow(['Bob', '30', 'Shanghai'])

# Read CSV: each row is a list of strings
with open('data.csv', 'r', encoding='utf-8') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

# Read rows as dictionaries keyed by the header row
with open('data.csv', 'r', encoding='utf-8') as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(f"{row['Name']} is {row['Age']} years old")
</code>
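
''csv.DictWriter'' is the writing counterpart of ''csv.DictReader''. The sketch below writes a list of dictionaries to an in-memory buffer (''io.StringIO'' is used here only to avoid touching the disk) and then reads it back, converting the age column explicitly because CSV stores no type information.

```python
import csv
import io

rows = [
    {"Name": "Alice", "Age": 25, "City": "Beijing"},
    {"Name": "Bob", "Age": 30, "City": "Shanghai"},
]

# Write: fieldnames fixes the column order and the header row
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["Name", "Age", "City"])
writer.writeheader()
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)

# Read back: every value is a string, so convert where needed
reader = csv.DictReader(io.StringIO(csv_text))
people = [{**row, "Age": int(row["Age"])} for row in reader]
print(people[0])   # {'Name': 'Alice', 'Age': 25, 'City': 'Beijing'}
```

With a real file, the same code works after replacing the buffer with ''open(path, 'w', newline=%%''%%, encoding='utf-8')''.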

===== Working with XML =====

<code python>
import xml.etree.ElementTree as ET

# Parse XML
xml_data = '''
<root>
    <user id="1">
        <name>Alice</name>
        <age>25</age>
    </user>
    <user id="2">
        <name>Bob</name>
        <age>30</age>
    </user>
</root>
'''

root = ET.fromstring(xml_data)

# Iterate over the <user> elements
for user in root.findall('user'):
    user_id = user.get('id')
    name = user.find('name').text
    age = user.find('age').text
    print(f"User {user_id}: {name}, {age}")

# Build a new element and attach it to the tree
new_user = ET.Element('user', {'id': '3'})
ET.SubElement(new_user, 'name').text = 'Charlie'
ET.SubElement(new_user, 'age').text = '35'
root.append(new_user)

# Save to a file
ET.ElementTree(root).write('output.xml', encoding='utf-8')
</code>
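
Element text and attributes always arrive as strings, and ''find()'' returns ''None'' when a child is missing. The sketch below converts ''<user>'' elements into plain dictionaries with ''findtext'', which tolerates a missing child, and serializes the tree to a string with ''ET.tostring''. The sample document, with the second user lacking an ''<age>'' element, is made up for illustration.

```python
import xml.etree.ElementTree as ET

xml_data = """
<root>
    <user id="1"><name>Alice</name><age>25</age></user>
    <user id="2"><name>Bob</name></user>
</root>
"""

root = ET.fromstring(xml_data)

users = []
for user in root.findall("user"):
    # findtext returns None when the child is missing, whereas
    # user.find("age").text would raise AttributeError
    age_text = user.findtext("age")
    users.append({
        "id": int(user.get("id")),
        "name": user.findtext("name"),
        "age": int(age_text) if age_text is not None else None,
    })

print(users)

# encoding="unicode" makes tostring return str instead of bytes
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

''xml.etree.ElementTree'' is not hardened against maliciously constructed documents, so this pattern is best kept to XML from trusted sources.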

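===== Comparing Serialization Formats =====

| Format | Human-readable | Cross-language | Supported types | Typical use |
|------|--------|--------|----------|----------|
| JSON | Good | Yes | Basic types | Data exchange, configuration |
| YAML | Very good | Yes | Basic types | Configuration files |
| Pickle | No (binary) | No (Python only) | Most Python objects | Temporary storage, caching (trusted data only) |
| CSV | Good | Yes | Tabular data | Tables and spreadsheets |
| XML | Good | Yes | Hierarchical data | Complex configuration, documents |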
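
The "supported types" column can be made concrete with a round trip. The sketch below sends the same dictionary through JSON and through pickle. The ''record'' value is an arbitrary example chosen to include a tuple and a non-string key.

```python
import json
import pickle

record = {"point": (1, 2), "tags": ["a", "b"], 1: "integer key"}

via_json = json.loads(json.dumps(record))
via_pickle = pickle.loads(pickle.dumps(record))

# JSON has no tuple type and only string keys, so the data changes shape
print(via_json)     # {'point': [1, 2], 'tags': ['a', 'b'], '1': 'integer key'}
# pickle preserves Python types exactly
print(via_pickle)   # {'point': (1, 2), 'tags': ['a', 'b'], 1: 'integer key'}

print(via_json == record)     # False
print(via_pickle == record)   # True

# Types outside JSON's data model are rejected unless an encoder handles them
try:
    json.dumps({"ids": {1, 2, 3}})
except TypeError as exc:
    print("JSON cannot encode a set:", exc)
```

The trade-off is that the JSON output can be read by any language, while the pickle output is only meaningful to Python.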

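===== Exercises =====

1. Implement a manager for JSON configuration files
2. Export a list of objects to CSV
3. Implement a YAML configuration file reader
4. Compare the performance of different serialization formats

Next chapter: [[python_course:chapter21|Chapter 21: The Module System in Depth]]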