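Chapter 20: Serialization and Data Formats
Chapter Objectives
After completing this chapter, you will be able to:
- Parse and generate JSON data
- Work with YAML configuration files
- Serialize objects with pickle
- Process CSV and XML data
- Choose an appropriate serialization format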
JSON Processing
The json Module
    import json

    # Convert a Python object to a JSON string
    data = {
        "name": "Alice",
        "age": 25,
        "is_student": False,
        "courses": ["Math", "Physics"],
        "address": {"city": "Beijing", "zip": "100000"}
    }
    json_str = json.dumps(data)
    print(json_str)
    # {"name": "Alice", "age": 25, "is_student": false, "courses": ["Math", "Physics"]...}

    # Pretty-printed output; ensure_ascii=False keeps non-ASCII characters unescaped
    print(json.dumps(data, indent=2, ensure_ascii=False))
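The example above covers only serialization with `json.dumps`. Since the chapter objectives also mention parsing, here is a minimal sketch of the reverse direction with `json.loads`. The `json_text` value is a made-up input for illustration, not data from this chapter.

```python
import json

# Hypothetical JSON text, for example received from an API or read from a file
json_text = '{"name": "Alice", "age": 25, "is_student": false, "courses": ["Math", "Physics"]}'

# JSON objects become dicts, arrays become lists, true/false become True/False
parsed = json.loads(json_text)
print(type(parsed))          # <class 'dict'>
print(parsed["is_student"])  # False
print(parsed["courses"][0])  # Math

# Malformed input raises json.JSONDecodeError (a subclass of ValueError)
error_message = None
try:
    json.loads("{'name': 'Alice'}")  # single quotes are not valid JSON
except json.JSONDecodeError as exc:
    error_message = exc.msg
    print(f"Invalid JSON: {error_message}")
```

Catching `json.JSONDecodeError` explicitly is usually worthwhile whenever the input comes from outside your program.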
Reading and Writing Files
    import json

    # Write a JSON file
    data = {"users": [{"name": "Alice", "age": 25}, {"name": "Bob", "age": 30}]}
    with open('data.json', 'w', encoding='utf-8') as f:
        json.dump(data, f, indent=2, ensure_ascii=False)

    # Read a JSON file
    with open('data.json', 'r', encoding='utf-8') as f:
        loaded_data = json.load(f)

    print(loaded_data['users'][0]['name'])  # Alice
Custom Encoders
    import json
    from datetime import datetime

    class DateTimeEncoder(json.JSONEncoder):
        def default(self, obj):
            # Encode datetime objects as ISO 8601 strings
            if isinstance(obj, datetime):
                return obj.isoformat()
            # Defer to the base class, which raises TypeError for unsupported types
            return super().default(obj)

    data = {"name": "Event", "time": datetime.now()}
    json_str = json.dumps(data, cls=DateTimeEncoder)
    print(json_str)
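A custom encoder handles only the outgoing direction: when the JSON is loaded again, the timestamp comes back as a plain string. One possible way to restore it is the `object_hook` parameter of `json.loads`. The sketch below assumes that any key named `"time"` holds an ISO 8601 string; the `decode_time` helper and the sample payload are hypothetical.

```python
import json
from datetime import datetime

def decode_time(obj):
    """Called for every decoded JSON object (dict)."""
    # Assumption: a "time" key always holds a string produced by datetime.isoformat()
    if "time" in obj:
        obj["time"] = datetime.fromisoformat(obj["time"])
    return obj

# Hypothetical payload in the shape produced by an ISO-format encoder
json_str = '{"name": "Event", "time": "2024-01-15T09:30:00"}'
event = json.loads(json_str, object_hook=decode_time)

print(type(event["time"]))  # <class 'datetime.datetime'>
print(event["time"].year)   # 2024
```

Matching on a key name is a simplification. A more robust scheme might tag encoded values with an explicit type marker so the decoder does not have to guess.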
YAML Processing
    # Requires installation: pip install PyYAML
    import yaml

    # Read YAML
    data = '''
    name: Alice
    age: 25
    address:
      city: Beijing
      street: Main Road
    skills:
      - Python
      - JavaScript
    '''
    # safe_load builds only plain Python types, so it is the right choice for untrusted input
    parsed = yaml.safe_load(data)
    print(parsed['name'])    # Alice
    print(parsed['skills'])  # ['Python', 'JavaScript']

    # Write YAML
    config = {
        'database': {
            'host': 'localhost',
            'port': 3306,
            'name': 'myapp'
        },
        'debug': True
    }
    with open('config.yaml', 'w', encoding='utf-8') as f:
        yaml.dump(config, f, default_flow_style=False)
Pickle Serialization
    import pickle

    class Person:
        def __init__(self, name, age):
            self.name = name
            self.age = age

        def __repr__(self):
            return f"Person({self.name!r}, {self.age})"

    # Serialize (pickle files must be opened in binary mode)
    person = Person("Alice", 25)
    with open('person.pkl', 'wb') as f:
        pickle.dump(person, f)

    # Deserialize
    # Warning: only unpickle data you trust; loading a pickle can execute arbitrary code
    with open('person.pkl', 'rb') as f:
        loaded_person = pickle.load(f)

    print(loaded_person)  # Person('Alice', 25)
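The file-based example above has an in-memory counterpart: `pickle.dumps` returns a `bytes` object and `pickle.loads` rebuilds the object from it, which can be handy for caches or for sending objects between Python processes. This is a minimal sketch, and the `settings` dictionary is an invented example.

```python
import pickle

# Hypothetical data to cache
settings = {"theme": "dark", "recent": ["a.txt", "b.txt"], "volume": 0.8}

# Serialize to bytes instead of a file
blob = pickle.dumps(settings, protocol=pickle.HIGHEST_PROTOCOL)
print(type(blob))  # <class 'bytes'>

# Rebuild an equal, but separate, object
restored = pickle.loads(blob)
print(restored == settings)  # True
print(restored is settings)  # False
```

`HIGHEST_PROTOCOL` generally gives the most compact output, at the cost of not being readable by older Python versions.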
CSV Processing
    import csv

    # Write CSV (newline='' lets the csv module control line endings)
    with open('data.csv', 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(['Name', 'Age', 'City'])
        writer.writerow(['Alice', '25', 'Beijing'])
        writer.writerow(['Bob', '30', 'Shanghai'])

    # Read CSV (each row is a list of strings)
    with open('data.csv', 'r', newline='', encoding='utf-8') as f:
        reader = csv.reader(f)
        for row in reader:
            print(row)

    # Process rows as dictionaries keyed by the header row
    with open('data.csv', 'r', newline='', encoding='utf-8') as f:
        reader = csv.DictReader(f)
        for row in reader:
            print(f"{row['Name']} is {row['Age']} years old")
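`csv.DictReader` has a writing counterpart, `csv.DictWriter`, which takes dictionaries instead of lists. The sketch below writes to an in-memory `io.StringIO` buffer rather than a real file so that it is self-contained; the `rows` data is invented for illustration.

```python
import csv
import io

# Hypothetical records to export
rows = [
    {"Name": "Alice", "Age": 25, "City": "Beijing"},
    {"Name": "Bob", "Age": 30, "City": "Shanghai"},
]

buffer = io.StringIO()  # in-memory stand-in for a file opened with newline=''
writer = csv.DictWriter(buffer, fieldnames=["Name", "Age", "City"])
writer.writeheader()    # writes the header row from fieldnames
writer.writerows(rows)  # writes one line per dictionary

csv_text = buffer.getvalue()
print(csv_text)

# Reading back: CSV has no type information, so every value is a string
read_back = list(csv.DictReader(io.StringIO(csv_text)))
print(read_back[0]["Age"], type(read_back[0]["Age"]))  # 25 <class 'str'>
```

Note that the integer `25` comes back as the string `'25'`, so numeric columns need explicit conversion after reading.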
XML Processing
    import xml.etree.ElementTree as ET

    # Parse XML
    xml_data = '''
    <root>
        <user id="1">
            <name>Alice</name>
            <age>25</age>
        </user>
        <user id="2">
            <name>Bob</name>
            <age>30</age>
        </user>
    </root>
    '''
    root = ET.fromstring(xml_data)

    # Iterate over the <user> elements
    for user in root.findall('user'):
        user_id = user.get('id')        # attribute value
        name = user.find('name').text   # text of a child element
        age = user.find('age').text
        print(f"User {user_id}: {name}, {age}")

    # Create a new element and append it to the tree
    new_user = ET.Element('user', {'id': '3'})
    ET.SubElement(new_user, 'name').text = 'Charlie'
    ET.SubElement(new_user, 'age').text = '35'
    root.append(new_user)

    # Save to a file
    ET.ElementTree(root).write('output.xml', encoding='utf-8')
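To inspect a tree without writing a file, `ET.tostring` can serialize it to a string. The following is a small sketch using a freshly built tree; the element names mirror the example above but the data is otherwise arbitrary.

```python
import xml.etree.ElementTree as ET

# Build a tiny tree from scratch
root = ET.Element("root")
user = ET.SubElement(root, "user", {"id": "1"})
ET.SubElement(user, "name").text = "Alice"

# encoding="unicode" returns a str instead of bytes
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)  # <root><user id="1"><name>Alice</name></user></root>

# Round trip: parse the string again and look up a nested element by path
name = ET.fromstring(xml_text).find("user/name").text
print(name)  # Alice
```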
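Serialization Format Comparison
| Format | Readability | Cross-language | Supported types | Typical use |
| --- | --- | --- | --- | --- |
| JSON | Good | Yes | Basic types | Data exchange, configuration |
| YAML | Very good | Yes | Basic types | Configuration files |
| Pickle | No (binary) | No | Most Python objects | Temporary storage, caching |
| CSV | Good | Yes | Tabular data | Data tables |
| XML | Good | Yes | Hierarchical data | Complex configuration, documents |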
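The difference in supported types between JSON and pickle can be seen with a short round-trip experiment. This is an illustrative sketch with an invented `record`; it shows type fidelity only and says nothing about speed or size.

```python
import json
import pickle

# Hypothetical record containing a tuple, which JSON has no equivalent for
record = {"name": "Alice", "scores": (90, 85)}

via_json = json.loads(json.dumps(record))
via_pickle = pickle.loads(pickle.dumps(record))

print(via_json["scores"])    # [90, 85]  (the tuple came back as a list)
print(via_pickle["scores"])  # (90, 85)

# Sets are not JSON-serializable at all without a custom encoder
json_failed = False
try:
    json.dumps({"tags": {"python", "yaml"}})
except TypeError as exc:
    json_failed = True
    print(f"JSON cannot encode a set: {exc}")

# pickle preserves the set as-is
tags = pickle.loads(pickle.dumps({"tags": {"python", "yaml"}}))
print(tags["tags"] == {"python", "yaml"})  # True
```

The trade-off is that the pickled bytes can only be read by Python, while the JSON text is usable from nearly any language.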
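Chapter Exercises
1. Implement a JSON configuration file manager
2. Export a list of objects to CSV
3. Implement a YAML configuration file reader
4. Compare the performance of different serialization formats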
Next chapter: Chapter 21: The Module System in Detail