| 智能体二次开发:langchain:入门 [2026/05/20 19:01] – removed - external edit (unknown date) 127.0.0.1 | 智能体二次开发:langchain:入门 [2026/05/20 19:01] (current version) – ↷ page 智能体二次开发:入门 moved to 智能体二次开发:langchain:入门 张叶安 | ||
|---|---|---|---|
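| + | ====== Chapter 1: Getting Started ====== | ||
| + | ||
| + | ===== 1.1 What Is LangChain ===== | ||
| + | ||
| + | ==== 1.1.1 Background and Origins ==== | ||
| + | ||
| + | After OpenAI released ChatGPT in late 2022, large language models (LLMs) saw explosive growth. Developers soon found that, although these models are strong at understanding and generating language, integrating them into real applications raises several challenges: | ||
| + | ||
| + | - **Context management**: LLM calls are stateless, so the model does not remember earlier turns of a conversation on its own | ||
| + | - **Data integration**: the model needs a way to reach external data sources (databases, documents, APIs, and so on) | ||
| + | - **Prompt engineering**: writing, managing, and tuning prompts takes considerable experience | ||
| + | - **Chained calls**: complex tasks require several LLM calls working together | ||
| + | - **Production deployment**: moving from prototype to production means handling monitoring, debugging, and performance | ||
| + | ||
| + | LangChain, created by Harrison Chase in October 2022 (shortly before ChatGPT's release), set out to address these problems. It is an open-source framework for building LLM applications, written first in Python and later also available for JavaScript/TypeScript. | ||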
| + | |||
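| + | ==== 1.1.2 Core Ideas of LangChain ==== | ||
| + | ||
| + | LangChain's core idea can be summarized as: **treat the language model as the central unit of computation and build complex applications from composable components**. | ||
| + | ||
| + | This idea rests on four points: | ||
| + | ||
| + | === 1. Component-Based Design === | ||
| + | LangChain splits the functionality an LLM application needs into independent, reusable components: | ||
| + | * **Models**: interfaces to the various LLMs | ||
| + | * **Prompts**: prompt management | ||
| + | * **Chains**: linking several components in sequence | ||
| + | * **Indexes**: document indexing and retrieval | ||
| + | * **Memory**: state management | ||
| + | * **Agents**: tool-using agents | ||
| + | ||
| + | === 2. Chained Thinking === | ||
| + | Complex tasks usually take several steps. LangChain provides the "chain" abstraction, which composes components into a single executable workflow. | ||
| + | ||
| + | === 3. Data Awareness === | ||
| + | An LLM only knows what was in its training data. Through Indexes and Retrievers, LangChain lets the model reach external data sources, which enables retrieval-augmented generation (RAG). | ||
| + | ||
| + | === 4. Agent Capabilities === | ||
| + | With Agents, the LLM decides for itself which tools to use to complete a task, rather than following a fixed sequence of steps. | ||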
| + | |||
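| + | ==== 1.1.3 Use Cases ==== | ||
| + | ||
| + | LangChain fits a range of use cases: | ||
| + | ||
| + | | Use case | Description | Typical examples | | ||
| + | | Question answering | Document-grounded Q&A | Internal knowledge bases, customer-service bots | | ||
| + | | Chatbots | Conversational systems with memory | Personal assistants, customer service | | ||
| + | | Code assistants | Code generation and analysis | GitHub Copilot-style products | | ||
| + | | Data analysis | Natural-language data queries | " | ||
| + | | Content generation | Automated content creation | Marketing copy, news reports | | ||
| + | | Agents | Autonomous task execution | Automated booking, information gathering | | ||
| + | | Workflow automation | Multi-step business processes | Approval flows, data synchronization | | ||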
| + | |||
| + | ==== 1.1.4 Architecture Overview ==== | ||
| + | ||
| + | <code> | ||
| + | Application layer (Applications): chatbots, ... | ||
| + | │ | ||
| + | Chain layer (Chains): LLM Chain, ... | ||
| + | │ | ||
| + | Component layer: Models, ... | ||
| + | │ | ||
| + | Model layer (LLMs): OpenAI, ... | ||
| + | </code> | ||
| + | ||
| + | ||
| + | The LangChain architecture has four layers, listed here from the bottom up: | ||
| + | ||
| + | - **Model layer**: interfaces to the various LLM providers | ||
| + | - **Component layer**: the basic building blocks of an application | ||
| + | - **Chain layer**: components combined into executable workflows | ||
| + | - **Application layer**: complete applications for end users | ||
| + | ||
| + | This layering keeps LangChain flexible: developers can build up from low-level components or use the high-level abstractions to develop applications quickly. | ||
| + | |||
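| + | ==== 1.1.5 LangChain Compared with Other Frameworks ==== | ||
| + | ||
| + | | Feature | LangChain | LlamaIndex | Semantic Kernel | Haystack | | ||
| + | | Main languages | Python/JS | Python | Python/C# | Python | | ||
| + | | Core strength | Generality, ecosystem | Retrieval augmentation | Microsoft ecosystem | Search/NLP | | ||
| + | | Ease of use | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ | | ||
| + | | Flexibility | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ | | ||
| + | | Enterprise readiness | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | | ||
| + | | Community activity | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ | | ||
| + | ||
| + | LangChain's main strengths are: | ||
| + | * **Broad integrations**: support for nearly all mainstream LLMs and tools | ||
| + | * **An active community**: many tutorials, plugins, and open-source projects | ||
| + | * **Fast iteration**: it tracks new developments in AI closely | ||
| + | * **Enterprise features**: LangSmith and LangServe cover production needs | ||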
| + | |||
| + | ---- | ||
| + | |||
| + | ===== 1.2 Core Concepts in Detail ===== | ||
| + | ||
| + | ==== 1.2.1 Language Models ==== | ||
| + | ||
| + | Language models are at the center of LangChain. LangChain works with three kinds of model: | ||
| + | ||
| + | === 1. LLMs (Text-Completion Models) === | ||
| + | ||
| + | These models take text in and return text. A typical example is one of OpenAI's earlier models, such as text-davinci-003. | ||
| + | ||
| + | <code python> | ||
| + | from langchain_openai import OpenAI | ||
| + | ||
| + | llm = OpenAI() | ||
| + | response = llm.predict(" | ||
| + | print(response) | ||
| + | </code> | ||
| + | |||
| + | |||
| + | === 2. Chat Models === | ||
| + | ||
| + | These models are designed for conversation: they take a list of messages and return a message. Examples include modern models such as GPT-4 and Claude. | ||
| + | ||
| + | <code python> | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | from langchain.schema import HumanMessage, SystemMessage | ||
| + | ||
| + | chat = ChatOpenAI() | ||
| + | messages = [ | ||
| + |     SystemMessage(content=" | ||
| + |     HumanMessage(content=" | ||
| + | ] | ||
| + | response = chat.predict_messages(messages) | ||
| + | print(response.content) | ||
| + | </code> | ||
| + | |||
| + | |||
| + | === 3. Text Embedding Models (Embeddings) === | ||
| + | ||
| + | These convert text into vector representations and are the basis of semantic search and RAG. | ||
| + | ||
| + | <code python> | ||
| + | from langchain_openai import OpenAIEmbeddings | ||
| + | ||
| + | embeddings = OpenAIEmbeddings() | ||
| + | text = " | ||
| + | vector = embeddings.embed_query(text) | ||
| + | print(f" | ||
| + | </code> | ||
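To make "the basis of semantic search" concrete, here is a minimal standard-library sketch of how two embedding vectors are usually compared, using cosine similarity. The three-dimensional vectors are made-up illustration values, not output from any real embedding model, and the variable names are hypothetical.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional "embeddings"; real models return far more dimensions.
query = [0.9, 0.1, 0.0]
doc_same_topic = [0.8, 0.2, 0.1]
doc_other_topic = [0.0, 0.1, 0.9]

sim_same = cosine_similarity(query, doc_same_topic)
sim_other = cosine_similarity(query, doc_other_topic)
print(sim_same > sim_other)
```

Texts with similar meaning are expected to produce vectors that point in similar directions, so a higher score is treated as "more relevant".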
| + | |||
| + | |||
| + | ==== 1.2.2 Prompts ==== | ||
| + | ||
| + | Prompts are the bridge between your code and the language model. LangChain provides several tools for managing them: | ||
| + | ||
| + | === 1. PromptTemplate === | ||
| + | ||
| + | Lets you create reusable prompt templates with variables. | ||
| + | ||
| + | <code python> | ||
| + | from langchain.prompts import PromptTemplate | ||
| + | ||
| + | template = """ | ||
| + | You are a {role}. Answer the following question in a {style} style: | ||
| + | ||
| + | Question: {question} | ||
| + | """ | ||
| + | ||
| + | prompt = PromptTemplate.from_template(template) | ||
| + | formatted_prompt = prompt.format( | ||
| + |     role=" | ||
| + |     style=" | ||
| + |     question=" | ||
| + | ) | ||
| + | print(formatted_prompt) | ||
| + | </code> | ||
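As a rough mental model of what a prompt template does, the sketch below implements a toy version with the standard library only. `MiniPromptTemplate` is a hypothetical class written for illustration; it is not LangChain's implementation and covers only plain `{name}` placeholders.

```python
import string

class MiniPromptTemplate:
    """Toy stand-in for a prompt template (illustration only)."""

    def __init__(self, template: str):
        self.template = template
        # Collect the {placeholder} names that appear in the template text.
        self.input_variables = sorted(
            {name for _, name, _, _ in string.Formatter().parse(template) if name}
        )

    def format(self, **values: str) -> str:
        # Fail early with a clear message when a variable is missing.
        missing = [v for v in self.input_variables if v not in values]
        if missing:
            raise KeyError(f"missing template variables: {missing}")
        return self.template.format(**values)

demo = MiniPromptTemplate("You are a {role}. Answer in a {style} style: {question}")
filled = demo.format(role="teacher", style="concise", question="What is a vector?")
print(demo.input_variables)
print(filled)
```

Separating the template from the values is what makes the prompt reusable: the same text can be filled with different roles, styles, and questions.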
| + | |||
| + | |||
| + | === 2. ChatPromptTemplate === | ||
| + | ||
| + | A template designed for chat models. | ||
| + | ||
| + | <code python> | ||
| + | from langchain.prompts import ChatPromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate | ||
| + | ||
| + | system_template = " | ||
| + | system_prompt = SystemMessagePromptTemplate.from_template(system_template) | ||
| + | ||
| + | human_template = " | ||
| + | human_prompt = HumanMessagePromptTemplate.from_template(human_template) | ||
| + | ||
| + | chat_prompt = ChatPromptTemplate.from_messages([system_prompt, human_prompt]) | ||
| + | ||
| + | messages = chat_prompt.format_messages(field=" | ||
| + | for msg in messages: | ||
| + |     print(f" | ||
| + | </code> | ||
| + | |||
| + | |||
| + | ==== 1.2.3 Chains ==== | ||
| + | ||
| + | The chain is LangChain's central abstraction: several components combined into one executable workflow. | ||
| + | ||
| + | === 1. Simple Chain (LLMChain) === | ||
| + | ||
| + | The most basic chain connects a prompt template to a language model. | ||
| + | ||
| + | <code python> | ||
| + | from langchain.chains import LLMChain | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | from langchain.prompts import ChatPromptTemplate | ||
| + | ||
| + | llm = ChatOpenAI() | ||
| + | prompt = ChatPromptTemplate.from_template(" | ||
| + | ||
| + | chain = LLMChain(llm=llm, prompt=prompt) | ||
| + | result = chain.predict(concept=" | ||
| + | print(result) | ||
| + | </code> | ||
| + | |||
| + | |||
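| + | === 2. Composite Chains === | ||
| + | ||
| + | Several chains can be combined to implement more complex logic. The example reuses `llm`, `LLMChain`, and `ChatPromptTemplate` from the previous block. | ||
| + | ||
| + | <code python> | ||
| + | from langchain.chains import SimpleSequentialChain | ||
| + | ||
| + | # First chain: generate a story title | ||
| + | title_prompt = ChatPromptTemplate.from_template(" | ||
| + | title_chain = LLMChain(llm=llm, prompt=title_prompt) | ||
| + | ||
| + | # Second chain: write a story based on the title | ||
| + | story_prompt = ChatPromptTemplate.from_template(" | ||
| + | story_chain = LLMChain(llm=llm, prompt=story_prompt) | ||
| + | ||
| + | # Combine the chains: the first chain's output becomes the second chain's input | ||
| + | overall_chain = SimpleSequentialChain(chains=[title_chain, story_chain]) | ||
| + | result = overall_chain.run(" | ||
| + | print(result) | ||
| + | </code> | ||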
| + | |||
| + | |||
| + | ==== 1.2.4 Indexes and Retrievers ==== | ||
| + | ||
| + | These components are the foundation of RAG (retrieval-augmented generation). | ||
| + | ||
| + | === 1. Document Loader === | ||
| + | ||
| + | Loads documents from a variety of sources. | ||
| + | ||
| + | <code python> | ||
| + | from langchain.document_loaders import TextLoader, PyPDFLoader, WebBaseLoader | ||
| + | ||
| + | # Load a text file | ||
| + | text_loader = TextLoader(" | ||
| + | text_docs = text_loader.load() | ||
| + | ||
| + | # Load a PDF | ||
| + | pdf_loader = PyPDFLoader(" | ||
| + | pdf_docs = pdf_loader.load() | ||
| + | ||
| + | # Load a web page | ||
| + | web_loader = WebBaseLoader(" | ||
| + | web_docs = web_loader.load() | ||
| + | </code> | ||
| + | |||
| + | |||
| + | === 2. Text Splitter === | ||
| + | ||
| + | Splits long documents into chunks small enough to process. | ||
| + | ||
| + | <code python> | ||
| + | from langchain.text_splitter import RecursiveCharacterTextSplitter | ||
| + | ||
| + | text_splitter = RecursiveCharacterTextSplitter( | ||
| + |     chunk_size=1000, | ||
| + |     chunk_overlap=200 | ||
| + | ) | ||
| + | ||
| + | chunks = text_splitter.split_documents(pdf_docs) | ||
| + | print(f" | ||
| + | </code> | ||
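The two parameters above, `chunk_size` and `chunk_overlap`, can be illustrated with a simplified fixed-window splitter. This is only a sketch: `split_with_overlap` is a hypothetical helper, and it does not reproduce the separator-aware strategy that `RecursiveCharacterTextSplitter` applies.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

pieces = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(pieces)
```

The overlap repeats the tail of one chunk at the head of the next, so a sentence that straddles a boundary still appears intact in at least one chunk.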
| + | |||
| + | |||
| + | === 3. Vector Store === | ||
| + | ||
| + | Stores the vector representations of documents and supports semantic retrieval. | ||
| + | ||
| + | <code python> | ||
| + | from langchain.vectorstores import Chroma | ||
| + | from langchain_openai import OpenAIEmbeddings | ||
| + | ||
| + | embeddings = OpenAIEmbeddings() | ||
| + | vectorstore = Chroma.from_documents( | ||
| + |     documents=chunks, | ||
| + |     embedding=embeddings, | ||
| + |     persist_directory=" | ||
| + | ) | ||
| + | ||
| + | # Retrieve similar documents | ||
| + | results = vectorstore.similarity_search(" | ||
| + | for doc in results: | ||
| + |     print(doc.page_content[: | ||
| + | </code> | ||
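The store-then-search flow can be sketched without any external service. In the toy example below, `embed` is a hypothetical stand-in that uses word counts instead of a real embedding model, and `ToyVectorStore` is an illustration class, not Chroma's API; only the method name `similarity_search` mirrors the call used above.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class ToyVectorStore:
    def __init__(self, texts: list[str]):
        # Embed every document once, at indexing time.
        self.entries = [(t, embed(t)) for t in texts]

    def similarity_search(self, query: str, k: int = 2) -> list[str]:
        # Embed the query, rank documents by similarity, return the top k.
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

store = ToyVectorStore([
    "chains combine prompts and models",
    "memory stores the conversation history",
    "agents choose tools to call",
])
top = store.similarity_search("how is conversation history stored", k=1)
print(top)
```

A production vector store replaces the linear scan with an approximate nearest-neighbor index, but the contract is the same: text in, most similar stored documents out.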
| + | |||
| + | |||
| + | ==== 1.2.5 Memory ==== | ||
| + | ||
| + | Memory lets a chain remember earlier interactions. The example reuses the `llm` object defined in section 1.2.3. | ||
| + | ||
| + | <code python> | ||
| + | from langchain.memory import ConversationBufferMemory | ||
| + | from langchain.chains import ConversationChain | ||
| + | ||
| + | memory = ConversationBufferMemory() | ||
| + | conversation = ConversationChain( | ||
| + |     llm=llm, | ||
| + |     memory=memory, | ||
| + |     verbose=True | ||
| + | ) | ||
| + | ||
| + | # First turn | ||
| + | response1 = conversation.predict(input=" | ||
| + | print(response1) | ||
| + | ||
| + | # Second turn: the model still has the name from the first turn | ||
| + | response2 = conversation.predict(input=" | ||
| + | print(response2) | ||
| + | ||
| + | # Inspect what the memory holds | ||
| + | print(memory.load_memory_variables({})) | ||
| + | </code> | ||
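Because the model itself is stateless, "memory" in practice means replaying earlier turns inside the next prompt. The sketch below shows that idea with a toy buffer; `BufferMemory` and `build_prompt` are hypothetical names for illustration and do not reflect the internals of `ConversationBufferMemory`.

```python
class BufferMemory:
    """Toy conversation buffer that keeps every turn (illustration only)."""

    def __init__(self):
        self.turns: list[tuple[str, str]] = []

    def save_context(self, user_text: str, ai_text: str) -> None:
        self.turns.append((user_text, ai_text))

    def as_prompt_prefix(self) -> str:
        # Render the stored turns as plain text the model can read.
        return "".join(f"Human: {u}\nAI: {a}\n" for u, a in self.turns)

def build_prompt(memory: BufferMemory, user_text: str) -> str:
    # The new question is appended after the replayed history.
    return f"{memory.as_prompt_prefix()}Human: {user_text}\nAI:"

mem = BufferMemory()
mem.save_context("My name is Ada.", "Nice to meet you, Ada.")
prompt_text = build_prompt(mem, "What is my name?")
print(prompt_text)
```

Since the whole history is resent on every call, the prompt grows with each turn, which is why long conversations eventually need trimming or summarizing.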
| + | |||
| + | |||
| + | ==== 1.2.6 Agents ==== | ||
| + | ||
| + | Agents let the LLM decide which tools to use to complete a task. The example reuses the `llm` object defined in section 1.2.3. | ||
| + | ||
| + | <code python> | ||
| + | from langchain.agents import initialize_agent, Tool | ||
| + | from langchain.tools import DuckDuckGoSearchRun | ||
| + | ||
| + | # Define the tools | ||
| + | search = DuckDuckGoSearchRun() | ||
| + | tools = [ | ||
| + |     Tool( | ||
| + |         name=" | ||
| + |         func=search.run, | ||
| + |         description=" | ||
| + |     ) | ||
| + | ] | ||
| + | ||
| + | # Initialize the agent | ||
| + | agent = initialize_agent( | ||
| + |     tools=tools, | ||
| + |     llm=llm, | ||
| + |     agent=" | ||
| + |     verbose=True | ||
| + | ) | ||
| + | ||
| + | # Run the agent | ||
| + | response = agent.run(" | ||
| + | print(response) | ||
| + | </code> | ||
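The decide, act, observe loop behind an agent can be sketched with a scripted stand-in for the model. Everything here is hypothetical: `scripted_model` replaces the LLM's decision step with fixed rules, and `calculator` is a toy tool, so the example shows only the control flow, not LangChain's agent implementation.

```python
from typing import Callable

def calculator(expression: str) -> str:
    """Hypothetical tool: adds two integers written as 'a+b'."""
    left, right = expression.split("+")
    return str(int(left) + int(right))

TOOLS: dict[str, Callable[[str], str]] = {"calculator": calculator}

def scripted_model(question: str, observations: list[str]) -> dict:
    """Stand-in for the LLM's decision step (a real agent asks the model here)."""
    if not observations:
        return {"action": "calculator", "input": "2+3"}
    return {"action": "finish", "input": f"The answer is {observations[-1]}"}

def run_agent(question: str, max_steps: int = 5) -> str:
    observations: list[str] = []
    for _ in range(max_steps):
        decision = scripted_model(question, observations)
        if decision["action"] == "finish":
            return decision["input"]
        tool = TOOLS[decision["action"]]            # look up the chosen tool
        observations.append(tool(decision["input"]))  # feed the result back
    return "Stopped: step limit reached"

answer = run_agent("What is 2 + 3?")
print(answer)
```

The step limit matters in real agents too: without it, a model that never chooses to finish would loop indefinitely.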
| + | |||
| + | |||
| + | ---- | ||
| + | |||
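| + | ===== 1.3 Installation and Environment Setup ===== | ||
| + | ||
| + | ==== 1.3.1 Requirements ==== | ||
| + | ||
| + | * **Python**: 3.8.1 or later (3.10+ recommended) | ||
| + | * **Operating system**: | ||
| + | * **Memory**: 8 GB or more recommended; running local models needs more | ||
| + | * **Network**: access to APIs such as OpenAI's, either directly or through a proxy | ||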
| + | |||
| + | ==== 1.3.2 Installing LangChain ==== | ||
| + | ||
| + | === 1. Basic Installation === | ||
| + | ||
| + | <code bash> | ||
| + | # Create a virtual environment (recommended) | ||
| + | python -m venv langchain_env | ||
| + | source langchain_env/ | ||
| + | # or langchain_env\Scripts\activate | ||
| + | ||
| + | # Install the base package | ||
| + | pip install langchain | ||
| + | </code> | ||
| + | |||
| + | |||
| + | === 2. Installing Specific Integrations === | ||
| + | ||
| + | LangChain is modular: the core package contains only the basics, and each integration is installed separately: | ||
| + | ||
| + | <code bash> | ||
| + | # OpenAI integration | ||
| + | pip install langchain-openai | ||
| + | ||
| + | # Anthropic (Claude) integration | ||
| + | pip install langchain-anthropic | ||
| + | ||
| + | # HuggingFace integration (local models) | ||
| + | pip install langchain-huggingface | ||
| + | ||
| + | # Commonly used document loaders | ||
| + | pip install langchain-community | ||
| + | ||
| + | # Vector databases | ||
| + | pip install chromadb | ||
| + | pip install faiss-cpu | ||
| + | pip install faiss-gpu | ||
| + | pip install pinecone-client | ||
| + | </code> | ||
| + | |||
| + | |||
| + | === 3. Full Development Environment === | ||
| + | ||
| + | <code bash> | ||
| + | # Install the commonly used dependencies in one go | ||
| + | pip install langchain langchain-openai langchain-community | ||
| + | pip install chromadb faiss-cpu | ||
| + | pip install pypdf unstructured | ||
| + | pip install beautifulsoup4 requests | ||
| + | pip install python-dotenv | ||
| + | </code> | ||
| + | |||
| + | |||
| + | ==== 1.3.3 Configuring API Keys ==== | ||
| + | ||
| + | === 1. Using Environment Variables (Recommended) === | ||
| + | ||
| + | Create a `.env` file: | ||
| + | ||
| + | <code bash> | ||
| + | # .env | ||
| + | OPENAI_API_KEY=your_openai_api_key_here | ||
| + | ANTHROPIC_API_KEY=your_anthropic_api_key_here | ||
| + | HUGGINGFACE_API_TOKEN=your_hf_token_here | ||
| + | </code> | ||
| + | ||
| + | ||
| + | Load the environment variables: | ||
| + | ||
| + | <code python> | ||
| + | from dotenv import load_dotenv | ||
| + | import os | ||
| + | ||
| + | load_dotenv() | ||
| + | ||
| + | openai_api_key = os.getenv(" | ||
| + | print(f" | ||
| + | </code> | ||
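When a key is read from the environment, it helps to fail with a clear message if it is missing and to avoid printing the full secret. The following is one possible sketch using only the standard library; `require_env`, `mask`, and the variable name `DEMO_API_KEY` are hypothetical and exist only for this illustration.

```python
import os

def require_env(name: str) -> str:
    """Return the variable's value or raise a clear error if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to .env or export it")
    return value

def mask(secret: str, visible: int = 4) -> str:
    """Show only the first few characters so logs do not leak the key."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return secret[:visible] + "*" * (len(secret) - visible)

os.environ["DEMO_API_KEY"] = "sk-demo-1234567890"  # made-up value for the demo only
masked = mask(require_env("DEMO_API_KEY"))
print(masked)
```

In a real project the assignment line would be absent and the value would come from the `.env` file loaded by `load_dotenv()`.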
| + | |||
| + | |||
| + | === 2. Setting the Key Directly in Code === | ||
| + | ||
| + | <code python> | ||
| + | import os | ||
| + | ||
| + | os.environ[" | ||
| + | # or | ||
| + | # os.environ[" | ||
| + | </code> | ||
| + | ||
| + | ||
| + | === 3. Passing the Key to the Class Constructor === | ||
| + | ||
| + | <code python> | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | ||
| + | llm = ChatOpenAI(api_key=" | ||
| + | </code> | ||
| + | |||
| + | |||
| + | ==== 1.3.4 Verifying the Installation ==== | ||
| + | ||
| + | Create a test script to verify the environment: | ||
| + | ||
| + | <code python> | ||
| + | # test_installation.py | ||
| + | import sys | ||
| + | ||
| + | def check_installation(): | ||
| + |     """ | ||
| + | ||
| + |     print(" | ||
| + |     print(" | ||
| + |     print(" | ||
| + | ||
| + |     # Check the Python version | ||
| + |     print(f" | ||
| + | ||
| + |     # Check LangChain | ||
| + |     try: | ||
| + |         import langchain | ||
| + |         print(f" | ||
| + |     except ImportError: | ||
| + |         print(" | ||
| + | ||
| + |     # Check the integrations | ||
| + |     integrations = [ | ||
| + |         (" | ||
| + |         (" | ||
| + |         (" | ||
| + |     ] | ||
| + | ||
| + |     print(" | ||
| + |     for module, name in integrations: | ||
| + |         try: | ||
| + |             __import__(module) | ||
| + |             print(f" | ||
| + |         except ImportError: | ||
| + |             print(f" | ||
| + | ||
| + |     # Check the vector databases | ||
| + |     vectorstores = [ | ||
| + |         (" | ||
| + |         (" | ||
| + |     ] | ||
| + | ||
| + |     print(" | ||
| + |     for module, name in vectorstores: | ||
| + |         try: | ||
| + |             __import__(module) | ||
| + |             print(f" | ||
| + |         except ImportError: | ||
| + |             print(f" | ||
| + | ||
| + |     # Check the API keys | ||
| + |     print(" | ||
| + |     import os | ||
| + |     api_keys = [" | ||
| + |     for key in api_keys: | ||
| + |         if os.getenv(key): | ||
| + |             print(f" | ||
| + |         else: | ||
| + |             print(f" | ||
| + | ||
| + |     print(" | ||
| + | ||
| + | if __name__ == "__main__": | ||
| + |     check_installation() | ||
| + | </code> | ||
| + | ||
| + | ||
| + | Run the check: | ||
| + | <code bash> | ||
| + | python test_installation.py | ||
| + | </code> | ||
| + | |||
| + | |||
| + | ==== 1.3.5 Configuring Access from Mainland China (If Needed) ==== | ||
| + | ||
| + | If the OpenAI API is not directly reachable from your region, you can route requests through a proxy: | ||
| + | ||
| + | === 1. Setting Proxy Environment Variables === | ||
| + | ||
| + | <code python> | ||
| + | import os | ||
| + | ||
| + | # Set the proxy | ||
| + | os.environ[" | ||
| + | os.environ[" | ||
| + | ||
| + | # Or point OpenAI's base_url at a mirror endpoint | ||
| + | os.environ[" | ||
| + | </code> | ||
| + | ||
| + | ||
| + | === 2. Configuring in Code === | ||
| + | ||
| + | <code python> | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | ||
| + | llm = ChatOpenAI( | ||
| + |     base_url=" | ||
| + |     api_key=" | ||
| + | ) | ||
| + | </code> | ||
| + | |||
| + | |||
| + | ---- | ||
| + | |||
| + | ===== 1.4 A First Hello World Program ===== | ||
| + | ||
| + | ==== 1.4.1 The Simplest LangChain Program ==== | ||
| + | ||
| + | Let's start with the most basic program: | ||
| + | ||
| + | <code python> | ||
| + | # hello_world.py | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | from langchain.prompts import ChatPromptTemplate | ||
| + | ||
| + | # 1. Create the language model instance | ||
| + | llm = ChatOpenAI( | ||
| + |     model=" | ||
| + |     temperature=0.7 | ||
| + | ) | ||
| + | ||
| + | # 2. Create the prompt template | ||
| + | prompt = ChatPromptTemplate.from_template(" | ||
| + | ||
| + | # 3. Compose them into a chain | ||
| + | chain = prompt | llm | ||
| + | ||
| + | # 4. Run it | ||
| + | response = chain.invoke({" | ||
| + | print(response.content) | ||
| + | </code> | ||
| + | |||
| + | |||
| + | ==== 1.4.2 Step-by-Step Walkthrough ==== | ||
| + | ||
| + | Let's go through each part of the program: | ||
| + | ||
| + | === 1. Importing Modules === | ||
| + | ||
| + | <code python> | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | from langchain.prompts import ChatPromptTemplate | ||
| + | </code> | ||
| + | ||
| + | ||
| + | * `ChatOpenAI`: | ||
| + | * `ChatPromptTemplate`: | ||
| + | ||
| + | === 2. Creating the Language Model Instance === | ||
| + | ||
| + | <code python> | ||
| + | llm = ChatOpenAI( | ||
| + |     model=" | ||
| + |     temperature=0.7 | ||
| + | ) | ||
| + | </code> | ||
| + | ||
| + | ||
| + | **Parameters:** | ||
| + | * `model`: the model identifier; options include: | ||
| + |   * `gpt-3.5-turbo`: | ||
| + |   * `gpt-4`: more capable, more expensive | ||
| + |   * `gpt-4-turbo`: | ||
| + | * `temperature`: | ||
| + |   * 0.0: most deterministic; suited to tasks that need precise answers | ||
| + |   * 0.7: a balance between consistency and creativity | ||
| + |   * 1.0+: more creative, but may produce unexpected results | ||
| + | * `max_tokens`: | ||
| + | * `api_key`: the API key (if the environment variable is not set) | ||
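To see why a low temperature gives more deterministic output, it helps to look at the textbook description of temperature sampling: the model's scores (logits) are divided by the temperature before being turned into probabilities. The sketch below follows that textbook formula; the logits are made-up numbers, and how a given provider handles the parameter internally (for example at exactly 0) is an assumption outside this example.

```python
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn scores into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                      # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.2)
warm = softmax_with_temperature(logits, 1.5)
print(round(cold[0], 3), round(warm[0], 3))
```

At the lower temperature almost all probability lands on the top-scoring token, while the higher temperature spreads probability across the alternatives, which is where the extra variety comes from.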
| + | |||
| + | === 3. Creating the Prompt Template === | ||
| + | ||
| + | <code python> | ||
| + | prompt = ChatPromptTemplate.from_template(" | ||
| + | </code> | ||
| + | ||
| + | ||
| + | `{question}` in the template is a variable placeholder that is replaced with an actual value at run time. | ||
| + | ||
| + | === 4. Composing the Chain (LCEL Syntax) === | ||
| + | ||
| + | <code python> | ||
| + | chain = prompt | llm | ||
| + | </code> | ||
| + | ||
| + | ||
| + | This is LangChain Expression Language (LCEL) syntax: the `|` operator passes the output of the component on its left to the component on its right. | ||
| + | ||
| + | `prompt | llm` plays roughly the same role as the older `LLMChain` form: | ||
| + | <code python> | ||
| + | from langchain.chains import LLMChain | ||
| + | ||
| + | chain = LLMChain(prompt=prompt, llm=llm) | ||
| + | </code> | ||
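The `|` composition can be demystified with a few lines of plain Python: overloading `__or__` lets `a | b` build a new object that runs `a` and feeds its output to `b`. `Step` below is a hypothetical toy class, and `fake_model` is a stand-in function; LangChain's actual Runnable interface offers much more (batching, streaming, async).

```python
class Step:
    """Toy runnable that wraps a single function (illustration only)."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))

fill_prompt = Step(lambda d: f"Answer briefly: {d['question']}")
fake_model = Step(lambda prompt: f"[model reply to: {prompt}]")  # stands in for an LLM call
to_upper = Step(str.upper)                                       # stands in for an output parser

pipeline = fill_prompt | fake_model | to_upper
result = pipeline.invoke({"question": "what is LCEL?"})
print(result)
```

Because every stage exposes the same `invoke` method, stages can be swapped or appended without touching the rest of the pipeline.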
| + | |||
| + | |||
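| + | === 5. Running the Chain === | ||
| + | ||
| + | <code python> | ||
| + | response = chain.invoke({" | ||
| + | </code> | ||
| + | ||
| + | ||
| + | * `invoke`: calls the chain synchronously | ||
| + | * The argument is a dictionary whose keys match the variable names in the template | ||
| + | * The return value is the model's output | ||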
| + | |||
| + | ==== 1.4.3 Adding an Output Parser ==== | ||
| + | ||
| + | By default, a chat model returns an `AIMessage` object. Adding an output parser yields a plain string instead: | ||
| + | ||
| + | <code python> | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | from langchain.prompts import ChatPromptTemplate | ||
| + | from langchain.schema.output_parser import StrOutputParser | ||
| + | ||
| + | llm = ChatOpenAI(model=" | ||
| + | prompt = ChatPromptTemplate.from_template(" | ||
| + | ||
| + | # Add a string output parser | ||
| + | output_parser = StrOutputParser() | ||
| + | ||
| + | # Chain: prompt -> model -> parser | ||
| + | chain = prompt | llm | output_parser | ||
| + | ||
| + | # Run the chain and get a string back | ||
| + | result = chain.invoke({" | ||
| + | print(result) | ||
| + | print(type(result)) | ||
| + | </code> | ||
| + | |||
| + | |||
| + | ==== 1.4.4 Adding Memory ==== | ||
| + | ||
| + | Let the program remember the earlier conversation: | ||
| + | ||
| + | <code python> | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | from langchain.memory import ConversationBufferMemory | ||
| + | from langchain.chains import ConversationChain | ||
| + | ||
| + | llm = ChatOpenAI() | ||
| + | ||
| + | # Create the memory component | ||
| + | memory = ConversationBufferMemory() | ||
| + | ||
| + | # Create the conversation chain | ||
| + | conversation = ConversationChain( | ||
| + |     llm=llm, | ||
| + |     memory=memory | ||
| + | ) | ||
| + | ||
| + | # Multi-turn conversation | ||
| + | print(" | ||
| + | print(conversation.predict(input=" | ||
| + | ||
| + | print(" | ||
| + | print(conversation.predict(input=" | ||
| + | ||
| + | print(" | ||
| + | print(conversation.predict(input=" | ||
| + | ||
| + | print(" | ||
| + | print(conversation.predict(input=" | ||
| + | ||
| + | # Inspect the conversation history | ||
| + | print(" | ||
| + | print(memory.load_memory_variables({})) | ||
| + | </code> | ||
| + | |||
| + | |||
| + | ==== 1.4.5 Complete Example: A Q&A Bot ==== | ||
| + | ||
| + | Putting the earlier pieces together gives a slightly more involved example: | ||
| + | ||
| + | <code python> | ||
| + | # smart_qa_bot.py | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | from langchain.prompts import ChatPromptTemplate | ||
| + | from langchain.schema.output_parser import StrOutputParser | ||
| + | from langchain.memory import ConversationBufferMemory | ||
| + | from langchain.schema.runnable import RunnablePassthrough | ||
| + | ||
| + | class SmartQABot: | ||
| + |     """ | ||
| + | ||
| + |     def __init__(self): | ||
| + |         # Initialize the components | ||
| + |         self.llm = ChatOpenAI(model=" | ||
| + |         self.memory = ConversationBufferMemory( | ||
| + |             return_messages=True, | ||
| + |             memory_key=" | ||
| + |         ) | ||
| + |         self.output_parser = StrOutputParser() | ||
| + | ||
| + |         # Build the prompt template | ||
| + |         self.prompt = ChatPromptTemplate.from_messages([ | ||
| + |             (" | ||
| + |             (" | ||
| + |             (" | ||
| + |         ]) | ||
| + | ||
| + |         # Build the chain: inject the stored history, then prompt -> model -> parser | ||
| + |         self.chain = ( | ||
| + |             RunnablePassthrough.assign( | ||
| + |                 history=lambda x: self.memory.load_memory_variables(x)[" | ||
| + |             ) | ||
| + |             | self.prompt | ||
| + |             | self.llm | ||
| + |             | self.output_parser | ||
| + |         ) | ||
| + | ||
| + |     def chat(self, user_input: str) -> str: | ||
| + |         """ | ||
| + |         # Get the reply | ||
| + |         response = self.chain.invoke({" | ||
| + | ||
| + |         # Save the exchange to memory | ||
| + |         self.memory.save_context( | ||
| + |             {" | ||
| + |             {" | ||
| + |         ) | ||
| + | ||
| + |         return response | ||
| + | ||
| + |     def get_history(self): | ||
| + |         """ | ||
| + |         return self.memory.load_memory_variables({}) | ||
| + | ||
| + | # Usage example | ||
| + | def main(): | ||
| + |     bot = SmartQABot() | ||
| + | ||
| + |     print(" | ||
| + |     print(" | ||
| + |     print(" | ||
| + | ||
| + |     while True: | ||
| + |         user_input = input(" | ||
| + | ||
| + |         if user_input.lower() == ' | ||
| + |             print(" | ||
| + |             break | ||
| + | ||
| + |         if not user_input: | ||
| + |             continue | ||
| + | ||
| + |         try: | ||
| + |             response = bot.chat(user_input) | ||
| + |             print(f" | ||
| + |         except Exception as e: | ||
| + |             print(f" | ||
| + | ||
| + | if __name__ == "__main__": | ||
| + |     main() | ||
| + | </code> | ||
| + | |||
| + | |||
| + | ==== 1.4.6 Exercises and Challenges ==== | ||
| + | ||
| + | === Exercise 1: Temperature Experiment === | ||
| + | ||
| + | Write a program that answers the same question at several temperature values, and compare the results. | ||
| + | ||
| + | <code python> | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | from langchain.prompts import ChatPromptTemplate | ||
| + | from langchain.schema.output_parser import StrOutputParser | ||
| + | ||
| + | def temperature_experiment(): | ||
| + |     prompt = ChatPromptTemplate.from_template(" | ||
| + |     temperatures = [0.0, 0.5, 1.0, 1.5] | ||
| + | ||
| + |     for temp in temperatures: | ||
| + |         llm = ChatOpenAI(temperature=temp) | ||
| + |         chain = prompt | llm | StrOutputParser() | ||
| + |         result = chain.invoke({" | ||
| + |         print(f" | ||
| + |         print(result) | ||
| + | ||
| + | temperature_experiment() | ||
| + | </code> | ||
| + | |||
| + | |||
| + | === Exercise 2: Multilingual Greeting === | ||
| + | ||
| + | Write a program that greets the user in the language they choose. | ||
| + | ||
| + | <code python> | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | from langchain.prompts import ChatPromptTemplate | ||
| + | ||
| + | def multilingual_greeting(): | ||
| + |     template = " | ||
| + |     prompt = ChatPromptTemplate.from_template(template) | ||
| + |     llm = ChatOpenAI() | ||
| + |     chain = prompt | llm | ||
| + | ||
| + |     languages = [" | ||
| + |     message = " | ||
| + | ||
| + |     for lang in languages: | ||
| + |         result = chain.invoke({" | ||
| + |         print(f" | ||
| + | ||
| + | multilingual_greeting() | ||
| + | </code> | ||
| + | |||
| + | |||
| + | === Exercise 3: Limiting Conversation Context Length === | ||
| + | ||
| + | Modify SmartQABot so that, once the conversation history exceeds a given length, the oldest turns are dropped automatically. | ||
| + | ||
| + | <code python> | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | from langchain.memory import ConversationBufferMemory | ||
| + | from langchain.schema.messages import HumanMessage, AIMessage | ||
| + | ||
| + | class LimitedMemoryBot: | ||
| + |     """ | ||
| + | ||
| + |     def __init__(self, max_turns: int): | ||
| + |         self.llm = ChatOpenAI() | ||
| + |         self.max_turns = max_turns | ||
| + |         self.memory = ConversationBufferMemory(return_messages=True) | ||
| + | ||
| + |     def chat(self, user_input: str) -> str: | ||
| + |         # Get the current history ("history" is the default memory key) | ||
| + |         history = self.memory.load_memory_variables({})["history"] | ||
| + | ||
| + |         # Over the limit: drop the oldest turn and rebuild the memory | ||
| + |         if len(history) >= self.max_turns * 2:  # x2: each turn has a user message and an AI message | ||
| + |             history = history[2:] | ||
| + |             self.memory.clear() | ||
| + |             for msg in history: | ||
| + |                 if isinstance(msg, HumanMessage): | ||
| + |                     self.memory.chat_memory.add_user_message(msg.content) | ||
| + |                 else: | ||
| + |                     self.memory.chat_memory.add_ai_message(msg.content) | ||
| + | ||
| + |         # Normal conversation flow... | ||
| + |         response = self.llm.predict( | ||
| + |             f" | ||
| + |         ) | ||
| + | ||
| + |         self.memory.save_context( | ||
| + |             {" | ||
| + |             {" | ||
| + |         ) | ||
| + | ||
| + |         return response | ||
| + | </code> | ||
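The trimming rule in this exercise (keep only the most recent turns) can also be expressed with a bounded queue. The sketch below is a standalone alternative built on `collections.deque`; `SlidingWindowHistory` is a hypothetical class and does not plug into LangChain's memory classes as written.

```python
from collections import deque

class SlidingWindowHistory:
    """Keeps only the most recent max_turns turns (illustration only)."""

    def __init__(self, max_turns: int):
        # Each turn contributes two messages (user + AI), hence the factor of 2.
        self.messages = deque(maxlen=max_turns * 2)

    def add_turn(self, user_text: str, ai_text: str) -> None:
        # Appending to a full deque silently discards the oldest entries.
        self.messages.append(("human", user_text))
        self.messages.append(("ai", ai_text))

    def as_list(self) -> list[tuple[str, str]]:
        return list(self.messages)

window = SlidingWindowHistory(max_turns=2)
for i in range(1, 4):
    window.add_turn(f"question {i}", f"answer {i}")
print(window.as_list())
```

Because messages are always added in pairs and the limit is an even number, the window never splits a turn in half.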
| + | |||
| + | |||
| + | ---- | ||
| + | |||
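| + | ===== 1.5 Chapter Summary ===== | ||
| + | ||
| + | ==== Key Points ==== | ||
| + | ||
| + | - **What LangChain is** | ||
| + | * A Python/JavaScript framework for building LLM applications | ||
| + | * Offers component-based design, chaining, data awareness, and agent capabilities | ||
| + | ||
| + | - **Core concepts** | ||
| + | * **Models**: LLMs, Chat Models, Embeddings | ||
| + | * **Prompts**: | ||
| + | * **Chains**: composing components into workflows | ||
| + | * **Indexes**: | ||
| + | * **Memory**: storing and managing conversation history | ||
| + | * **Agents**: agents that decide which tools to use | ||
| + | ||
| + | - **Installation and configuration** | ||
| + | * Base install: `pip install langchain` | ||
| + | * Integration packages: `langchain-openai`, `langchain-community`, and others | ||
| + | * API key management: environment variables are the recommended practice | ||
| + | ||
| + | - **Basic programming patterns** | ||
| + | * Compose components with the `|` operator (LCEL) | ||
| + | * Run a chain with the `invoke` method | ||
| + | * Parse output with `StrOutputParser` | ||
| + | * Keep conversation history with `Memory` | ||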
| + | |||
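| + | ==== Common Errors and Fixes ==== | ||
| + | ||
| + | | Error | Cause | Fix | | ||
| + | | `ModuleNotFoundError` | Missing dependency | Install the package with `pip install` | | ||
| + | | `AuthenticationError` | Invalid API key | Check the environment variable or pass the key explicitly | | ||
| + | | `RateLimitError` | Too many requests | Lower the call rate or add a retry mechanism | | ||
| + | | `ContextWindowExceeded` | Input too long | Shorten the input or use a text splitter | | ||
| + | | `ConnectionError` | Network problem | Check the proxy settings or network connection | | ||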
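The "retry mechanism" suggested for rate-limit errors is commonly implemented as retry with exponential backoff. The sketch below shows the pattern with the standard library only; `TransientError` is a hypothetical stand-in for a provider's rate-limit exception, and `flaky_request` simulates a call that fails twice before succeeding.

```python
import time

class TransientError(Exception):
    """Hypothetical stand-in for a provider's rate-limit error."""

def call_with_retry(func, max_attempts: int = 4, base_delay: float = 0.01, sleep=time.sleep):
    """Call func, waiting base_delay, 2x, 4x, ... between failed attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** (attempt - 1))

calls = {"count": 0}

def flaky_request() -> str:
    # Simulated API call: fails on the first two attempts, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("rate limited")
    return "ok"

outcome = call_with_retry(flaky_request)
print(outcome, calls["count"])
```

In real use the base delay would be on the order of a second rather than 0.01, and only errors known to be transient should be retried.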
| + | |||
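| + | ==== Suggested Next Steps ==== | ||
| + | ||
| + | - **Deepen your understanding**: | ||
| + | - **Practice hands-on**: | ||
| + | - **Prepare ahead**: | ||
| + | - **Read the documentation**: | ||
| + | ||
| + | ==== Chapter Assignments ==== | ||
| + | ||
| + | - Complete the three exercises and record your observations | ||
| + | - Build a personalized greeting bot that remembers the user's name | ||
| + | - Run the examples with a different model (such as gpt-4) and compare the output quality | ||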
| + | |||