| 智能体二次开发:langchain:核心组件详解_models [2026/05/20 19:01] – removed - external edit (unknown date) 127.0.0.1 | 智能体二次开发:langchain:核心组件详解_models [2026/05/20 19:01] (current revision) – ↷ page langchain二次开发:核心组件详解_models moved to 智能体二次开发:langchain:核心组件详解_models 张叶安 | ||
|---|---|---|---|
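| + | ====== Chapter 2: Core Components in Depth - Models ====== | ||
| + | |||
| + | ===== 2.1 Overview of Model Types ===== | ||
| + | |||
| + | Language models are the central component of LangChain. LangChain supports three broad categories of models: | ||
| + | |||
| + | ==== 2.1.1 LLMs (Base Language Models) ==== | ||
| + | |||
| + | **Definition**: a model that takes a string as input and returns a string as output. | ||
| + | |||
| + | **Characteristics**: | ||
| + | * Traditional text-completion interface | ||
| + | * Usually called through `predict()` or `generate()` (recent LangChain releases deprecate `predict()` in favor of `invoke()`) | ||
| + | * Suited to simple text-generation tasks | ||
| + | |||
| + | **Representative models**: | ||
| + | * OpenAI: `text-davinci-003`, | ||
| + | * Open source: `LLaMA`, `Mistral`, `Falcon` | ||
| + | |||
| + | **Use cases**: | ||
| + | * Text completion | ||
| + | * Simple text transformation | ||
| + | * Scenarios that need direct control over the input format | ||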
| + | <code python> | ||
| + | from langchain_openai import OpenAI | ||
| + | |||
| + | # Create the LLM instance | ||
| + | llm = OpenAI( | ||
| + | model_name=" | ||
| + | temperature=0.7, | ||
| + | max_tokens=256 | ||
| + | ) | ||
| + | |||
| + | # Generate text | ||
| + | text = " | ||
| + | result = llm.predict(text) | ||
| + | print(result) | ||
| + | </ | ||
| + | |||
| + | |||
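| + | ==== 2.1.2 Chat Models ==== | ||
| + | |||
| + | **Definition**: a model optimized for conversation that takes a list of messages and returns a message. | ||
| + | |||
| + | **Characteristics**: | ||
| + | * Built around message roles (System, Human, AI) | ||
| + | * Better handling of conversational context | ||
| + | * The standard interface for today's mainstream models | ||
| + | |||
| + | **Representative models**: | ||
| + | * OpenAI: `gpt-3.5-turbo`, | ||
| + | * Anthropic: `claude-3-opus`, | ||
| + | * Open source: `LLaMA-2-Chat`, | ||
| + | |||
| + | **Use cases**: | ||
| + | * Chatbots | ||
| + | * Multi-turn dialogue systems | ||
| + | * Scenarios that need a role or persona | ||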
| + | |||
| + | <code python> | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | from langchain.schema import HumanMessage, | ||
| + | |||
| + | chat = ChatOpenAI(model=" | ||
| + | |||
| + | messages = [ | ||
| + | SystemMessage(content=" | ||
| + | HumanMessage(content=" | ||
| + | ] | ||
| + | |||
| + | response = chat.predict_messages(messages) | ||
| + | print(response.content) | ||
| + | </ | ||
| + | |||
| + | |||
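| + | ==== 2.1.3 Embedding Models ==== | ||
| + | |||
| + | **Definition**: a model that converts text into a high-dimensional vector (an embedding). | ||
| + | |||
| + | **Characteristics**: | ||
| + | * Output is a numeric vector, not text | ||
| + | * Captures semantic information | ||
| + | * Used for similarity computation and retrieval | ||
| + | |||
| + | **Representative models**: | ||
| + | * OpenAI: `text-embedding-3-small`, | ||
| + | * Open source: the `sentence-transformers` family | ||
| + | |||
| + | **Use cases**: | ||
| + | * Semantic search | ||
| + | * Text clustering | ||
| + | * Document retrieval in RAG systems | ||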
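
To make the retrieval use case concrete, here is a minimal sketch of ranking by cosine similarity in plain Python. The three-dimensional vectors in `toy_vectors` are hypothetical stand-ins for real embeddings (real models return hundreds or thousands of dimensions), and `most_similar` is an illustrative helper, not a LangChain API.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings"; values are made up for illustration only.
toy_vectors = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}

def most_similar(query_key, vectors):
    """Return the key whose vector is closest to the query vector."""
    query = vectors[query_key]
    scores = {
        key: cosine_similarity(query, vec)
        for key, vec in vectors.items()
        if key != query_key
    }
    return max(scores, key=scores.get)

print(most_similar("cat", toy_vectors))
```

In a real system the vectors would come from an embedding model and the ranking would usually be delegated to a vector store.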
| + | |||
| + | <code python> | ||
| + | from langchain_openai import OpenAIEmbeddings | ||
| + | |||
| + | embeddings = OpenAIEmbeddings(model=" | ||
| + | |||
| + | # Get the vector representation of a text | ||
| + | text = " | ||
| + | vector = embeddings.embed_query(text) | ||
| + | print(f" | ||
| + | print(f" | ||
| + | |||
| + | # Batch embedding | ||
| + | texts = [" | ||
| + | vectors = embeddings.embed_documents(texts) | ||
| + | print(f" | ||
| + | </ | ||
| + | |||
| + | |||
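| + | ==== 2.1.4 Comparing the Three Model Types ==== | ||
| + | |||
| + | | Feature | LLMs | Chat Models | Embeddings | | ||
| + | | Input | String | List of messages | String | | ||
| + | | Output | String | Message object | Vector | | ||
| + | | Main use | Text generation | Conversation | Semantic representation | | ||
| + | | State management | None | None | None | | ||
| + | | Cost | Billed per token | Billed per token | Usually lower | | ||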
| + | |||
| + | ---- | ||
| + | |||
| + | ===== 2.2 OpenAI Model Integration ===== | ||
| + | |||
| + | ==== 2.2.1 ChatOpenAI in Detail ==== | ||
| + | |||
| + | `ChatOpenAI` is the main interface for interacting with OpenAI chat models. | ||
| + | |||
| + | === Basic configuration parameters === | ||
| + | |||
| + | <code python> | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | |||
| + | llm = ChatOpenAI( | ||
| + | # Model selection | ||
| + | model=" | ||
| + | | ||
| + | # Generation parameters | ||
| + | temperature=0.7, | ||
| + | max_tokens=None, | ||
| + | top_p=1.0, | ||
| + | frequency_penalty=0.0, | ||
| + | presence_penalty=0.0, | ||
| + | | ||
| + | # API configuration | ||
| + | api_key=" | ||
| + | base_url=None, | ||
| + | timeout=None, | ||
| + | max_retries=2, | ||
| + | | ||
| + | # Other | ||
| + | streaming=False, | ||
| + | n=1, # number of completions to generate | ||
| + | stop=None, | ||
| + | ) | ||
| + | </ | ||
| + | |||
| + | |||
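| + | === Parameter details === | ||
| + | |||
| + | **1. temperature** | ||
| + | |||
| + | Controls the randomness of the output: | ||
| + | * `0.0`: most deterministic; the model picks the highest-probability token at each step | ||
| + | * `0.7`: a balanced value that suits most scenarios | ||
| + | * `1.0+`: more creative, more varied output | ||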
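
The effect of temperature can be illustrated without calling any API. The sketch below applies a temperature-scaled softmax to a set of hypothetical token scores; treating `0.0` as pure greedy selection is an assumption made here for illustration, and the function name is not part of LangChain or the OpenAI SDK.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        # Assumption: temperature 0 is modeled as greedy decoding (all mass on the top score).
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
for t in (0.0, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature rises, probability mass moves from the top-scoring token toward the alternatives, which is why higher settings produce more varied text.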
| + | |||
| + | <code python> | ||
| + | def demonstrate_temperature(): | ||
| + | """ | ||
| + | prompt = " | ||
| + | | ||
| + | for temp in [0.0, 0.5, 1.0, 1.5]: | ||
| + | llm = ChatOpenAI(temperature=temp) | ||
| + | result = llm.predict(prompt) | ||
| + | print(f" | ||
| + | print(result) | ||
| + | |||
| + | demonstrate_temperature() | ||
| + | </ | ||
| + | |||
| + | |||
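| + | **2. max_tokens (maximum number of tokens)** | ||
| + | |||
| + | Limits the length of the model's output: | ||
| + | * One token is roughly 4 English characters, or about 0.75 words | ||
| + | * In Chinese, one token usually covers 1-2 characters | ||
| + | * When set to None, the model decides the length itself | ||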
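
The rules of thumb above can be turned into a rough estimator for budgeting `max_tokens`. This is only a sketch: counting one token per CJK character is a conservative assumption, `estimate_tokens` is a hypothetical helper, and an exact count requires the model's real tokenizer.

```python
def estimate_tokens(text):
    """Rough token estimate: ~4 non-CJK characters per token, 1 token per CJK character (assumed)."""
    cjk = sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")
    other = len(text) - cjk
    return cjk + (other + 3) // 4  # ceiling division for the non-CJK part

print(estimate_tokens("abcdefgh"))
print(estimate_tokens("你好"))
```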
| + | |||
| + | <code python> | ||
| + | # Effect of different max_tokens values | ||
| + | prompt = " | ||
| + | |||
| + | for max_tok in [50, 100, 200]: | ||
| + | llm = ChatOpenAI(max_tokens=max_tok) | ||
| + | result = llm.predict(prompt) | ||
| + | print(f" | ||
| + | print(result[: | ||
| + | </ | ||
| + | |||
| + | |||
| + | **3. frequency_penalty and presence_penalty** | ||
| + | |||
| + | Both reduce repetition: | ||
| + | * `frequency_penalty`: | ||
| + | * `presence_penalty`: | ||
| + | * Both range from -2.0 to 2.0 | ||
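
As a sketch of how the two penalties differ, the snippet below follows the adjustment described in OpenAI's API documentation: each candidate token's score is reduced by its count so far times `frequency_penalty`, plus a flat `presence_penalty` if it has appeared at all. The token scores and history are hypothetical, and `apply_penalties` is an illustrative helper rather than a library function.

```python
from collections import Counter

def apply_penalties(logits, generated_tokens, frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the scores of tokens that already appeared in the generated text."""
    counts = Counter(generated_tokens)
    adjusted = {}
    for token, logit in logits.items():
        count = counts.get(token, 0)
        seen = 1.0 if count > 0 else 0.0
        # The frequency penalty scales with the count; the presence penalty is a one-off.
        adjusted[token] = logit - count * frequency_penalty - seen * presence_penalty
    return adjusted

logits = {"cat": 2.0, "dog": 1.5, "fish": 1.0}  # hypothetical scores
history = ["cat", "cat", "dog"]                 # tokens generated so far
print(apply_penalties(logits, history, frequency_penalty=0.5, presence_penalty=0.2))
```

With these numbers the unseen token `fish` ends up with the highest score, which is the intended effect of both penalties.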
| + | |||
| + | <code python> | ||
| + | # Example: reducing repeated content | ||
| + | prompt = " | ||
| + | |||
| + | llm_low = ChatOpenAI(frequency_penalty=0.0) | ||
| + | llm_high = ChatOpenAI(frequency_penalty=0.8) | ||
| + | |||
| + | print(" | ||
| + | print(llm_low.predict(prompt)) | ||
| + | print(" | ||
| + | print(llm_high.predict(prompt)) | ||
| + | </ | ||
| + | |||
| + | |||
| + | ==== 2.2.2 Invocation Methods ==== | ||
| + | |||
| + | ChatOpenAI offers several ways to invoke the model: | ||
| + | |||
| + | <code python> | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | from langchain.schema import HumanMessage, | ||
| + | |||
| + | chat = ChatOpenAI() | ||
| + | |||
| + | # Method 1: predict - the simplest way | ||
| + | result = chat.predict(" | ||
| + | print(type(result)) | ||
| + | |||
| + | # Method 2: predict_messages - pass message objects | ||
| + | messages = [ | ||
| + | SystemMessage(content=" | ||
| + | HumanMessage(content=" | ||
| + | ] | ||
| + | result = chat.predict_messages(messages) | ||
| + | print(type(result)) | ||
| + | print(result.content) | ||
| + | |||
| + | # Method 3: generate - batch generation with extra metadata | ||
| + | |||
| + | batch_messages = [ | ||
| + | [HumanMessage(content=" | ||
| + | [HumanMessage(content=" | ||
| + | ] | ||
| + | result = chat.generate(batch_messages) | ||
| + | print(f" | ||
| + | print(f" | ||
| + | |||
| + | # Method 4: async/await - asynchronous call | ||
| + | import asyncio | ||
| + | |||
| + | async def async_chat(): | ||
| + | result = await chat.apredict(" | ||
| + | print(result) | ||
| + | |||
| + | asyncio.run(async_chat()) | ||
| + | |||
| + | # Method 5: streaming - streamed output | ||
| + | chat_stream = ChatOpenAI(streaming=True) | ||
| + | for chunk in chat_stream.stream(" | ||
| + | print(chunk.content, | ||
| + | </ | ||
| + | |||
| + | |||
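| + | ==== 2.2.3 Model Selection Guide ==== | ||
| + | |||
| + | OpenAI offers several models. How do you choose? | ||
| + | |||
| + | | Model | Pros | Cons | Best for | | ||
| + | | gpt-3.5-turbo | Fast, cheap | Limited capability | Simple tasks, prototyping | | ||
| + | | gpt-4 | Strong capability | Slow, expensive | Complex reasoning, code generation | | ||
| + | | gpt-4-turbo | Strong capability, long context (128K) | Costlier than gpt-3.5-turbo | Advanced applications, long-document processing | | ||
| + | | gpt-4o | Multimodal, fast | Relatively new | Scenarios that need image understanding | | ||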
| + | |||
| + | <code python> | ||
| + | def select_model_for_task(task_type: | ||
| + | """ | ||
| + | models = { | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | } | ||
| + | return models.get(task_type, | ||
| + | |||
| + | # Usage example | ||
| + | tasks = [" | ||
| + | for task in tasks: | ||
| + | model = select_model_for_task(task) | ||
| + | print(f" | ||
| + | </ | ||
| + | |||
| + | |||
| + | ==== 2.2.4 OpenAI Embedding Models ==== | ||
| + | |||
| + | <code python> | ||
| + | from langchain_openai import OpenAIEmbeddings | ||
| + | |||
| + | # Choose between embedding models | ||
| + | embeddings_small = OpenAIEmbeddings(model=" | ||
| + | embeddings_large = OpenAIEmbeddings(model=" | ||
| + | |||
| + | # Compare the models | ||
| + | test_text = " | ||
| + | |||
| + | vector_small = embeddings_small.embed_query(test_text) | ||
| + | vector_large = embeddings_large.embed_query(test_text) | ||
| + | |||
| + | print(f" | ||
| + | print(f" | ||
| + | |||
| + | # Compute the similarity between two sentences | ||
| + | import numpy as np | ||
| + | |||
| + | def cosine_similarity(a, | ||
| + | return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)) | ||
| + | |||
| + | text1 = " | ||
| + | text2 = " | ||
| + | text3 = " | ||
| + | |||
| + | v1 = embeddings_small.embed_query(text1) | ||
| + | v2 = embeddings_small.embed_query(text2) | ||
| + | v3 = embeddings_small.embed_query(text3) | ||
| + | |||
| + | print(f" | ||
| + | print(f"' | ||
| + | </ | ||
| + | |||
| + | |||
| + | ---- | ||
| + | |||
| + | ===== 2.3 Anthropic Claude Model Integration ===== | ||
| + | |||
| + | Claude is the AI assistant developed by Anthropic, known for its emphasis on safety and helpfulness. | ||
| + | |||
| + | ==== 2.3.1 ChatAnthropic Basics ==== | ||
| + | |||
| + | <code python> | ||
| + | from langchain_anthropic import ChatAnthropic | ||
| + | from langchain.schema import HumanMessage, | ||
| + | |||
| + | # Create the Claude client | ||
| + | claude = ChatAnthropic( | ||
| + | model=" | ||
| + | temperature=0.7, | ||
| + | max_tokens=1024, | ||
| + | anthropic_api_key=" | ||
| + | ) | ||
| + | |||
| + | # Basic conversation | ||
| + | messages = [ | ||
| + | SystemMessage(content=" | ||
| + | HumanMessage(content=" | ||
| + | ] | ||
| + | |||
| + | response = claude.predict_messages(messages) | ||
| + | print(response.content) | ||
| + | </ | ||
| + | |||
| + | |||
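| + | ==== 2.3.2 Claude Model Comparison ==== | ||
| + | |||
| + | | Model | Description | Best for | | ||
| + | | claude-3-opus | Most capable | Complex reasoning, math, programming | | ||
| + | | claude-3-sonnet | Balanced | Most tasks; good value for money | | ||
| + | | claude-3-haiku | Fastest | Simple tasks, real-time applications | | ||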
| + | |||
| + | <code python> | ||
| + | def compare_claude_models(): | ||
| + | """ | ||
| + | models = [ | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | ] | ||
| + | | ||
| + | prompt = " | ||
| + | | ||
| + | for model in models: | ||
| + | print(f" | ||
| + | print(f" | ||
| + | print(' | ||
| + | | ||
| + | llm = ChatAnthropic(model=model) | ||
| + | result = llm.predict(prompt) | ||
| + | print(result) | ||
| + | |||
| + | compare_claude_models() | ||
| + | </ | ||
| + | |||
| + | |||
| + | ==== 2.3.3 Special Features of Claude ==== | ||
| + | |||
| + | Claude supports a few distinctive features: | ||
| + | |||
| + | <code python> | ||
| + | from langchain_anthropic import ChatAnthropic | ||
| + | |||
| + | claude = ChatAnthropic(model=" | ||
| + | |||
| + | # 1. Long context window (200K tokens) | ||
| + | long_text = " | ||
| + | response = claude.predict(long_text[: | ||
| + | |||
| + | # 2. Structured prompt | ||
| + | structured_prompt = """ | ||
| + | Human: Please analyze the following product review and output JSON: | ||
| + | Review:" | ||
| + | |||
| + | Please use the following output format: | ||
| + | { | ||
| + | " | ||
| + | " | ||
| + | {" | ||
| + | ] | ||
| + | } | ||
| + | |||
| + | Assistant:""" | ||
| + | |||
| + | result = claude.predict(structured_prompt) | ||
| + | print(result) | ||
| + | </ | ||
| + | |||
| + | |||
| + | ---- | ||
| + | |||
| + | ===== 2.4 Local Model Integration ===== | ||
| + | |||
| + | ==== 2.4.1 Using HuggingFace Models ==== | ||
| + | |||
| + | LangChain can integrate a wide range of open-source models through HuggingFace. | ||
| + | |||
| + | <code python> | ||
| + | from langchain_community.llms import HuggingFacePipeline | ||
| + | from transformers import AutoModelForCausalLM, | ||
| + | import torch | ||
| + | |||
| + | # Method 1: use a HuggingFace pipeline directly | ||
| + | def load_local_model(): | ||
| + | model_id = " | ||
| + | | ||
| + | tokenizer = AutoTokenizer.from_pretrained(model_id) | ||
| + | model = AutoModelForCausalLM.from_pretrained(model_id) | ||
| + | | ||
| + | pipe = pipeline( | ||
| + | " | ||
| + | model=model, | ||
| + | tokenizer=tokenizer, | ||
| + | max_length=100, | ||
| + | do_sample=True, # temperature only takes effect when sampling is enabled | ||
| + | temperature=0.7 | ||
| + | ) | ||
| + | | ||
| + | llm = HuggingFacePipeline(pipeline=pipe) | ||
| + | return llm | ||
| + | |||
| + | # Use the local model | ||
| + | local_llm = load_local_model() | ||
| + | result = local_llm.predict(" | ||
| + | print(result) | ||
| + | </ | ||
| + | |||
| + | |||
| + | ==== 2.4.2 Using llama.cpp / llama-cpp-python ==== | ||
| + | |||
| + | On consumer-grade hardware, llama.cpp is a good choice. | ||
| + | |||
| + | <code python> | ||
| + | from langchain_community.llms import LlamaCpp | ||
| + | |||
| + | # Load a model in GGUF format | ||
| + | llm = LlamaCpp( | ||
| + | model_path=" | ||
| + | n_ctx=2048, | ||
| + | n_gpu_layers=1, | ||
| + | temperature=0.7, | ||
| + | max_tokens=512, | ||
| + | verbose=True | ||
| + | ) | ||
| + | |||
| + | # Generate text | ||
| + | result = llm.predict(" | ||
| + | print(result) | ||
| + | </ | ||
| + | |||
| + | |||
| + | ==== 2.4.3 Using Ollama ==== | ||
| + | |||
| + | Ollama makes running large models locally very simple. | ||
| + | |||
| + | <code python> | ||
| + | from langchain_community.llms import Ollama | ||
| + | |||
| + | # Connect to the local Ollama service | ||
| + | ollama = Ollama( | ||
| + | model=" | ||
| + | base_url=" | ||
| + | ) | ||
| + | |||
| + | # Usage | ||
| + | result = ollama.predict(" | ||
| + | print(result) | ||
| + | |||
| + | # Alternatively, create it like this | ||
| + | llm = Ollama(model=" | ||
| + | </ | ||
| + | |||
| + | |||
| + | ==== 2.4.4 Local Embedding Models ==== | ||
| + | |||
| + | <code python> | ||
| + | from langchain_community.embeddings import HuggingFaceEmbeddings | ||
| + | |||
| + | # Use an open-source embedding model | ||
| + | embeddings = HuggingFaceEmbeddings( | ||
| + | model_name=" | ||
| + | model_kwargs={' | ||
| + | encode_kwargs={' | ||
| + | ) | ||
| + | |||
| + | # Usage | ||
| + | text = " | ||
| + | vector = embeddings.embed_query(text) | ||
| + | print(f" | ||
| + | </ | ||
| + | |||
| + | |||
| + | ---- | ||
| + | |||
| + | ===== 2.5 Other Model Providers ===== | ||
| + | |||
| + | ==== 2.5.1 Azure OpenAI ==== | ||
| + | |||
| + | <code python> | ||
| + | from langchain_openai import AzureChatOpenAI, | ||
| + | |||
| + | # Azure OpenAI configuration | ||
| + | llm = AzureChatOpenAI( | ||
| + | azure_endpoint=" | ||
| + | azure_deployment=" | ||
| + | openai_api_version=" | ||
| + | openai_api_key=" | ||
| + | ) | ||
| + | |||
| + | embeddings = AzureOpenAIEmbeddings( | ||
| + | azure_endpoint=" | ||
| + | azure_deployment=" | ||
| + | openai_api_version=" | ||
| + | ) | ||
| + | </ | ||
| + | |||
| + | |||
| + | ==== 2.5.2 Google Vertex AI / Gemini ==== | ||
| + | |||
| + | <code python> | ||
| + | from langchain_google_vertexai import ChatVertexAI | ||
| + | from langchain_google_genai import ChatGoogleGenerativeAI | ||
| + | |||
| + | # Vertex AI | ||
| + | gemini = ChatVertexAI( | ||
| + | model_name=" | ||
| + | project=" | ||
| + | location=" | ||
| + | ) | ||
| + | |||
| + | # Or use Google Generative AI | ||
| + | gemini = ChatGoogleGenerativeAI( | ||
| + | model=" | ||
| + | google_api_key=" | ||
| + | ) | ||
| + | |||
| + | result = gemini.predict(" | ||
| + | print(result) | ||
| + | </ | ||
| + | |||
| + | |||
| + | ==== 2.5.3 Chinese LLM Providers ==== | ||
| + | |||
| + | <code python> | ||
| + | # ERNIE Bot (文心一言) | ||
| + | from langchain_community.chat_models import QianfanChatEndpoint | ||
| + | |||
| + | wenxin = QianfanChatEndpoint( | ||
| + | qianfan_ak=" | ||
| + | qianfan_sk=" | ||
| + | ) | ||
| + | |||
| + | # Tongyi Qianwen (通义千问) | ||
| + | from langchain_community.chat_models import ChatTongyi | ||
| + | |||
| + | tongyi = ChatTongyi( | ||
| + | dashscope_api_key=" | ||
| + | ) | ||
| + | |||
| + | # iFLYTEK Spark (讯飞星火) | ||
| + | from langchain_community.chat_models import ChatSparkLLM | ||
| + | |||
| + | spark = ChatSparkLLM( | ||
| + | spark_app_id=" | ||
| + | spark_api_key=" | ||
| + | spark_api_secret=" | ||
| + | ) | ||
| + | </ | ||
| + | |||
| + | |||
| + | ---- | ||
| + | |||
| + | ===== 2.6 Advanced Model Techniques ===== | ||
| + | |||
| + | ==== 2.6.1 Model Fallback Strategy ==== | ||
| + | |||
| + | Switch to a backup model automatically when the primary model is unavailable: | ||
| + | |||
| + | <code python> | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | from langchain_anthropic import ChatAnthropic | ||
| + | from langchain.schema import BaseMessage | ||
| + | |||
| + | class FallbackLLM: | ||
| + | """ | ||
| + | | ||
| + | def __init__(self, | ||
| + | self.primary = primary | ||
| + | self.fallback = fallback | ||
| + | | ||
| + | def predict(self, | ||
| + | try: | ||
| + | return self.primary.predict(text) | ||
| + | except Exception as e: | ||
| + | print(f" | ||
| + | return self.fallback.predict(text) | ||
| + | | ||
| + | def predict_messages(self, | ||
| + | try: | ||
| + | result = self.primary.predict_messages(messages) | ||
| + | return result.content | ||
| + | except Exception as e: | ||
| + | print(f" | ||
| + | result = self.fallback.predict_messages(messages) | ||
| + | return result.content | ||
| + | |||
| + | # Usage | ||
| + | primary = ChatOpenAI(model=" | ||
| + | fallback = ChatOpenAI(model=" | ||
| + | llm_with_fallback = FallbackLLM(primary, | ||
| + | |||
| + | result = llm_with_fallback.predict(" | ||
| + | print(result) | ||
| + | </ | ||
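
The same fallback idea generalizes to an ordered chain of any length. The sketch below uses plain Python callables as stand-ins for real models, so `flaky_primary`, `stable_backup`, and `call_with_fallbacks` are all hypothetical names; recent LangChain versions also expose a built-in `with_fallbacks()` method on runnables, which may be preferable in real code.

```python
def call_with_fallbacks(prompt, models):
    """Try each (name, callable) pair in order and return the first successful answer."""
    errors = []
    for name, model in models:
        try:
            return name, model(prompt)
        except Exception as exc:  # broad on purpose: any failure triggers the next model
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))

def flaky_primary(prompt):
    """Stand-in for a primary model that is currently unavailable."""
    raise TimeoutError("primary unavailable")

def stable_backup(prompt):
    """Stand-in for a backup model that always answers."""
    return f"backup answer to: {prompt}"

used, answer = call_with_fallbacks(
    "hello",
    [("primary", flaky_primary), ("backup", stable_backup)],
)
print(used, answer)
```

Collecting the error messages makes the final failure easier to debug when every model in the chain is down.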
| + | |||
| + | |||
| + | ==== 2.6.2 Model Router ==== | ||
| + | |||
| + | Automatically pick the most suitable model based on the input: | ||
| + | |||
| + | <code python> | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | |||
| + | class ModelRouter: | ||
| + | """ | ||
| + | | ||
| + | def __init__(self): | ||
| + | self.models = { | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | } | ||
| + | | ||
| + | self.routing_prompt = """ | ||
| + | Output only the category word. | ||
| + | |||
| + | Query: {query} | ||
| + | Category:""" | ||
| + | self.classifier = ChatOpenAI(model=" | ||
| + | | ||
| + | def classify(self, | ||
| + | result = self.classifier.predict(self.routing_prompt.format(query=query)) | ||
| + | return result.strip().lower() | ||
| + | | ||
| + | def predict(self, | ||
| + | category = self.classify(query) | ||
| + | print(f" | ||
| + | | ||
| + | model = self.models.get(category, | ||
| + | return model.predict(query) | ||
| + | |||
| + | # Usage | ||
| + | router = ModelRouter() | ||
| + | |||
| + | queries = [ | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | ] | ||
| + | |||
| + | for q in queries: | ||
| + | print(f" | ||
| + | result = router.predict(q) | ||
| + | print(f" | ||
| + | </ | ||
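
Routing does not have to spend an LLM call on classification. As a cheaper alternative, here is a minimal rule-based sketch; the categories, keywords, length threshold, and model labels in `ROUTES` are all assumptions chosen for illustration and are not taken from the router above.

```python
# Hypothetical mapping from category to a model label.
ROUTES = {
    "simple": "small-model",
    "code": "code-model",
    "complex": "large-model",
}

CODE_HINTS = ("def ", "bug", "code", "function")

def classify(query):
    """Classify a query with cheap heuristics instead of an LLM call."""
    lowered = query.lower()
    if any(hint in lowered for hint in CODE_HINTS):
        return "code"
    if len(query) > 200:  # assumption: long queries need the stronger model
        return "complex"
    return "simple"

def route(query):
    """Return the model label for a query, defaulting to the cheapest model."""
    return ROUTES.get(classify(query), ROUTES["simple"])

print(route("Fix this bug in my function"))
print(route("Hi"))
```

Heuristics like these are fast and free but brittle, so a hybrid setup that falls back to an LLM classifier for unclear cases is a common compromise.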
| + | |||
| + | |||
| + | ==== 2.6.3 Batch Request Optimization ==== | ||
| + | |||
| + | <code python> | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | from langchain.schema import HumanMessage | ||
| + | import asyncio | ||
| + | |||
| + | async def batch_predict(queries: | ||
| + | """ | ||
| + | llm = ChatOpenAI() | ||
| + | | ||
| + | results = [] | ||
| + | for i in range(0, len(queries), | ||
| + | batch = queries[i: | ||
| + | | ||
| + | # Prepare the messages | ||
| + | messages_list = [[HumanMessage(content=q)] for q in batch] | ||
| + | | ||
| + | # Generate in batch | ||
| + | batch_results = await llm.agenerate(messages_list) | ||
| + | | ||
| + | for gen in batch_results.generations: | ||
| + | results.append(gen[0].text) | ||
| + | | ||
| + | return results | ||
| + | |||
| + | # Usage example | ||
| + | async def main(): | ||
| + | queries = [f" | ||
| + | results = await batch_predict(queries) | ||
| + | | ||
| + | for q, r in zip(queries, | ||
| + | print(f" | ||
| + | |||
| + | asyncio.run(main()) | ||
| + | </ | ||
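
An alternative to fixed-size batches is to cap the number of in-flight requests with a semaphore, so a slow request does not hold up a whole batch. The sketch below uses `fake_llm`, a hypothetical stand-in coroutine, in place of a real model call; `bounded_gather` and the default limit of 3 are likewise illustrative.

```python
import asyncio

async def fake_llm(prompt):
    """Stand-in for an async model call; sleeps briefly, then echoes in upper case."""
    await asyncio.sleep(0.01)
    return prompt.upper()

async def bounded_gather(prompts, max_concurrency=3):
    """Run all prompts concurrently, with at most max_concurrency in flight at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(prompt):
        async with semaphore:
            return await fake_llm(prompt)

    # asyncio.gather returns results in the same order as the inputs.
    return await asyncio.gather(*(worker(p) for p in prompts))

results = asyncio.run(bounded_gather(["a", "b", "c", "d"]))
print(results)
```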
| + | |||
| + | |||
| + | ==== 2.6.4 Caching ==== | ||
| + | |||
| + | Avoid repeated API calls to cut cost and latency: | ||
| + | |||
| + | <code python> | ||
| + | from langchain.globals import set_llm_cache | ||
| + | from langchain.cache import InMemoryCache, | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | |||
| + | # In-memory cache | ||
| + | set_llm_cache(InMemoryCache()) | ||
| + | |||
| + | # Or a persistent cache | ||
| + | # set_llm_cache(SQLiteCache(database_path=" | ||
| + | |||
| + | llm = ChatOpenAI() | ||
| + | |||
| + | # First call: hits the API | ||
| + | result1 = llm.predict(" | ||
| + | print(" | ||
| + | |||
| + | # Second call with the same input: served directly from the cache | ||
| + | result2 = llm.predict(" | ||
| + | print(" | ||
| + | |||
| + | # Clear the cache | ||
| + | # from langchain.globals import get_llm_cache | ||
| + | # get_llm_cache().clear() | ||
| + | </ | ||
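
What a response cache does can be shown in a few lines of plain Python. In this sketch `CountingModel` is a hypothetical stand-in that counts how often the "API" is actually hit, and `CachedModel` keys only on the prompt text; a real cache would typically also key on the model name and generation parameters.

```python
class CountingModel:
    """Stand-in for a paid model call; counts how many times it is really invoked."""

    def __init__(self):
        self.calls = 0

    def __call__(self, prompt):
        self.calls += 1
        return f"answer:{prompt}"

class CachedModel:
    """Wrap a model and return stored answers for prompts seen before."""

    def __init__(self, model):
        self.model = model
        self.cache = {}

    def __call__(self, prompt):
        if prompt not in self.cache:
            self.cache[prompt] = self.model(prompt)  # cache miss: call the model
        return self.cache[prompt]

backend = CountingModel()
cached = CachedModel(backend)
first = cached("hi")
second = cached("hi")
print(backend.calls)
```

Exact-match caching only helps when prompts repeat verbatim, and it makes sampling with a non-zero temperature return the same answer every time.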
| + | |||
| + | |||
| + | ---- | ||
| + | |||
| + | ===== 2.7 Best Practices ===== | ||
| + | |||
| + | ==== 2.7.1 API Key Management ==== | ||
| + | |||
| + | <code python> | ||
| + | # config.py - centralized configuration | ||
| + | import os | ||
| + | from dotenv import load_dotenv | ||
| + | |||
| + | load_dotenv() | ||
| + | |||
| + | class Config: | ||
| + | OPENAI_API_KEY = os.getenv(" | ||
| + | ANTHROPIC_API_KEY = os.getenv(" | ||
| + | | ||
| + | # Default model settings | ||
| + | DEFAULT_MODEL = " | ||
| + | DEFAULT_TEMPERATURE = 0.7 | ||
| + | | ||
| + | @classmethod | ||
| + | def validate(cls): | ||
| + | """ | ||
| + | if not cls.OPENAI_API_KEY: | ||
| + | raise ValueError(" | ||
| + | |||
| + | # Usage | ||
| + | from config import Config | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | |||
| + | Config.validate() | ||
| + | llm = ChatOpenAI(api_key=Config.OPENAI_API_KEY) | ||
| + | </ | ||
| + | |||
| + | |||
| + | ==== 2.7.2 Error Handling ==== | ||
| + | |||
| + | <code python> | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | from openai import RateLimitError, | ||
| + | import time | ||
| + | |||
| + | def robust_predict(llm, | ||
| + | """ | ||
| + | | ||
| + | for attempt in range(max_retries): | ||
| + | try: | ||
| + | return llm.predict(prompt) | ||
| + | except RateLimitError: | ||
| + | wait_time = 2 ** attempt | ||
| + | print(f" | ||
| + | time.sleep(wait_time) | ||
| + | except AuthenticationError as e: | ||
| + | print(f" | ||
| + | raise | ||
| + | except Exception as e: | ||
| + | print(f" | ||
| + | if attempt == max_retries - 1: | ||
| + | raise | ||
| + | | ||
| + | return None | ||
| + | |||
| + | # Usage | ||
| + | llm = ChatOpenAI() | ||
| + | result = robust_predict(llm, | ||
| + | </ | ||
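
The retry loop above waits `2 ** attempt` seconds after a rate-limit error. The sketch below isolates that delay schedule and adds two common refinements, an upper cap and optional random jitter; `backoff_delays` and its default values are assumptions for illustration, not part of any SDK.

```python
import random

def backoff_delays(max_retries, base=1.0, cap=30.0, jitter=0.0, rng=None):
    """Return the wait time in seconds before each retry: exponential growth with a cap."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_retries):
        delay = min(cap, base * 2 ** attempt)  # 1, 2, 4, ... limited by cap
        delay += rng.uniform(0, jitter)        # jitter spreads out simultaneous retries
        delays.append(delay)
    return delays

print(backoff_delays(5))
print(backoff_delays(7)[-1])
```

Jitter matters when many clients hit the same rate limit at once, because without it they all retry at the same instants.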
| + | |||
| + | |||
| + | ==== 2.7.3 Cost Monitoring ==== | ||
| + | |||
| + | <code python> | ||
| + | from langchain.callbacks import get_openai_callback | ||
| + | from langchain_openai import ChatOpenAI | ||
| + | |||
| + | def track_cost(llm, | ||
| + | """ | ||
| + | | ||
| + | with get_openai_callback() as cb: | ||
| + | results = [] | ||
| + | for prompt in prompts: | ||
| + | result = llm.predict(prompt) | ||
| + | results.append(result) | ||
| + | | ||
| + | print(f" | ||
| + | print(f" | ||
| + | print(f" | ||
| + | print(f" | ||
| + | print(f" | ||
| + | | ||
| + | return results | ||
| + | |||
| + | # Usage | ||
| + | llm = ChatOpenAI(model=" | ||
| + | prompts = [" | ||
| + | results = track_cost(llm, | ||
| + | </ | ||
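
When a provider-specific callback is not available, cost can be approximated from token counts. The sketch below assumes per-1,000-token pricing with separate input and output rates; the prices used are made-up placeholders, and `estimate_cost` and `BudgetTracker` are hypothetical helpers.

```python
def estimate_cost(prompt_tokens, completion_tokens, input_price_per_1k, output_price_per_1k):
    """Approximate request cost from token counts and per-1,000-token prices."""
    return (
        prompt_tokens / 1000 * input_price_per_1k
        + completion_tokens / 1000 * output_price_per_1k
    )

class BudgetTracker:
    """Accumulate spend and refuse charges that would exceed a fixed budget."""

    def __init__(self, budget):
        self.budget = budget
        self.spent = 0.0

    def charge(self, cost):
        if self.spent + cost > self.budget:
            raise RuntimeError("budget exceeded")
        self.spent += cost
        return self.budget - self.spent  # remaining budget

# Placeholder prices for illustration only; check the provider's current price list.
cost = estimate_cost(1000, 500, input_price_per_1k=0.5, output_price_per_1k=1.5)
tracker = BudgetTracker(budget=2.0)
remaining = tracker.charge(cost)
print(cost, remaining)
```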
| + | |||
| + | |||
| + | ---- | ||
| + | |||
| + | ===== 2.8 Chapter Summary ===== | ||
| + | |||
| + | ==== Key Concepts Recap ==== | ||
| + | |||
| + | - **Three model types** | ||
| + | * LLMs: text-completion models | ||
| + | * Chat Models: conversational models (the most commonly used) | ||
| + | * Embeddings: text-vector models | ||
| + | |||
| + | - **Main integrations** | ||
| + | * OpenAI: gpt-3.5-turbo, | ||
| + | * Anthropic: the Claude 3 family | ||
| + | * Local: HuggingFace, | ||
| + | * Cloud services: Azure, Vertex AI | ||
| + | |||
| + | - **Key parameters** | ||
| + | * temperature: | ||
| + | * max_tokens: controls output length | ||
| + | * frequency/ | ||
| + | |||
| + | - **Advanced techniques** | ||
| + | * Model fallback strategy | ||
| + | * Smart routing | ||
| + | * Batched requests | ||
| + | * Result caching | ||
| + | |||
| + | ==== Model Selection Decision Tree ==== | ||
| + | |||
| + | < | ||
| + | Which model should you choose? | ||
| + | ├── Need the highest quality? | ||
| + | │ | ||
| + | │ | ||
| + | ├── Cost-sensitive? | ||
| + | │ | ||
| + | │ | ||
| + | ├── Strict data-privacy requirements? | ||
| + | │ | ||
| + | │ | ||
| + | └── Need a very long context? | ||
| + | ├── Yes → Claude-3 (200K) / GPT-4-turbo (128K) | ||
| + | └── No → other models | ||
| + | </ | ||
| + | |||
| + | |||
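| + | ==== Exercises ==== | ||
| + | |||
| + | - Implement a model comparison tool that compares different models' answers to the same question | ||
| + | - Design a smart model-routing system for your application | ||
| + | - Implement an LLM caller that enforces a cost budget | ||
| + | - Test deploying and calling a local model | ||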
| + | |||
| + | |||
| + | |||