Streaming Chat with Tool Calls (Function Calling) Using a Locally Served QwQ Model on vLLM

2401_83655422 · Published 2025-05-14 15:37:30

1. Import the required modules

from openai import OpenAI
import json
import time
from datetime import datetime

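Explanation:

from openai import OpenAI: imports the OpenAI client class. In this article it talks to the OpenAI-compatible API served by vLLM, not to OpenAI's hosted service.
import json: imports the json module, used to parse the tool-call arguments that the model returns as a JSON string.
import time: imports the time module, used to measure how long the script takes.
from datetime import datetime: imports the datetime class, used to format the current time for log timestamps.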

2. Helper function: print timestamped log messages

def log(msg):
    print(f"[{datetime.now().strftime('%H:%M:%S')}] {msg}")

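Explanation:

log is a small helper that prints a message prefixed with a timestamp.
datetime.now() returns the current time, and strftime('%H:%M:%S') formats it as hours:minutes:seconds.
The timestamp and the message are printed on one line, which makes it easier to follow the order and timing of each step.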

3. Create the OpenAI client

client = OpenAI(base_url="http://localhost:8000/v1", api_key="Abc123")

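Explanation:

This creates an OpenAI client instance named client.
base_url points to the OpenAI-compatible endpoint of the local vLLM server (port 8000 is vLLM's default), and api_key is the key the client sends for authentication.
Note: both values are examples. Replace base_url with the address of your own server, and make api_key match the key the vLLM server was started with. If the server was started without an API key, the value is not checked, but the client still needs some value to be passed.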
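Hard-coding the address and key is fine for a demo, but it is easy to move them out of the script. The following is a minimal sketch, assuming two hypothetical environment variable names, CHAT_BASE_URL and CHAT_API_KEY, chosen for this example only; they are not defined by vLLM or by the openai package. The fallback values are the ones used in this article.

```python
import os

def load_client_config(env=None):
    """Return keyword arguments for OpenAI(...), read from environment variables.

    CHAT_BASE_URL and CHAT_API_KEY are hypothetical names used for illustration.
    """
    env = os.environ if env is None else env
    return {
        "base_url": env.get("CHAT_BASE_URL", "http://localhost:8000/v1"),
        "api_key": env.get("CHAT_API_KEY", "Abc123"),
    }

# Pass a plain dict instead of os.environ to see the override in action.
config = load_client_config({"CHAT_BASE_URL": "http://gpu-box:8000/v1"})
print(config)
```

With this helper, the client could be created as OpenAI(**load_client_config()).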

4. Define the function tool

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and state or province, e.g. 'San Francisco, CA'"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"]
                }
            },
            "required": ["location", "unit"]
        }
    }
}]

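Explanation:

tools is a list that contains a single tool definition.
The tool is a function named get_weather that returns the current weather for a given location.
The parameters field is a JSON Schema object: it declares location (a free-form string) and unit (a string restricted to "celsius" or "fahrenheit"), and marks both as required.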
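The schema tells the model what arguments to produce, but nothing forces the model to obey it, so it can be worth checking the arguments before calling the local function. The following is a minimal sketch of such a check. validate_arguments is a hypothetical helper written for this example, and it only covers the schema features used by get_weather (required keys, string types, and enums), not JSON Schema in general.

```python
def validate_arguments(arguments, parameters):
    """Return a list of problems found in `arguments`; an empty list means valid.

    Only handles what the weather schema uses: required keys, strings, enums.
    """
    problems = []
    properties = parameters.get("properties", {})
    for name in parameters.get("required", []):
        if name not in arguments:
            problems.append(f"missing required parameter: {name}")
    for name, value in arguments.items():
        spec = properties.get(name)
        if spec is None:
            problems.append(f"unexpected parameter: {name}")
            continue
        if spec.get("type") == "string" and not isinstance(value, str):
            problems.append(f"{name} must be a string")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name} must be one of {spec['enum']}")
    return problems

# Same shape as the "parameters" object of the get_weather tool.
weather_parameters = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["location", "unit"],
}

print(validate_arguments({"location": "Beijing", "unit": "kelvin"}, weather_parameters))
```

For full JSON Schema support, a dedicated validation library would be a better fit than this hand-written check.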

5. Define the local function: return simulated weather data

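def get_weather(location: str, unit: str):
    return f"☀️ {location} current weather: 25°{unit.title()} (simulated)"

Explanation:

get_weather is the local function that actually runs when the model asks for the tool. Here it only simulates a result.
In a real project, this is where you would call a real weather API, for example with the requests library.
The simulated result is a string containing the location, a fixed temperature of 25, and the requested unit.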

6. Function lookup table

tool_functions = {"get_weather": get_weather}

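Explanation:

tool_functions is a dictionary that maps each tool name to the local function that implements it.
When the model requests a tool, the script looks up the function by name in this dictionary and calls it.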
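A plain dictionary lookup fails with an exception if the model asks for a tool that does not exist or sends malformed arguments. The following is a minimal sketch of a more defensive lookup. dispatch is a hypothetical helper, and this get_weather is a simplified stand-in for the one above; the sketch returns an error string instead of raising, on the assumption that the text will later be passed back to the model.

```python
import json

def get_weather(location: str, unit: str):
    # Simplified stand-in for the article's simulated weather function.
    return f"{location}: 25 degrees {unit} (simulated)"

tool_functions = {"get_weather": get_weather}

def dispatch(name, arguments_json):
    """Look up `name` in tool_functions and call it with the decoded arguments.

    Always returns a string, either the tool result or an error description.
    """
    func = tool_functions.get(name)
    if func is None:
        return f"error: unknown tool '{name}'"
    try:
        arguments = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return f"error: arguments are not valid JSON ({exc.msg})"
    if not isinstance(arguments, dict):
        return "error: arguments must be a JSON object"
    try:
        return func(**arguments)
    except TypeError as exc:
        # Raised for missing or unexpected keyword arguments
        # (and also for TypeErrors inside the function itself).
        return f"error: bad arguments for '{name}': {exc}"

print(dispatch("get_weather", '{"location": "Beijing", "unit": "celsius"}'))
print(dispatch("get_forecast", "{}"))
```

Adding a second tool then only requires one more entry in tool_functions.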

7. Start the run

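start_time = time.time()
log("🚀 Sending the message to the model...")

Explanation:

start_time records when the run begins so the total elapsed time can be computed at the end.
The log call prints a message indicating that the request is about to be sent.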

8. Get the model ID

model_id = client.models.list().data[0].id

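Explanation:

client.models.list() asks the server which models it is serving, and the script takes the ID of the first one. A vLLM server usually serves the single model it was launched with, so the first entry is normally the QwQ model.
The code assumes the list is not empty. If the server is unreachable or reports no models, this line raises an exception, so handle that case in real code.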
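The empty-list case can be handled with a small selection function. The following is a minimal sketch; pick_model_id and its preferred parameter are hypothetical names introduced for this example, and "Qwen/QwQ-32B" is only an example ID, since the real value depends on how the server was launched.

```python
def pick_model_id(model_ids, preferred=None):
    """Choose a model ID from the IDs reported by the server.

    Raises a descriptive error instead of an IndexError on an empty list.
    """
    if not model_ids:
        raise RuntimeError("the server reported no models; is vLLM running?")
    if preferred is not None:
        if preferred not in model_ids:
            raise ValueError(f"model {preferred!r} not found; available: {model_ids}")
        return preferred
    return model_ids[0]

# Example ID only; a real list would come from the server.
print(pick_model_id(["Qwen/QwQ-32B"]))
```

In the script, the list could be built with [m.id for m in client.models.list().data] and passed to this function.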

9. Send the streaming request

response_stream = client.chat.completions.create(
    model=model_id,
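    messages=[{"role": "user", "content": "What's the weather like in Beijing today?"}],
    tools=tools,
    tool_choice="auto",
    stream=True
)

Explanation:

client.chat.completions.create() sends the chat request. Because stream=True, it returns an iterator of chunks instead of a single complete response.
The call passes the model ID, the user message, the tool list, and tool_choice="auto", which lets the model decide whether to answer directly or call a tool.
The user message is "What's the weather like in Beijing today?", so the model is expected to respond with a get_weather tool call rather than plain text.
On vLLM, tool_choice="auto" only works if the server was started with automatic tool choice enabled (the --enable-auto-tool-choice option together with a --tool-call-parser that matches the model).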

10. Handle the streaming response

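full_response = ""
log("⏳ Receiving the streamed response from the model...")
tool_call_accumulator = None  # accumulates the streamed tool call

for chunk in response_stream:
    if not chunk.choices:
        continue  # skip chunks that carry no choices
    delta = chunk.choices[0].delta

    # Regular assistant text: print it as it arrives
    if hasattr(delta, "content") and delta.content:
        print(delta.content, end="", flush=True)
        full_response += delta.content

    # Tool call fragments
    if hasattr(delta, "tool_calls") and delta.tool_calls:
        tool_call_delta = delta.tool_calls[0]
        if tool_call_accumulator is None:
            tool_call_accumulator = {
                "name": "",
                "arguments": ""
            }

        if hasattr(tool_call_delta.function, "name") and tool_call_delta.function.name:
            tool_call_accumulator["name"] = tool_call_delta.function.name

        if hasattr(tool_call_delta.function, "arguments") and tool_call_delta.function.arguments:
            tool_call_accumulator["arguments"] += tool_call_delta.function.arguments

Explanation:

full_response collects the plain-text part of the reply, and tool_call_accumulator collects the tool call.
The loop iterates over the chunks of the streamed response. Chunks without any choices are skipped.
If a chunk carries regular text (content), it is printed immediately and appended to full_response.
If a chunk carries tool-call data (tool_calls), it is merged into tool_call_accumulator: the function name normally arrives once, while the arguments arrive as fragments of a JSON string that only become parseable after they are all concatenated.
This loop tracks only the first entry of tool_calls, so it handles one tool call per response. Parallel tool calls would need one accumulator per call index.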
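When a response contains more than one tool call, each streamed fragment carries an index that says which call it belongs to. The following is a minimal sketch of accumulating by that index. accumulate_tool_calls and frag are hypothetical helpers, and the chunks are simulated with SimpleNamespace objects that mimic the fields of the streamed deltas (index, id, function.name, function.arguments), so the example runs without a server.

```python
import json
from types import SimpleNamespace as NS

def accumulate_tool_calls(deltas):
    """Merge streamed tool-call fragments into complete calls, keyed by index."""
    calls = {}
    for delta in deltas:
        for fragment in delta.tool_calls or []:
            call = calls.setdefault(
                fragment.index, {"id": None, "name": "", "arguments": ""}
            )
            if fragment.id:
                call["id"] = fragment.id
            if fragment.function and fragment.function.name:
                call["name"] = fragment.function.name
            if fragment.function and fragment.function.arguments:
                call["arguments"] += fragment.function.arguments
    return [calls[i] for i in sorted(calls)]

def frag(index, call_id=None, name=None, arguments=None):
    # Build one simulated tool-call fragment.
    return NS(index=index, id=call_id, function=NS(name=name, arguments=arguments))

# Two interleaved calls, split across several simulated deltas.
simulated = [
    NS(tool_calls=[frag(0, call_id="call_a", name="get_weather", arguments="")]),
    NS(tool_calls=[frag(0, arguments='{"location": "Bei')]),
    NS(tool_calls=[frag(1, call_id="call_b", name="get_weather",
                        arguments='{"location": "Shanghai", ')]),
    NS(tool_calls=[frag(0, arguments='jing", "unit": "celsius"}')]),
    NS(tool_calls=[frag(1, arguments='"unit": "celsius"}')]),
    NS(tool_calls=None),  # a delta with no tool-call data
]

calls = accumulate_tool_calls(simulated)
for call in calls:
    print(call["id"], call["name"], json.loads(call["arguments"]))
```

In the real loop, the deltas would be chunk.choices[0].delta for each chunk of response_stream. Keeping the id of each call also matters later, because a tool result sent back to the model has to reference it.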

11. Call the local function

if tool_call_accumulator:
    func_name = tool_call_accumulator["name"]
    try:
        arguments = json.loads(tool_call_accumulator["arguments"])
        log(f"➡️ Function requested by the model: {func_name}")
        log(f"➡️ Arguments supplied by the model: {arguments}")

        log(f"⚙️ Calling local function: {func_name}...")
        result = tool_functions[func_name](**arguments)
        log(f"✅ Function result: {result}")
    except json.JSONDecodeError:
        log("❌ Failed to parse the arguments as JSON; the content may be incomplete")
        log(f"❗ Raw content: {tool_call_accumulator['arguments']}")
        raise  # re-raise with the original traceback

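Explanation:

If the stream contained a tool call (tool_call_accumulator is not None), the script reads the function name and parses the accumulated argument string with json.loads().
It then looks up the local function in tool_functions and calls it with the parsed arguments as keyword arguments.
If the argument string is not valid JSON, for example because the stream was cut off, the script logs the raw content and re-raises the JSONDecodeError.
A function name that is not in tool_functions raises a KeyError, which this code does not catch. The result is only logged here; it is not sent back to the model.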
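To let the model turn the tool result into a natural-language answer, the usual next step in the OpenAI chat format is a second request whose messages include the assistant's tool call and a "tool" message with the result. The following is a minimal sketch of building that message list. build_followup_messages is a hypothetical helper, and the call ID "call_a" is made up for the example; in practice the ID has to be captured from the stream (the id field of the tool-call delta), which the script in this article does not do.

```python
import json

def build_followup_messages(messages, call_id, name, arguments_json, result):
    """Return a new message list that adds the tool call and its result."""
    return messages + [
        {
            "role": "assistant",
            "content": None,  # the assistant turn consisted only of the tool call
            "tool_calls": [{
                "id": call_id,
                "type": "function",
                "function": {"name": name, "arguments": arguments_json},
            }],
        },
        {"role": "tool", "tool_call_id": call_id, "content": str(result)},
    ]

history = [{"role": "user", "content": "What's the weather like in Beijing today?"}]
followup = build_followup_messages(
    history,
    call_id="call_a",  # made-up ID for illustration
    name="get_weather",
    arguments_json='{"location": "Beijing", "unit": "celsius"}',
    result="Beijing: 25 degrees celsius (simulated)",
)
print(json.dumps(followup, indent=2, ensure_ascii=False))
```

Passing followup as the messages argument of another client.chat.completions.create() call should then produce the final answer, assuming the server's chat template supports tool messages.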

12. Record the end time and print the total elapsed time

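end_time = time.time()
log(f"✅ Total elapsed time: {end_time - start_time:.2f} s")

Explanation:

end_time records when the run finishes. The difference from start_time is printed as the total elapsed time in seconds, covering the model listing, the streamed response, and the local function call.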
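For a streamed response, the delay before the first chunk is often as interesting as the total time. The following is a minimal sketch that measures both by wrapping any iterable. StreamTimer is a hypothetical helper written for this example; it uses time.perf_counter, which is better suited to measuring intervals than time.time, and the stream here is simulated with a generator that sleeps.

```python
import time

class StreamTimer:
    """Record time to first chunk and total time while iterating a stream."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock
        self.start = clock()
        self.first_chunk_at = None
        self.end = None
        self.chunks = 0

    def wrap(self, stream):
        # Yield every item unchanged, noting when the first one arrives.
        for item in stream:
            if self.first_chunk_at is None:
                self.first_chunk_at = self.clock()
            self.chunks += 1
            yield item
        self.end = self.clock()

    @property
    def time_to_first_chunk(self):
        return self.first_chunk_at - self.start

    @property
    def total(self):
        return self.end - self.start

def fake_stream():
    # Simulated stream: a short delay before each chunk.
    for text in ["Hel", "lo", "!"]:
        time.sleep(0.01)
        yield text

timer = StreamTimer()
for piece in timer.wrap(fake_stream()):
    print(piece, end="", flush=True)
print()
print(f"chunks: {timer.chunks}")
print(f"time to first chunk: {timer.time_to_first_chunk:.3f} s")
print(f"total: {timer.total:.3f} s")
```

In the script, the loop would become for chunk in timer.wrap(response_stream), with the timer created just before the request is sent.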

13. Complete code

from openai import OpenAI
import json
import time
from datetime import datetime

# Helper: print a timestamped log message
def log(msg):
    print(f"[{datetime.now().strftime('%H:%M:%S')}] {msg}")

# Create the OpenAI client, pointed at the local vLLM server
client = OpenAI(base_url="http://localhost:8000/v1", api_key="Abc123")

# Define the function tool
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and state or province, e.g. 'San Francisco, CA'"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"]
                }
            },
            "required": ["location", "unit"]
        }
    }
}]

# Local function: return simulated weather data
def get_weather(location: str, unit: str):
    # A real project could call a real weather API here, e.g. with requests
    return f"☀️ {location} current weather: 25°{unit.title()} (simulated)"

# Function lookup table
tool_functions = {"get_weather": get_weather}

# ======================= Start the run =======================

start_time = time.time()
log("🚀 Sending the message to the model...")

# Get the model ID (take the first one the server reports)
model_id = client.models.list().data[0].id

# Send the streaming request
response_stream = client.chat.completions.create(
    model=model_id,
    messages=[{"role": "user", "content": "What's the weather like in Beijing today?"}],
    tools=tools,
    tool_choice="auto",
    stream=True
)

# Collects the plain-text part of the reply
full_response = ""

log("⏳ Receiving the streamed response from the model...")
tool_call_accumulator = None  # accumulates the streamed tool call

for chunk in response_stream:
    # print(chunk)
    if not chunk.choices:
        continue  # skip chunks that carry no choices
    delta = chunk.choices[0].delta

    # Regular assistant text: print it as it arrives
    if hasattr(delta, "content") and delta.content:
        print(delta.content, end="", flush=True)
        full_response += delta.content

    # Tool call fragments (only the first tool call is tracked)
    if hasattr(delta, "tool_calls") and delta.tool_calls:
        tool_call_delta = delta.tool_calls[0]
        if tool_call_accumulator is None:
            # Initialize the accumulator on the first fragment
            tool_call_accumulator = {
                "name": "",
                "arguments": ""
            }

        if hasattr(tool_call_delta.function, "name") and tool_call_delta.function.name:
            tool_call_accumulator["name"] = tool_call_delta.function.name

        if hasattr(tool_call_delta.function, "arguments") and tool_call_delta.function.arguments:
            tool_call_accumulator["arguments"] += tool_call_delta.function.arguments


if tool_call_accumulator:
    func_name = tool_call_accumulator["name"]
    try:
        arguments = json.loads(tool_call_accumulator["arguments"])
        log(f"➡️ Function requested by the model: {func_name}")
        log(f"➡️ Arguments supplied by the model: {arguments}")

        log(f"⚙️ Calling local function: {func_name}...")
        result = tool_functions[func_name](**arguments)
        log(f"✅ Function result: {result}")
    except json.JSONDecodeError:
        log("❌ Failed to parse the arguments as JSON; the content may be incomplete")
        log(f"❗ Raw content: {tool_call_accumulator['arguments']}")
        raise  # re-raise with the original traceback


end_time = time.time()
log(f"✅ Total elapsed time: {end_time - start_time:.2f} s")
