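# TaskingAI project manual (https://github.com/TaskingAI/TaskingAI)

Generated: 2026-06-24 18:32:47 UTC

## Contents

- [TaskingAI overview and system architecture](#page-1)
- [Backend services, database, and REST API](#page-2)
- [Inference service and multi-provider model integration](#page-3)
- [Plugin system, frontend, and deployment operations](#page-4)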

<a id='page-1'></a>

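## TaskingAI overview and system architecture

### Related pages

Related topics: [Backend services, database, and REST API](#page-2), [Inference service and multi-provider model integration](#page-3), [Plugin system, frontend, and deployment operations](#page-4)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page: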

- [README.md](https://github.com/TaskingAI/TaskingAI/blob/main/README.md)
- [backend/app/models/assistant/assistant.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/assistant/assistant.py)
- [backend/app/models/assistant/chat.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/assistant/chat.py)
- [backend/app/models/assistant/message.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/assistant/message.py)
- [backend/app/models/model/model.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/model/model.py)
- [backend/app/models/model/provider.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/model/provider.py)
- [backend/app/models/retrieval/collection.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/retrieval/collection.py)
- [backend/app/models/retrieval/chunk.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/retrieval/chunk.py)
- [backend/app/models/tool/plugin.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/tool/plugin.py)
- [inference/app/models/chat_completion/model.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/models/chat_completion/model.py)
- [inference/app/models/provider.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/models/provider.py)
- [inference/app/cache/rerank.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/cache/rerank.py)
- [inference/app/routes/text_embedding/schema.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/routes/text_embedding/schema.py)
- [inference/app/models/model_config/resources/i18n/en.yml](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/models/model_config/resources/i18n/en.yml)
</details>

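# TaskingAI overview and system architecture

## Project positioning and core capabilities

TaskingAI is a BaaS (Backend-as-a-Service) platform for large language model (LLM) applications. The main repository ships as an all-in-one Docker deployment that brings model management, tools, retrieval-augmented generation (RAG), assistants, and chats under a single unified API. The README positions the project as an "All-In-One LLM Platform" and contrasts it with LangChain (stateless, dependent on external vector stores) and the OpenAI Assistant API (tightly coupled features, a single model provider), emphasizing decoupled modules, multi-tenant support, and a unified API.

Source: `README.md`

The core capabilities fall into four areas:

1. **Unified multi-model access**: a Provider/ModelSchema abstraction layer supports vendors such as OpenAI and Anthropic, as well as locally hosted options such as Ollama, LM Studio, and Local AI.
2. **Modular decoupling**: Tools, RAG Collections, and Language Models are managed independently and can be freely combined into an Assistant.
3. **Stateful and stateless modes**: the platform can host full assistant chats with message history, or serve one-off Chat Completion requests (v0.3.0 introduced an OpenAI-compatible API).
4. **Asynchronous FastAPI backend**: the README describes the backend as using the asynchronous features of Python FastAPI for high-concurrency workloads.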

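## System layers and deployment

The repository is organized around two independent services: **backend** and **inference**. The backend owns the domain objects and persistence logic for tenants, Assistants, Chats, Messages, Collections, Chunks, and Tools/Plugins. The inference service focuses on routing requests to provider-specific model implementations (chat_completion, text_embedding, rerank, and so on). The README recommends starting the whole stack with docker-compose; the default console address is `http://localhost:8080` and the initial credentials are `admin / TaskingAI321`.

Sources: `README.md`, `inference/app/cache/rerank.py`

The diagram below shows the request path from a client to the model providers: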

```mermaid
flowchart LR
  Client[Python SDK / REST] --> Console[TaskingAI Console :8080]
  Console --> Backend[backend FastAPI]
  Backend -->|assistant/chat/tool/retrieval| DB[(Postgres / Redis)]
  Backend --> Inference[inference FastAPI]
  Inference -->|chat_completion| Provider[OpenAI / Anthropic / Custom Host]
  Inference -->|text_embedding| Embed[Embedding Provider]
  Inference -->|rerank| Rerank[Rerank Provider]
```

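At startup the inference layer scans the `providers/` directory and loads the model classes dynamically. For example, `get_provider_model_class` in `inference/app/cache/rerank.py` converts a `provider_id` into a PascalCase class name and imports the matching module with `importlib`, which gives the service a plugin-style extension mechanism.

## Core data model

The backend's domain objects are modeled with Pydantic on top of the `ModelEntity` base class from tkhelper. The table below summarizes the key entities and their responsibilities: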

| Entity | Module path | Main fields | Role |
| --- | --- | --- | --- |
| Assistant | `backend/app/models/assistant/assistant.py` | `model_id`, `memory`, `tools`, `retrievals`, `system_prompt_template` | Aggregates a model, tools, and retrievals; the top-level object for agent orchestration |
| Chat | `backend/app/models/assistant/chat.py` | `chat_id`, `memory`, `metadata` | Persists the context and memory of a single chat |
| Message | `backend/app/models/assistant/message.py` | `role`, `content`, `num_tokens`, `MessageGenerationLog` | User and assistant messages, with generation-log tracing |
| Collection | `backend/app/models/retrieval/collection.py` | `embedding_model_id`, `capacity`, `status` | RAG retrieval collection with a capacity limit and a lifecycle status |
| Chunk | `backend/app/models/retrieval/chunk.py` | `record_id`, `content`, `score` | Retrieved text fragment; the unit of vector recall |
| Model | `backend/app/models/model/model.py` | `provider_id`, `provider_model_id`, `fallbacks` | Registered model instance with an optional fallback chain |
| Provider | `backend/app/models/model/provider.py` | `credentials_schema`, `model_types`, `resources` | Provider metadata and credential constraints |
| Plugin | `backend/app/models/tool/plugin.py` | `bundle_id`, `input_schema`, `output_schema` | Tool/plugin description, converted automatically to a ChatCompletionFunction |

Sources: `backend/app/models/assistant/assistant.py`, `backend/app/models/assistant/chat.py`, `backend/app/models/assistant/message.py`, `backend/app/models/retrieval/collection.py`, `backend/app/models/retrieval/chunk.py`, `backend/app/models/model/model.py`, `backend/app/models/model/provider.py`, `backend/app/models/tool/plugin.py`

An Assistant references external modules through `tools: List[ToolRef]` and `retrievals: List[RetrievalRef]`. The `fallbacks` field on Model enables automatic fallback when the primary model fails, and Collection's `has_available_capacity` method checks the remaining capacity before a write so that the collection is not overfilled.

## Model configuration and localized descriptions

`inference/app/models/model_config/resources/i18n/en.yml` provides localized names and descriptions for model parameters (temperature, top_p, max_tokens, frequency_penalty, and others) so that the frontend can display them per language. `TextEmbeddingRequest` in `inference/app/routes/text_embedding/schema.py` accepts `proxy` (which must start with `https://`; see the validation in `inference/app/models/chat_completion/model.py`), `custom_headers` (at most 16 key-value pairs), and credentials supplied either as `credentials` or as `encrypted_credentials`. Together these allow flexible access across tenants, proxies, and custom headers. Provider-level switches such as `enable_proxy`, `enable_custom_headers`, and `return_token_usage` are maintained centrally in the `Provider` model in `inference/app/models/provider.py`.

Sources: `inference/app/routes/text_embedding/schema.py`, `inference/app/models/model_config/resources/i18n/en.yml`, `inference/app/models/chat_completion/model.py`, `inference/app/models/provider.py`

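## Community hot topics and known issues

- **Custom model URLs**: users have long asked for custom URLs for OpenAI and other models (issue #11); the maintainers responded in part with the `custom_host` Provider and the `proxy` field.
- **Proxy and usage management**: issue #105 proposes token-usage caching and proxy integration.
- **Local LLM documentation**: issue #58 notes that LM Studio has trouble loading NVIDIA drivers inside Docker and suggests running the local LLM on the host while TaskingAI runs in containers.
- **custom_host tool calling**: issue #366 reports an error when the `custom_host` Provider sends both `functions` and `tool_calls`. The error message is `Invalid parameter: 'tool_calls' cannot be used when 'functions' are present. Please use 'tools' instead of 'functions'`.
- **DeepSeek model creation**: issue #365 (now closed) reports an error in the backend web console when creating a DeepSeek base model; issue #364 further requests support for OpenAI O3-mini and DeepSeek R1.
- **First-login 404**: issue #370 reports a 404 on `POST /api/v1/admins/login` at the login step after a Docker start.
- **Security reports**: issue #375 (DALL-E 3 image tool, `save_url_image`) and issue #374 (QR Code generator, `save_base64_image`) report path traversal through the `project_id` parameter that allows files to be written anywhere on the server.
- **Offline operation**: issue #363 reports that the plugin, inference, server, and nginx containers exit on their own in an offline environment, possibly because of checks that need external network access.

Sources: `inference/app/models/chat_completion/model.py`, `README.md`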

---

## See Also

- Model and Provider integration in detail
- Assistant / Chat / Message lifecycle
- Retrieval-augmented generation (RAG): Collection and Chunk
- Plugin and tool-calling protocol

---

<a id='page-2'></a>

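## Backend services, database, and REST API

### Related pages

Related topics: [TaskingAI overview and system architecture](#page-1), [Inference service and multi-provider model integration](#page-3), [Plugin system, frontend, and deployment operations](#page-4)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page: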

- [README.md](https://github.com/TaskingAI/TaskingAI/blob/main/README.md)
- [backend/app/models/assistant/assistant.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/assistant/assistant.py)
- [backend/app/models/assistant/message.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/assistant/message.py)
- [backend/app/models/model/model.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/model/model.py)
- [backend/app/models/model/provider.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/model/provider.py)
- [backend/app/models/tool/plugin.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/tool/plugin.py)
- [backend/app/models/retrieval/chunk.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/retrieval/chunk.py)
- [inference/app/models/__init__.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/models/__init__.py)
- [inference/app/models/base.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/models/base.py)
- [inference/app/models/chat_completion/model.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/models/chat_completion/model.py)
- [inference/app/models/text_embedding/model.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/models/text_embedding/model.py)
- [inference/app/models/provider.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/models/provider.py)
- [inference/app/routes/text_embedding/route.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/routes/text_embedding/route.py)
- [inference/app/routes/text_embedding/schema.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/routes/text_embedding/schema.py)
- [inference/app/routes/rerank/route.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/routes/rerank/route.py)
- [inference/app/routes/rerank/schema.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/routes/rerank/schema.py)
- [inference/app/cache/rerank.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/cache/rerank.py)
</details>

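# Backend services, database, and REST API

## System overview and architecture

TaskingAI follows a BaaS (Backend-as-a-Service) model that decouples the client-facing RESTful API from the provider-facing inference adapter layer. A deployment is typically orchestrated with Docker Compose as several containers: a database, `backend` (the main FastAPI service), `inference` (the standalone inference service), `console` (the management console), and a reverse proxy [`README.md`](https://github.com/TaskingAI/TaskingAI/blob/main/README.md). The `backend` service handles create, read, update, and delete operations for the core entities (models, tools, assistants, chats, retrieval) and exposes the REST API. The `inference` service translates abstract chat completion, text embedding, and rerank requests into real calls to providers such as OpenAI, Anthropic, or a custom host [`inference/app/models/chat_completion/model.py`](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/models/chat_completion/model.py).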

```mermaid
flowchart LR
    Client["Console / SDK / OpenAI-compatible client"]
    NGINX[nginx reverse proxy]
    Backend["backend (FastAPI)<br/>routes + service layer + data access"]
    Inference["inference (FastAPI)<br/>chat_completion / text_embedding / rerank"]
    DB[(PostgreSQL / Redis)]
    Providers["External LLM providers<br/>OpenAI / Anthropic / Ollama / custom host"]

    Client --> NGINX --> Backend
    Backend --> DB
    Backend --> Inference
    Inference --> Providers
    Providers --> Inference --> Backend --> Client
```

## Data model and persistence

Entities under `backend/app/models`, such as Assistant, inherit from `tkhelper.models.ModelEntity`, use Pydantic for strongly typed validation, and provide a `build(row)` factory method that assembles a database row into a domain object [`backend/app/models/assistant/assistant.py`](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/assistant/assistant.py). The core entities are:

| Entity | File | Key fields | Purpose |
|---|---|---|---|
| Assistant | `backend/app/models/assistant/assistant.py` | `assistant_id`, `model_id`, `system_prompt_template`, `memory`, `tools`, `retrievals` | Reusable agent configuration |
| Message | `backend/app/models/assistant/message.py` | `MessageRole` (user/assistant), `MessageGenerationLog` | Chat messages and generation logs |
| Model | `backend/app/models/model/model.py` | `model_id`, `provider_id`, `provider_model_id`, `encrypted_credentials`, `fallbacks` | Model instance created by a user |
| Provider | `backend/app/models/model/provider.py` | `provider_id`, `credentials_schema`, `model_types` | Model provider metadata |
| Plugin | `backend/app/models/tool/plugin.py` | `bundle_id`, `plugin_id`, `input_schema`, `output_schema` | Built-in or custom tool description |
| Chunk | `backend/app/models/retrieval/chunk.py` | `chunk_id`, `record_id`, `collection_id`, `num_tokens`, `score` | RAG retrieval result fragment |

Stored credentials use the field name `encrypted_credentials`. A request may carry either `credentials` or `encrypted_credentials`; sending the encrypted form keeps plaintext keys off the network. For example, [`inference/app/routes/text_embedding/schema.py`](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/routes/text_embedding/schema.py) requires at least one of the two to be present. The Provider model also records `model_types` (such as `chat_completion`, `text_embedding`, `rerank`, and `wildcard`), and its `has_model_type()` method decides whether a model of a given type can be routed to that provider [`backend/app/models/model/provider.py`](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/model/provider.py).

## REST API routing layer

### Inference routes

Sub-routes under `inference/app/routes` are split by model capability and share one request-body pattern: `model_schema_id` + `provider_model_id`, plus optional `proxy`, `custom_headers`, `credentials`/`encrypted_credentials`, and `configs` [`inference/app/routes/text_embedding/schema.py`](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/routes/text_embedding/schema.py). `custom_headers` is limited to 16 key-value pairs, with each key shorter than 64 characters and each value shorter than 512; it is typically used with self-hosted proxies or private deployments. During request validation the `proxy` field is compared against `CONFIG.PROXY_BLACKLIST`, and a match rejects the request [`inference/app/routes/rerank/route.py`](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/routes/rerank/route.py).
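The request constraints above can be sketched as a plain-Python check. This is a minimal illustration, not the project's Pydantic schema: the function name `validate_inference_request`, the `proxy_blacklist` argument (standing in for `CONFIG.PROXY_BLACKLIST`), and the choice to compare the blacklist against the proxy hostname are all assumptions.

```python
from urllib.parse import urlparse

MAX_HEADER_PAIRS = 16
MAX_KEY_LEN = 64      # keys must be shorter than this
MAX_VALUE_LEN = 512   # values must be shorter than this


def validate_inference_request(body: dict, proxy_blacklist: frozenset = frozenset()) -> dict:
    """Return the body unchanged if it passes the checks, else raise ValueError."""
    # At least one credential form must be present.
    if not body.get("credentials") and not body.get("encrypted_credentials"):
        raise ValueError("either credentials or encrypted_credentials is required")

    headers = body.get("custom_headers") or {}
    if len(headers) > MAX_HEADER_PAIRS:
        raise ValueError("custom_headers allows at most 16 key-value pairs")
    for key, value in headers.items():
        if len(key) >= MAX_KEY_LEN or len(value) >= MAX_VALUE_LEN:
            raise ValueError(f"custom header {key!r} exceeds the length limits")

    proxy = body.get("proxy")
    if proxy is not None:
        if not proxy.startswith("https://"):
            raise ValueError("proxy must start with https://")
        # Assumption: the blacklist holds hostnames; the real check may differ.
        host = urlparse(proxy).hostname or ""
        if host in proxy_blacklist:
            raise ValueError("proxy host is blacklisted")
    return body
```

A caller would run this before forwarding the request to a provider, so that malformed headers or a disallowed proxy fail fast with a clear message.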

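### Management routes

The `backend` service exposes resource endpoints of the form `/api/v1/...` through FastAPI, covering assistants, chats, messages, retrievals, collections, records, chunks, tools, bundles, models, providers, apikeys, and other resources. The community report of a first-login 404 on `POST /api/v1/admins/login` ([Issue #370](https://github.com/TaskingAI/TaskingAI/issues/370)) concerns this group of admin authentication routes. A likely explanation is that the login was attempted before the service had finished initializing, which would surface as a route miss, although the issue does not confirm this. SDKs and OpenAI-compatible clients use the separate `/openai/v1/...` namespace; the OpenAI-compatible chat completion API was formally introduced in v0.3.0 ([Release v0.3.0](https://github.com/TaskingAI/TaskingAI/releases/tag/v0.3.0)).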

## Inference service and model adapters

The `inference/app/models` module unifies the calling protocol of the three capabilities behind abstract base classes: chat completion models inherit from `BaseChatCompletionModel`, text embedding models from `BaseTextEmbeddingModel`, and rerank models from `BaseRerankModel` [`inference/app/models/__init__.py`](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/models/__init__.py). Provider implementations live under `providers/<provider_id>/` and are loaded dynamically by the `<ProviderId><Task>Model` naming convention, for example `CustomHostRerankModel` and `OpenaiRerankModel` [`inference/app/cache/rerank.py`](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/cache/rerank.py).

Each Provider entity records capability flags such as `enable_proxy`, `enable_custom_headers`, `return_token_usage`, and `default_credential_verification_model_type`; capabilities that are not enabled are filtered out during request validation [`inference/app/models/provider.py`](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/models/provider.py). The `Model` entity also stores `properties` (such as `function_call` and `streaming`) and `fallbacks.model_list`, so that fallbacks are tried in order when the primary model fails [`backend/app/models/model/model.py`](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/model/model.py). When a provider does not return token usage, `BaseChatCompletionModel` calls the local `estimate_input_tokens` to estimate it [`inference/app/models/chat_completion/model.py`](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/models/chat_completion/model.py).
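The ordered fallback behaviour described for `fallbacks.model_list` can be sketched as follows. This is only an illustration of the idea: `complete_with_fallbacks` and the `invoke` callback are hypothetical names, and the real backend may retry, log, or filter errors differently.

```python
from typing import Callable, Iterable, List, Tuple


def complete_with_fallbacks(
    primary_model_id: str,
    fallback_model_ids: Iterable[str],
    invoke: Callable[[str], str],
) -> Tuple[str, str]:
    """Try the primary model, then each fallback in order.

    Returns (model_id_that_answered, result) and raises RuntimeError
    if every candidate fails.
    """
    errors: List[Exception] = []
    for model_id in [primary_model_id, *fallback_model_ids]:
        try:
            return model_id, invoke(model_id)
        except Exception as exc:  # a real service would catch narrower errors
            errors.append(exc)
    raise RuntimeError(f"all {len(errors)} models failed") from errors[-1]
```

Returning the identifier of the model that actually answered lets the caller record which fallback was used, which is useful when comparing cost or quality across models.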

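## Common problems and community feedback

- **Custom model URLs / local LLM integration**: [Issue #11](https://github.com/TaskingAI/TaskingAI/issues/11) and [Issue #58](https://github.com/TaskingAI/TaskingAI/issues/58) repeatedly ask for native support for local services such as LM Studio, Ollama, and Local AI. These are currently connected through the `custom_host` Provider. Note that a conflict between `tool_calls` and `functions` arises when the local service follows the older OpenAI protocol rather than the newer tools protocol ([Issue #366](https://github.com/TaskingAI/TaskingAI/issues/366)).
- **Proxy and caching**: [Issue #105](https://github.com/TaskingAI/TaskingAI/issues/105) proposes a unified proxy at the REST layer for consumption management and caching. The existing `proxy` field only redirects upstream calls; for the blacklist check see [`inference/app/routes/rerank/route.py`](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/routes/rerank/route.py).
- **Multi-tenancy**: [Issue #342](https://github.com/TaskingAI/TaskingAI/issues/342) concerns tenant isolation. The current architecture isolates rows by `project_id` at the database layer, but the path traversal reports ([Issue #374](https://github.com/TaskingAI/TaskingAI/issues/374), [Issue #375](https://github.com/TaskingAI/TaskingAI/issues/375)) show that `project_id` still needs strict input validation before plugin code uses it to write files.
- **Model registration failure**: the error reported in [Issue #365](https://github.com/TaskingAI/TaskingAI/issues/365) when creating a DeepSeek model may be related to how `provider_id` is handled on the `credentials_schema` validation path; this should be confirmed against the `to_dict()` localization logic in [`backend/app/models/model/provider.py`](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/model/provider.py).

## See Also

- Assistant and chat subsystem: Assistant / Chat / Message model layer
- Retrieval-augmented generation (RAG): Collection / Record / Chunk data flow
- Tools and plugins: Plugin Bundle and function-calling conventions
- Inference routes: chat completion / text embedding / rerank endpoint details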

---

<a id='page-3'></a>

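## Inference service and multi-provider model integration

### Related pages

Related topics: [TaskingAI overview and system architecture](#page-1), [Backend services, database, and REST API](#page-2), [Plugin system, frontend, and deployment operations](#page-4)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page: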

- [README.md](https://github.com/TaskingAI/TaskingAI/blob/main/README.md)
- [inference/app/models/provider.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/models/provider.py)
- [inference/app/cache/rerank.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/cache/rerank.py)
- [inference/app/models/__init__.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/models/__init__.py)
- [inference/app/routes/text_embedding/schema.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/routes/text_embedding/schema.py)
- [inference/app/routes/rerank/schema.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/routes/rerank/schema.py)
- [inference/app/models/model_config/resources/configs/top_k.yml](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/models/model_config/resources/configs/top_k.yml)
- [inference/app/models/model_config/resources/configs/top_p.yml](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/models/model_config/resources/configs/top_p.yml)
- [backend/app/models/model/provider.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/model/provider.py)
- [backend/app/models/model/model.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/model/model.py)
</details>

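# Inference service and multi-provider model integration

## 1. Overview and positioning

TaskingAI is a BaaS platform for LLM applications. One of its core goals is to hide the differences between model providers behind a unified API, so that users can reach OpenAI, Anthropic, and locally deployed backends such as Ollama, LM Studio, and Local AI in a consistent way. The platform consists of two services, `backend` (resource management and control plane) and `inference` (the execution plane), and the inference service carries the multi-provider integration. Source: `README.md:1-30`

The inference service exposes three core capabilities: **chat_completion**, **text_embedding**, and **rerank**. Each follows the same pattern: provider abstraction, dynamic loading of model classes, and injection of configuration parameters. Source: `inference/app/models/__init__.py:1-7`

## 2. Model provider abstraction

### 2.1 The Provider entity

Inside the inference service, a Provider is the uniform description of a model backend. The `Provider` class defined in `inference/app/models/provider.py` holds metadata such as `provider_id` and `credentials_schema`, and also carries switches for proxy support, custom headers, and token-usage reporting, for example `enable_proxy`, `enable_custom_headers`, and `return_token_usage`. Source: `inference/app/models/provider.py:6-39`

The backend control plane uses the corresponding `Provider` class in `backend/app/models/model/provider.py` to manage the list of available providers. Its `has_model_type()` method checks whether the provider supports a given `ModelType` (for example `chat_completion`, `text_embedding`, or `rerank`), and the `WILDCARD` value marks a provider that supports every model type. Source: `backend/app/models/model/provider.py:1-23`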
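The wildcard rule can be illustrated with a few lines of Python. The enum values mirror the type names quoted in this manual, but the standalone `has_model_type` function below is a simplified sketch, not the method from the repository.

```python
from enum import Enum
from typing import List


class ModelType(str, Enum):
    CHAT_COMPLETION = "chat_completion"
    TEXT_EMBEDDING = "text_embedding"
    RERANK = "rerank"
    WILDCARD = "wildcard"


def has_model_type(provider_model_types: List[ModelType], wanted: ModelType) -> bool:
    """A provider matches if it lists the wanted type or the wildcard."""
    return wanted in provider_model_types or ModelType.WILDCARD in provider_model_types
```

With this rule a provider that declares only `wildcard` (a generic custom host, for instance) accepts any model type, while a provider that declares only `text_embedding` rejects a rerank model.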

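### 2.2 Dynamic loading of provider model classes

The inference service instantiates provider model classes lazily, on demand. `get_provider_model_class()` in `inference/app/cache/rerank.py` converts `provider_id` to PascalCase, appends the `RerankModel` suffix (for example `openai` becomes `OpenaiRerankModel`), and then imports the class from the `providers.<provider_id>.rerank` module with `importlib.import_module()`. The module-level `models` dictionary acts as an in-process cache and avoids repeated instantiation. Source: `inference/app/cache/rerank.py:1-36`

`load_all_rerank_models()` additionally scans the `providers/` directory and loads every provider that has a `rerank.py`. A failed load is logged without interrupting the service, which is what makes the mechanism plugin-like. Source: `inference/app/cache/rerank.py:38-49`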

```mermaid
flowchart LR
    A[Request arrives] --> B{Provider cached?}
    B -- hit --> C[Reuse model instance]
    B -- miss --> D["importlib import<br/>providers.PROVIDER_ID.rerank"]
    D --> E["Instantiate *RerankModel"]
    E --> F[Write to models dict]
    F --> C
    C --> G[Run rerank inference]
```
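The cache-or-import flow in the diagram can be sketched in a few lines. This is an approximation written for this manual: the helper names `rerank_class_name` and `get_rerank_model`, the `package` parameter, and the no-argument constructor are assumptions, and the repository code may differ in detail.

```python
import importlib
from typing import Any, Dict

_models: Dict[str, Any] = {}  # process-wide cache: provider_id -> model instance


def rerank_class_name(provider_id: str) -> str:
    """custom_host -> CustomHostRerankModel, openai -> OpenaiRerankModel."""
    return "".join(part.capitalize() for part in provider_id.split("_")) + "RerankModel"


def get_rerank_model(provider_id: str, package: str = "providers") -> Any:
    """Return a cached instance, importing the provider module on first use."""
    if provider_id not in _models:
        module = importlib.import_module(f"{package}.{provider_id}.rerank")
        model_class = getattr(module, rerank_class_name(provider_id))
        _models[provider_id] = model_class()
    return _models[provider_id]
```

Because the class name is derived from the directory name, adding a provider under this scheme only requires a new `providers/<provider_id>/rerank.py` module with a correctly named class; no central registry needs editing.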

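## 3. Inference API routes and request models

### 3.1 Common request fields

The request bodies of the `text_embedding` and `rerank` routes are structurally almost identical. Both accept a proxy address in `proxy`, up to 16 custom headers in `custom_headers` (each key shorter than 64 characters, each value shorter than 512), and API keys supplied as either `credentials` or `encrypted_credentials`. This lets users connect private gateways, proxies, or self-hosted services without changing code. Sources: `inference/app/routes/text_embedding/schema.py:18-58`, `inference/app/routes/rerank/schema.py:1-40`

### 3.2 Text embedding

`TextEmbeddingRequest` accepts a string or a list of strings as input and uses the `input_type` field to distinguish `document` from `query`. This follows the common RAG practice of embedding with `query` on the retrieval side and with `document` on the indexing side. On the response side, `TextEmbeddingResponse` wraps the array of vectors. Source: `inference/app/routes/text_embedding/schema.py:1-72`

### 3.3 Rerank

Every element of the `RerankResponse.data.results` list contains `index`, `document.text`, and `relevance_score`. This keeps the response compatible with mainstream rerank APIs such as Cohere and Jina and eases client migration. Source: `inference/app/routes/rerank/schema.py:40-70`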
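As a small illustration of consuming that response shape, the sketch below picks the top-scoring documents from a rerank result. The field names come from the description above; the helper `top_documents` and the sample payload are invented for the example.

```python
from typing import Dict, List, Tuple


def top_documents(response: Dict, top_n: int) -> List[Tuple[int, str, float]]:
    """Return (original_index, text, score) for the top_n highest-scoring results."""
    results = response["data"]["results"]
    ranked = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    return [
        (r["index"], r["document"]["text"], r["relevance_score"])
        for r in ranked[:top_n]
    ]


# Illustrative payload only; real scores depend on the rerank model.
sample = {
    "data": {
        "results": [
            {"index": 0, "document": {"text": "pricing page"}, "relevance_score": 0.12},
            {"index": 1, "document": {"text": "refund policy"}, "relevance_score": 0.91},
            {"index": 2, "document": {"text": "shipping times"}, "relevance_score": 0.47},
        ]
    }
}
best = top_documents(sample, top_n=2)
```

The `index` field refers back to the position of the document in the original request, so the caller can recover any metadata it kept alongside the input list.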

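## 4. Model configuration system

Model behaviour in the inference service is controlled by a set of tunable parameters (temperature, top_p, top_k, max_tokens, and others). These are declared as YAML resource files and are loaded and validated by the model configuration system. For example, `top_p.yml` describes a float parameter with range `[0.0, 1.0]`, default `0.9`, and step `0.01`; `top_k.yml` defines an integer parameter with range `[0, 10]` and step `1`. Sources: `inference/app/models/model_config/resources/configs/top_p.yml:1-7`, `inference/app/models/model_config/resources/configs/top_k.yml:1-7`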
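A range-and-step check of this kind might look like the sketch below. The `ParamSpec` class is hypothetical and only mirrors the numbers quoted above; whether the service enforces the step server-side, or uses it only for the console slider, is not established by this manual.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional, Union

Number = Union[int, float]


@dataclass(frozen=True)
class ParamSpec:
    name: str
    minimum: Number
    maximum: Number
    step: Number
    default: Optional[Number] = None

    def validate(self, value: Optional[Number]) -> Number:
        """Return value (or the default) if it is in range and on the step grid."""
        if value is None:
            if self.default is None:
                raise ValueError(f"{self.name} has no default")
            return self.default
        if not self.minimum <= value <= self.maximum:
            raise ValueError(f"{self.name} must be in [{self.minimum}, {self.maximum}]")
        # Decimal(str(...)) sidesteps binary floating-point noise in the modulo.
        offset = Decimal(str(value)) - Decimal(str(self.minimum))
        if offset % Decimal(str(self.step)) != 0:
            raise ValueError(f"{self.name} must be a multiple of {self.step}")
        return value


TOP_P = ParamSpec("top_p", 0.0, 1.0, 0.01, default=0.9)
TOP_K = ParamSpec("top_k", 0, 10, 1)  # no default is quoted for top_k
```

Keeping the bounds in data rather than in code is the same idea as the YAML resource files: a new parameter needs a new declaration, not a new validator.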

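Through methods such as `allow_function_call()` and `allow_streaming()`, the backend `Model` entity reports whether a model instance supports tool calling and streaming output. These capabilities are consulted when a user creates an agent or configures a chat. Source: `backend/app/models/model/model.py:1-58`

## 5. Common integration problems and community feedback

According to community feedback, **custom endpoints (custom_host)** and **local models** are the most frequent pain points in multi-provider integration. Issue #11 asks for custom URLs for OpenAI models; issue #58 points out the lack of clear documentation for combining local LLMs such as LM Studio with a Docker deployment of TaskingAI; issue #105 suggests finer-grained token consumption and cache control for proxy scenarios. These requests line up with the `enable_proxy` and `enable_custom_headers` fields already present on the `Provider` entity and are a focus for future iterations. Sources: issue #11, issue #58, issue #105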

## See Also

- [inference/app/main.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/main.py): inference service entry point and route registration
- [inference/app/cache/chat_completion.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/cache/chat_completion.py): chat completion model cache
- [backend/app/services/model/](https://github.com/TaskingAI/TaskingAI/tree/main/backend/app/services/model): backend model management services
- [README.md](https://github.com/TaskingAI/TaskingAI/blob/main/README.md): project overview and feature list

---

<a id='page-4'></a>

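## Plugin system, frontend, and deployment operations

### Related pages

Related topics: [TaskingAI overview and system architecture](#page-1), [Backend services, database, and REST API](#page-2), [Inference service and multi-provider model integration](#page-3)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page: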

- [README.md](https://github.com/TaskingAI/TaskingAI/blob/main/README.md)
- [backend/app/models/tool/plugin.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/tool/plugin.py)
- [backend/app/models/assistant/assistant.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/assistant/assistant.py)
- [backend/app/models/assistant/chat.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/assistant/chat.py)
- [backend/app/models/assistant/message.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/assistant/message.py)
- [backend/app/models/model/model.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/model/model.py)
- [backend/app/models/model/provider.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/model/provider.py)
- [backend/app/models/retrieval/collection.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/retrieval/collection.py)
- [backend/app/models/retrieval/chunk.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/retrieval/chunk.py)
- [inference/app/cache/rerank.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/cache/rerank.py)
- [inference/app/models/provider.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/models/provider.py)
- [inference/app/models/chat_completion/model.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/models/chat_completion/model.py)
- [inference/app/routes/text_embedding/schema.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/routes/text_embedding/schema.py)
- [inference/app/routes/rerank/schema.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/routes/rerank/schema.py)
- [backend/app/models/retrieval/text_splitter/token_handler.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/retrieval/text_splitter/token_handler.py)
</details>

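# Plugin system, frontend, and deployment operations

This page covers three supporting modules of the TaskingAI platform: the plugin system, the frontend Playground/Console, and Docker-based deployment and operations.

## 1. Plugin system

### 1.1 Plugin model and function definition

The `Plugin` model is the core carrier of the plugin system and is defined in [backend/app/models/tool/plugin.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/tool/plugin.py). Each plugin belongs to a `bundle_id` and maintains an `input_schema` and an `output_schema`: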

```python
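from typing import Dict

from pydantic import BaseModel

# Excerpt: ParameterSchema is defined elsewhere in the backend codebase
# and describes a single typed parameter.


class Plugin(BaseModel):
    bundle_id: str
    plugin_id: str
    name: str
    description: str
    input_schema: Dict[str, ParameterSchema]   # keyed by parameter name
    output_schema: Dict[str, ParameterSchema]  # keyed by output field name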

`function_def` converts a plugin into an OpenAI `ChatCompletionFunction`, which the Assistant injects into the `functions`/`tools` field of the model request. The conversion keeps only `type`, `enum`, and `description`, expands `_array` types into the standard JSON Schema `array` + `items` form, and localizes descriptions through `i18n_text(bundle_id, ..., "en")` (source: [plugin.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/tool/plugin.py)).
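The conversion can be pictured with the following sketch. It assumes, for illustration only, that each schema entry is a dict with a `type` string such as `"string"` or `"string_array"` plus optional `enum`, `description`, and `required` keys; the function name `to_function_parameters` is hypothetical and the localization step is omitted.

```python
from typing import Any, Dict, List


def to_function_parameters(input_schema: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Build a JSON Schema 'parameters' object from a plugin input schema."""
    properties: Dict[str, Any] = {}
    required: List[str] = []
    for name, spec in input_schema.items():
        type_name = spec["type"]
        if type_name.endswith("_array"):
            # "string_array" -> {"type": "array", "items": {"type": "string"}}
            item_type = type_name[: -len("_array")]
            prop: Dict[str, Any] = {"type": "array", "items": {"type": item_type}}
        else:
            prop = {"type": type_name}
        # Only enum and description are carried over; other keys are dropped.
        for key in ("enum", "description"):
            if spec.get(key) is not None:
                prop[key] = spec[key]
        properties[name] = prop
        if spec.get("required"):
            required.append(name)
    return {"type": "object", "properties": properties, "required": required}
```

Dropping every key except `type`, `enum`, and `description` keeps the function definition small, which matters because the definition is sent to the model with every request.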

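### 1.2 Parameter validation and security

`Plugin.validate_input` validates the arguments before a call: it walks `input_schema`, checks that no `required` field is missing, and performs runtime type checks against the declared types such as `ParameterType.STRING` (source: [plugin.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/tool/plugin.py)).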
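A required-field and type check of this kind could look like the sketch below. The type-name mapping `_PYTHON_TYPES` and the dict-based schema are assumptions made for the example; the repository uses its own `ParameterType` enum and schema classes.

```python
from typing import Any, Dict

# Hypothetical mapping from declared type names to Python types.
_PYTHON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
}


def validate_input(input_schema: Dict[str, Dict[str, Any]], params: Dict[str, Any]) -> None:
    """Raise ValueError on a missing required field or a wrongly typed value."""
    for name, spec in input_schema.items():
        if name not in params:
            if spec.get("required"):
                raise ValueError(f"missing required parameter: {name}")
            continue
        expected = _PYTHON_TYPES.get(spec["type"])
        if expected is None:
            continue  # unknown declared type: nothing to check in this sketch
        value = params[name]
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        if isinstance(value, bool) and expected is not bool:
            raise ValueError(f"parameter {name} must be of type {spec['type']}")
        if not isinstance(value, expected):
            raise ValueError(f"parameter {name} must be of type {spec['type']}")
```

Type checks of this sort confirm shape only. They say nothing about whether a string is safe to use in a file path or a URL, which is why content-level checks are still needed.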

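Two high-severity problems recently reported by the community ([#375](https://github.com/TaskingAI/TaskingAI/issues/375) for the DALL-E 3 tool and [#374](https://github.com/TaskingAI/TaskingAI/issues/374) for the QR Code generator) both occur in functions such as `save_url_image` and `save_base64_image` that concatenate `project_id` directly into a file path. New plugins should never pass user-controlled fields straight into `os.path.join`; such values must go through `validate_input` and then a second check against a path allowlist or sandbox.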
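One way to apply that second check is sketched below. It is a generic hardening pattern, not the fix adopted by the project: the helper `safe_image_path`, the identifier pattern, and the directory layout `base_dir/project_id/filename` are all assumptions.

```python
import os
import re

# Assumed identifier format: letters, digits, underscore, hyphen.
_PROJECT_ID_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")


def safe_image_path(base_dir: str, project_id: str, filename: str) -> str:
    """Return an absolute path under base_dir, or raise ValueError."""
    # 1. Allowlist the identifier so that separators and dots cannot appear.
    if not _PROJECT_ID_RE.fullmatch(project_id):
        raise ValueError("invalid project_id")
    # 2. The file name must be a bare name, not a path.
    if os.path.basename(filename) != filename or filename in ("", ".", ".."):
        raise ValueError("invalid filename")
    # 3. Defence in depth: resolve the result and confirm it stays inside base_dir.
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, project_id, filename))
    if os.path.commonpath([base, target]) != base:
        raise ValueError("path escapes the storage directory")
    return target
```

The allowlist alone would stop the reported `project_id` payloads; the resolved-path comparison additionally covers symlinks and any later change that loosens the pattern.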

### 1.3 Attaching plugins to an Assistant

An Assistant attaches several plugins at once through `tools: List[ToolRef]`, and the same plugin can be reused by many Assistants, which decouples plugins from agents (source: [backend/app/models/assistant/assistant.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/assistant/assistant.py)).

## 2. Frontend console and Playground

The README lists the UI Console as one of six key features, in the words "Intuitive UI Console: Simplifies project management and allows in-console workflow testing" (source: [README.md](https://github.com/TaskingAI/TaskingAI/blob/main/README.md)). The main console capabilities are:

| Module | Backend data source | Main capabilities |
| --- | --- | --- |
| Model management | `Model` / `Provider` | Register OpenAI, Anthropic, Ollama, LM Studio; custom URLs through custom_host (community #11) |
| Plugin management | `Plugin` | Browse built-in and custom plugins, inspect schemas |
| Retrieval / knowledge base | `Collection` / `Chunk` | Document chunking, vector retrieval |
| Assistant & Chat | `Assistant` / `Chat` / `Message` | Debug conversations, inspect `MessageGenerationLog` |
| Playground (v0.3.0+) | chat completion API | Markdown rendering, image upload, OpenAI-compatible protocol |

v0.3.0 added Markdown rendering and image upload to the Playground and introduced the "OpenAI-compatible chat completion API" (source: [v0.3.0 Release Notes](https://github.com/TaskingAI/TaskingAI/releases/tag/v0.3.0)). `MessageGenerationLog` carries `session_id` / `event` / `event_step` / `timestamp`, which lets the console replay message generation step by step and helps with debugging function calls and the RAG retrieval chain (source: [backend/app/models/assistant/message.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/assistant/message.py)).

## 3. Deployment and operations

### 3.1 Containers and component topology

TaskingAI is deployed in one step with Docker Compose; the README calls this "One-Click to Production". The source code shows the following:

- **Backend**: built on FastAPI for asynchronous operation; `Chat.memory` keeps short-term context in Redis (source: [backend/app/models/assistant/chat.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/assistant/chat.py)).
- **PostgreSQL**: persists `ModelEntity` objects such as `Collection`, `Chunk`, and `Assistant` (source: [backend/app/models/retrieval/collection.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/retrieval/collection.py)).
- **Inference service**: deployed independently and exposes three kinds of interface: chat completion, text embedding, and rerank. `load_all_rerank_models` scans `providers/<provider_id>/rerank.py` and normalizes class names to `<TitleCase>RerankModel` (source: [inference/app/cache/rerank.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/cache/rerank.py)); `inference/app/models/provider.py` maintains Provider-level capability switches such as `enable_proxy`, `enable_custom_headers`, and `return_token_usage` (source: [inference/app/models/provider.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/models/provider.py)).

### 3.2 Common deployment problems

- **First-login 404**: `POST /api/v1/admins/login` returns 404 ([#370](https://github.com/TaskingAI/TaskingAI/issues/370)). Possible causes include an unfinished database migration or an Nginx upstream that does not include `backend:5000`; the issue itself does not confirm either.
- **Containers exit in offline deployments**: `plugin / inference / server / nginx` restart immediately after starting ([#363](https://github.com/TaskingAI/TaskingAI/issues/363)). The suspected cause is a connectivity check against external providers at startup. For offline environments, the suggested workaround is to disable `enable_proxy` and pre-provision local models (Ollama / LM Studio).
- **custom_host tool calling fails**: a third-party OpenAI-compatible gateway rejects requests that carry both `functions` and `tool_calls` ([#366](https://github.com/TaskingAI/TaskingAI/issues/366)). A workaround is to check `Model.allow_function_call()` together with `is_custom_host()` (source: [backend/app/models/model/model.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/model/model.py)) and to send only `tools` in the request body.
- **Proxy and usage management**: community issue #105 suggests a unified proxy for consumption tracking and caching. The code already provides the `proxy` field (only values starting with `https://` are allowed) and the `custom_headers` field (at most 16 key-value pairs, with length limits of 64 for keys and 512 for values) (sources: [inference/app/routes/text_embedding/schema.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/routes/text_embedding/schema.py) and [inference/app/routes/rerank/schema.py](https://github.com/TaskingAI/TaskingAI/blob/main/inference/app/routes/rerank/schema.py)).
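For the custom_host tool-calling item in the list above, the "send only `tools`" workaround can be sketched as a payload rewrite. The wrapper shape `{"type": "function", "function": ...}` is the OpenAI chat completions tools format; the helper name `use_tools_only` is hypothetical, and where such a rewrite would live in TaskingAI is not specified by the issue.

```python
from typing import Any, Dict, List


def use_tools_only(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the payload that carries `tools` and never `functions`."""
    converted = {key: value for key, value in payload.items() if key != "functions"}
    functions: List[Dict[str, Any]] = payload.get("functions") or []
    if functions:
        tools = list(payload.get("tools") or [])
        # Wrap each legacy function definition in the tools envelope.
        tools.extend({"type": "function", "function": fn} for fn in functions)
        converted["tools"] = tools
    return converted
```

A complete migration would also translate a legacy `function_call` setting into `tool_choice`; that step is left out here to keep the sketch short.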

### 3.3 Operating the retrieval pipeline

`Collection` enforces `capacity` / `num_chunks` limits, checked before each write by `has_available_capacity(capacity)` (source: [backend/app/models/retrieval/collection.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/retrieval/collection.py)). Text is chunked by `split_text_by_token`, which uses `default_tokenizer` to split by token window and emits `num_tokens` for each chunk, so that the console and RAG retrieval can display quota usage (source: [backend/app/models/retrieval/text_splitter/token_handler.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/retrieval/text_splitter/token_handler.py)). The `Chunk.score` field lets the Playground show retrieval results ordered by relevance (source: [backend/app/models/retrieval/chunk.py](https://github.com/TaskingAI/TaskingAI/blob/main/backend/app/models/retrieval/chunk.py)).
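Token-window splitting can be illustrated as below. This sketch substitutes whitespace splitting for the real tokenizer and uses a hypothetical function name and signature; it shows the windowing idea only, not the behaviour of `split_text_by_token`.

```python
from typing import Callable, List, Tuple


def split_by_token_window(
    text: str,
    chunk_size: int,
    chunk_overlap: int = 0,
    tokenize: Callable[[str], List[str]] = str.split,  # stand-in tokenizer
) -> List[Tuple[str, int]]:
    """Split text into (chunk_text, num_tokens) pairs using a sliding token window."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    tokens = tokenize(text)
    step = chunk_size - chunk_overlap
    chunks: List[Tuple[str, int]] = []
    for start in range(0, len(tokens), step):
        window = tokens[start : start + chunk_size]
        chunks.append((" ".join(window), len(window)))
        if start + chunk_size >= len(tokens):
            break  # the window already reached the end of the text
    return chunks
```

Recording the token count per chunk at split time means capacity checks and quota displays can sum stored numbers instead of re-tokenizing the text.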

---

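## See Also

- Plugin security hardening (path traversal): [#375](https://github.com/TaskingAI/TaskingAI/issues/375), [#374](https://github.com/TaskingAI/TaskingAI/issues/374)
- Deployment-related community threads: #370 (first-login 404), #363 (containers restarting offline), #366 (custom_host tool calling)
- Model and Provider abstraction: #11 (custom URLs), #105 (proxy and caching)
- Version changes: v0.3.0 (Markdown Playground, OpenAI-compatible API), v0.2.2 (`#91` model configuration update, `#95` fix for reading `num_chunk`)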

---


## Doramagic pitfall log

Project: TaskingAI/TaskingAI

Summary: 15 potential pitfalls were found, 1 of them high/blocking. Top priority: Security/permissions pitfall - Source evidence: Advanced Proxy Integration for Consumption Management and Caching in TaskingAI.

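## 1. Security/permissions pitfall · Source evidence: Advanced Proxy Integration for Consumption Management and Caching in TaskingAI

- Severity: high
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions issue in this project: Advanced Proxy Integration for Consumption Management and Caching in TaskingAI
- User impact: may affect authorization, key configuration, or security boundaries.
- Evidence: community_evidence:github | https://github.com/TaskingAI/TaskingAI/issues/105 | The source discussion mentions Python-related conditions; re-check before installing or trialing.

## 2. Installation pitfall · Source evidence: First Login caught 404:POST /api/v1/admins/login HTTP/1.1" 404

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation issue in this project: First Login caught 404:POST /api/v1/admins/login HTTP/1.1" 404
- User impact: may raise the cost of trialing and production adoption for new users.
- Evidence: community_evidence:github | https://github.com/TaskingAI/TaskingAI/issues/370 | The source discussion mentions Docker-related conditions; re-check before installing or trialing.

## 3. Installation pitfall · Source evidence: GNAP: git-native task coordination for TaskingAI's multi-agent BaaS workflows

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation issue in this project: GNAP: git-native task coordination for TaskingAI's multi-agent BaaS workflows
- User impact: may raise the cost of trialing and production adoption for new users.
- Evidence: community_evidence:github | https://github.com/TaskingAI/TaskingAI/issues/372 | The source discussion mentions Docker-related conditions; re-check before installing or trialing.

## 4. Installation pitfall · Source evidence: images auto exit

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation issue in this project: images auto exit
- User impact: may raise the cost of trialing and production adoption for new users.
- Evidence: community_evidence:github | https://github.com/TaskingAI/TaskingAI/issues/363 | The source discussion mentions Docker-related conditions; re-check before installing or trialing.

## 5. Configuration pitfall · Source evidence: Accept the configs parameter when creating a Chat.

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified configuration issue in this project: Accept the configs parameter when creating a Chat.
- User impact: may raise the cost of trialing and production adoption for new users.
- Evidence: community_evidence:github | https://github.com/TaskingAI/TaskingAI/issues/360 | The source discussion mentions Python-related conditions; re-check before installing or trialing.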

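## 6. Capability pitfall · Capability assessment depends on an assumption

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: if the assumption does not hold, users will not get the promised capability.
- Evidence: capability.assumptions | https://github.com/TaskingAI/TaskingAI | README/documentation is current enough for a first validation pass.

## 7. Runtime pitfall · Source evidence: An error happens, when creating a new DeepSeek base model in the backend web.

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified runtime issue in this project: An error happens, when creating a new DeepSeek base model in the backend web.
- User impact: may raise the cost of trialing and production adoption for new users.
- Evidence: community_evidence:github | https://github.com/TaskingAI/TaskingAI/issues/365 | Unverified usage condition surfaced by a source of type github_issue.

## 8. Runtime pitfall · Source evidence: Tooluse error with custom_host provider

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified runtime issue in this project: Tooluse error with custom_host provider
- User impact: may raise the cost of trialing and production adoption for new users.
- Evidence: community_evidence:github | https://github.com/TaskingAI/TaskingAI/issues/366 | Unverified usage condition surfaced by a source of type github_issue.

## 9. Maintenance pitfall · Maintenance activity unknown

- Severity: medium
- Evidence strength: source_linked
- Finding: last_activity_observed is not recorded.
- User impact: new, abandoned, and active projects cannot be told apart, which lowers confidence in a recommendation.
- Evidence: evidence.maintainer_signals | https://github.com/TaskingAI/TaskingAI | last_activity_observed missing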

## 10. Downstream validation risk · no_demo

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- Evidence: downstream_validation.risk_items | https://github.com/TaskingAI/TaskingAI | no_demo; severity=medium

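## 11. Security/permissions pitfall · Scoring risk present

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: the risk affects whether the project is suitable for ordinary users to install.
- Evidence: risks.scoring_risks | https://github.com/TaskingAI/TaskingAI | no_demo; severity=medium

## 12. Security/permissions pitfall · Source evidence: CRITICAL: Path Traversal in DALL-E 3 Image Tool Allows Arbitrary File Write via project_id Parameter

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions issue in this project: CRITICAL: Path Traversal in DALL-E 3 Image Tool Allows Arbitrary File Write via project_id Parameter
- User impact: may affect authorization, key configuration, or security boundaries.
- Evidence: community_evidence:github | https://github.com/TaskingAI/TaskingAI/issues/375 | The source discussion mentions Python-related conditions; re-check before installing or trialing.

## 13. Security/permissions pitfall · Source evidence: CRITICAL: Path Traversal in QR Code Generator Plugin Allows Arbitrary File Write via project_id Parameter

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions issue in this project: CRITICAL: Path Traversal in QR Code Generator Plugin Allows Arbitrary File Write via project_id Parameter
- User impact: may raise the cost of trialing and production adoption for new users.
- Evidence: community_evidence:github | https://github.com/TaskingAI/TaskingAI/issues/374 | The source discussion mentions Python-related conditions; re-check before installing or trialing.

## 14. Maintenance pitfall · issue/PR response quality unknown

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: users cannot tell whether anyone will respond when they hit a problem.
- Evidence: evidence.maintainer_signals | https://github.com/TaskingAI/TaskingAI | issue_or_pr_quality=unknown

## 15. Maintenance pitfall · Release cadence unclear

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: install commands and documentation may lag behind the code, which makes pitfalls more likely.
- Evidence: evidence.maintainer_signals | https://github.com/TaskingAI/TaskingAI | release_recency=unknown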
