Doramagic Project Pack · Human Manual

TaskingAI

The open source platform for AI-native application development.

Overview


TaskingAI is an open-source BaaS (Backend-as-a-Service) platform for LLM-powered agents, designed to bridge the gap between rapid prototyping in a console and scalable production deployments. It offers a unified API surface for working with multiple model providers, plugins/tools, retrieval-augmented generation (RAG), and persistent conversation state — all packaged as a Docker-deployable, FastAPI-based stack (Source: README.md).

Purpose and Scope

The platform's stated goal is to address shortcomings in frameworks like LangChain (stateless, externally coupled) and OpenAI's Assistants API (proprietary, with tools and retrievals coupled to each assistant). TaskingAI's answer rests on four design choices:

  • Supporting both stateful and stateless usage patterns — persistent chat/assistant sessions coexist with OpenAI-compatible one-shot chat completion endpoints (introduced in v0.3.0).
  • Decoupled modular management — tools, retrievals, and language models are managed independently of any single assistant, enabling multi-tenant reuse.
  • Unified model abstraction — providers such as OpenAI, Anthropic, Ollama, LM Studio, and a custom_host are all exposed through a common schema.
  • Async-first execution — built on Python FastAPI for high-concurrency inference and orchestration.

(Source: README.md)

High-Level Architecture

TaskingAI is composed of several cooperating services. The user-facing entry point is a web console; the backend service manages entities (assistants, chats, models, tools, retrievals, API keys) and orchestrates requests; the inference service runs the actual model calls and exposes OpenAI-compatible chat completion, text embedding, and rerank routes; and a plugin runtime hosts bundled tools (e.g., Google search, web reader, QR code, DALL·E 3 image generation).

flowchart LR
    Client[Client SDK / REST API]
    Console[TaskingAI Console]
    Backend[Backend Service<br/>assistants, chats, models,<br/>tools, retrievals, apikeys]
    Inference[Inference Service<br/>chat_completion,<br/>text_embedding, rerank]
    Providers[(Model Providers<br/>OpenAI, Anthropic,<br/>Ollama, LM Studio)]
    Plugins[Plugin Bundles<br/>search, web reader,<br/>DALL·E 3, QR code]
    Store[(Postgres + Redis +<br/>Object Storage)]

    Client --> Backend
    Console --> Backend
    Backend --> Inference
    Backend --> Store
    Inference --> Providers
    Backend --> Plugins
    Inference --> Plugins

The split between backend/ (entity lifecycle, auth, orchestration) and inference/ (stateless model invocation) is consistent throughout the codebase. For example, the rerank cache loader dynamically imports provider modules such as providers.<provider_id>.rerank and caches instantiated RerankModel classes per process (Source: inference/app/cache/rerank.py).

Core Domain Models

The backend is organized around a small set of Pydantic entities, each living under backend/app/models/:

| Domain | Entity File | Purpose |
|---|---|---|
| Auth | auth/apikey.py | Encrypted API keys with masked display (apikey[:2] + "*" * … + apikey[-2:]) for project-level access |
| Model | model/model.py, model/provider.py | Per-tenant model instance bound to a provider and schema; supports chat_completion, text_embedding, rerank types and a fallbacks chain |
| Assistant | assistant/assistant.py | Configuration of model, memory, system_prompt_template, attached tools and retrievals |
| Chat | assistant/chat.py | Stateful session with its own memory and metadata, tied to one assistant |
| Message | assistant/message.py | Turn in a chat; carries role, text content, num_tokens, and per-message generation logs |
| Tool/Plugin | tool/plugin.py | Bundled functions translated into ChatCompletionFunction schemas for the LLM |
| Retrieval | retrieval/retrieval.py, collection.py, record.py, chunk.py | RAG primitives: Retrieval config, Collection (vector store), Record (source document), Chunk (embedded segment) |

The Model entity exposes helper predicates used throughout the orchestrator: is_chat_completion(), is_text_embedding(), is_rerank(), is_custom_host() (when provider_id == "custom_host"), and capability flags like allow_function_call() and allow_streaming() derived from properties (Source: backend/app/models/model/model.py). The Provider entity advertises supported model_types and a credentials_schema, allowing the UI to render appropriate input forms (Source: backend/app/models/model/provider.py).

Inference Subsystem

The inference/ service is the stateless worker layer. It exposes three route families, each with strongly typed Pydantic schemas:

  • Chat completion — primary path used by assistants; delegates to provider implementations registered in providers/<provider_id>/.
  • Text embedding — accepts a string or list of strings, optional proxy, custom_headers (max 16 pairs, key ≤ 64 chars, value ≤ 512 chars), credentials/encrypted_credentials, and a TextEmbeddingModelConfiguration (Source: inference/app/routes/text_embedding/schema.py).
  • Rerank — re-orders candidate documents with a relevance score; same credential/header conventions apply (Source: inference/app/routes/rerank/schema.py).
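The header limits quoted for the text embedding route can be sketched as a standalone check. This is a minimal illustration rather than the project's schema code: the function name `validate_custom_headers` and the error messages are hypothetical, and only the limits (16 pairs, 64-character keys, 512-character values) come from the text above.

```python
from typing import Dict, Optional

MAX_HEADERS = 16        # limit quoted in the text above
MAX_KEY_LENGTH = 64     # limit quoted in the text above
MAX_VALUE_LENGTH = 512  # limit quoted in the text above


def validate_custom_headers(headers: Optional[Dict[str, str]]) -> Dict[str, str]:
    """Hypothetical helper: return a copy of the headers or raise ValueError."""
    if headers is None:
        return {}
    if len(headers) > MAX_HEADERS:
        raise ValueError(f"at most {MAX_HEADERS} custom headers are allowed")
    for key, value in headers.items():
        if len(key) > MAX_KEY_LENGTH:
            raise ValueError(f"header name exceeds {MAX_KEY_LENGTH} characters")
        if len(value) > MAX_VALUE_LENGTH:
            raise ValueError(f"value of header {key!r} exceeds {MAX_VALUE_LENGTH} characters")
    return dict(headers)
```

Running a check like this on the client side before sending a request surfaces limit violations early instead of as a validation error from the service.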

Model schemas are loaded once into memory with pricing metadata and capability flags (streaming, function_call), driving both the chat completion flow and tool-call validation (Source: inference/app/models/model_schema.py). A known sharp edge here is that custom_host providers must emit tool_calls via the modern tools field rather than legacy functions; otherwise the model layer rejects the request (community issue #366).

RAG and Retrieval Pipeline

Retrievals are configured per-assistant with parameters such as top_k, max_tokens, score_threshold (0–1), and a method (user_message, function_call, etc.). The default config is top_k=3, score_threshold=0.6, method=USER_MESSAGE (Source: backend/app/models/retrieval/retrieval.py).
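To make the effect of `top_k` and `score_threshold` concrete, the sketch below filters and ranks scored chunks using the default values quoted above. It is an assumption about how the two parameters combine, not the project's retrieval code: `ScoredChunk` and `select_chunks` are hypothetical names, and treating the threshold as inclusive is a guess.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredChunk:
    """Hypothetical stand-in for a retrieved chunk with its relevance score."""
    content: str
    score: float


def select_chunks(
    chunks: List[ScoredChunk], top_k: int = 3, score_threshold: float = 0.6
) -> List[ScoredChunk]:
    """Keep chunks scoring at least score_threshold, best first, at most top_k."""
    kept = [chunk for chunk in chunks if chunk.score >= score_threshold]
    kept.sort(key=lambda chunk: chunk.score, reverse=True)
    return kept[:top_k]


candidates = [
    ScoredChunk("a", 0.91),
    ScoredChunk("b", 0.42),
    ScoredChunk("c", 0.75),
    ScoredChunk("d", 0.66),
    ScoredChunk("e", 0.60),
]
# Chunk "b" falls below the threshold; "e" passes it but is cut by top_k.
selected = select_chunks(candidates)
```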

A Collection is a bounded vector store (with capacity, num_records, num_chunks, an embedding_model_id, and an embedding_size) that hosts Record objects of type text, file, or web (Source: backend/app/models/retrieval/collection.py, backend/app/models/retrieval/record.py). Each record is split via the token-aware splitter into Chunk entities carrying content, num_tokens, and an optional relevance score returned at query time (Source: backend/app/models/retrieval/chunk.py, backend/app/models/retrieval/text_splitter/token_handler.py).
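The idea of splitting a record into overlapping, token-bounded chunks can be sketched as a sliding window. This is an illustration only: `split_with_overlap` is a hypothetical function, whitespace splitting stands in for a real tokenizer, and the parameter names `chunk_size` and `chunk_overlap` are borrowed from the splitter described elsewhere in this manual.

```python
from typing import List


def split_with_overlap(tokens: List[str], chunk_size: int, chunk_overlap: int) -> List[List[str]]:
    """Slide a window of chunk_size tokens, stepping by chunk_size - chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the text
    return chunks


# Whitespace splitting stands in for a real tokenizer here.
tokens = "one two three four five six seven".split()
chunks = split_with_overlap(tokens, chunk_size=3, chunk_overlap=1)
```

With an overlap of one token, each chunk repeats the last token of the previous chunk, which helps preserve context that would otherwise be cut at a chunk boundary.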

Security, Tooling, and Community Considerations

TaskingAI plugins run as in-process functions; community-reported issues #374 and #375 highlight path-traversal risks in plugins that accept a project_id to construct filesystem paths (e.g., DALL·E 3's save_url_image and the QR code generator's save_base64_image). Any deployment that exposes plugins externally should pin to a patched release and validate project_id strictly. Other recurring community themes include:

  • Local/offline deployments — images pulled for offline use may auto-exit if container network egress is not configured (issue #363).
  • Custom model URLs — long-standing demand for per-model base URLs (issue #11), partially addressed via the custom_host provider and proxy parameters.
  • Consumption caching — request for proxy-side caching to manage token spend (issue #105).
  • Chat-level configuration — accepting temperature/top_p at chat creation (issue #360).
  • First-login UX — POST /api/v1/admins/login returning 404 during initial Docker startup (issue #370) is typically a transient readiness race rather than a misconfiguration.

(Source: README.md, community issues #11, #58, #105, #190, #342, #360, #363, #366, #370, #374, #375)
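If the first-login 404 is indeed a readiness race, a client or deployment script can poll until the backend answers before attempting to log in. The sketch below assumes exactly that; `wait_until_ready` is a hypothetical helper, and the `probe` callable (which would issue the HTTP request and return its status code) is left to the caller so the example stays free of network calls.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], int], attempts: int = 10, delay_seconds: float = 1.0
) -> bool:
    """Call probe() until it returns a status other than 404 or attempts run out."""
    for attempt in range(attempts):
        if probe() != 404:
            return True
        if attempt < attempts - 1:
            time.sleep(delay_seconds)  # give the container time to finish starting
    return False
```

A caller would pass a function that requests the backend and returns the status code, then proceed with login only when `wait_until_ready` returns True.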

See Also

  • Assistant and Chat lifecycle
  • Model providers and the inference route layer
  • Retrieval collections, records, and chunks
  • Plugin bundles and tool-call integration
  • API key authentication and multi-tenant isolation

Source: https://github.com/TaskingAI/TaskingAI / Human Manual

App


Overview

TaskingAI is organized as two cooperating Python services that together form the "App" layer of the platform: a backend app that owns state, metadata, agents, and the retrieval/RAG subsystem, and an inference app that provides a thin, provider-agnostic façade for calling LLMs, embedding models, and rerankers. This separation lets the backend stay focused on orchestration and persistence while the inference tier absorbs all provider-specific complexity. According to the README.md, the project is an "All-In-One LLM Platform" delivered as a Docker-deployable BaaS, and the dual-app layout reflects its "BaaS-Inspired Workflow" of separating AI logic from product development.

The inference app exposes three primary inference surfaces — chat completion, text embedding, and rerank — and dynamically loads provider implementations from a directory layout. The backend app exposes the higher-level abstractions (assistants, chats, messages, tools, retrievals, collections, records, chunks, models, providers, and API keys) that user-facing clients manipulate through RESTful APIs.

Architecture and Module Layout

The two apps are loosely coupled: the backend persists configuration and delegates actual model calls to the inference tier. Within the inference tier, the package inference.app.models re-exports the public schema types used by route handlers, as declared in inference/app/models/__init__.py, which exposes base, model_schema, provider, provider_credentials, chat_completion, text_embedding, tokenizer, and model_config.

flowchart LR
    Client[Client SDK / Console] --> Backend[backend/app]
    Backend -- delegate inference --> Inference[inference/app]
    Inference --> Providers[Provider Modules<br/>OpenAI, Anthropic, Ollama, LM Studio, Custom Host]
    Backend --> DB[(Postgres + Redis)]
    Backend --> Storage[(Object Storage<br/>Files / Images)]

Backend App Domain Modules

The backend models under backend/app/models/ form the persistence layer for every user-visible resource. Each module is a Pydantic ModelEntity that knows how to build() from a database row and how to serialize to a response dict.

  • Assistant module: defines the Assistant entity in assistant.py, which references a model_id, a system_prompt_template, an AssistantMemory block, a list of ToolRef, and a list of RetrievalRef with retrieval_configs. The assistant is the top-level agent object users create and configure.
  • Chat and Message modules: Chat (see chat.py) represents a conversation session bound to an assistant, including its ChatMemory. Message (see message.py) models individual turns with MessageRole (user/assistant), text MessageContent, and an optional MessageGenerationLog for diagnostics.
  • Retrieval module: RAG is implemented by three coordinated entities. Collection (collection.py) holds capacity and embedding configuration; Record (record.py) represents a logical source document of type text, file, or web; and Chunk (chunk.py) is the unit of retrieval. Text is split into chunks by split_text_by_token in token_handler.py, with chunk_size and chunk_overlap parameters and a title prefix.
  • Model and Provider modules: Model (model.py) captures a configured model instance with model_schema_id, provider_id, provider_model_id, properties, configs, and encrypted credentials. Helper methods expose is_chat_completion(), is_text_embedding(), is_rerank(), is_custom_host(), allow_function_call(), and allow_streaming(). Provider (provider.py) describes the upstream vendor and its credentials_schema.
  • Plugin module: Plugin (see plugin.py) provides a converter that flattens a plugin's input schema into a ChatCompletionFunction for LLM tool use, including internationalization of descriptions.
  • Auth module: Apikey (apikey.py) holds both the encrypted and decrypted forms of the API key and returns only a masked version (first two and last two characters) to API consumers.
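The masking rule for API keys (first two and last two characters visible) can be sketched as follows. This is a minimal illustration, not the code in apikey.py: `mask_apikey` is a hypothetical name, and both the one-star-per-hidden-character count and the handling of very short keys are assumptions, since the manual does not state them.

```python
def mask_apikey(apikey: str) -> str:
    """Show the first two and last two characters and star out the rest."""
    if len(apikey) <= 4:
        # Assumption: very short keys are fully masked so nothing leaks.
        return "*" * len(apikey)
    # Assumption: one star per hidden character; the manual does not give the count.
    return apikey[:2] + "*" * (len(apikey) - 4) + apikey[-2:]


masked = mask_apikey("tk1234567890")  # "tk********90"
```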

Inference App Surfaces

The inference app's request schemas, illustrated by text_embedding/schema.py and rerank/schema.py, share a common shape: an optional proxy, an optional custom_headers map (capped at 16 entries with per-key and per-value length limits), a credentials dict or its encrypted twin, plus a configs block and a model-specific payload. The Provider model in inference/app/models/provider.py carries booleans such as enable_proxy, enable_custom_headers, return_token_usage, and return_stream_token_usage, which the routes consult when constructing provider requests.

The provider implementations are discovered at startup. As shown in cache/rerank.py, load_all_rerank_models walks providers/ and imports each vendor's rerank.py, instantiating one cached provider object per provider_id. The same pattern applies to chat completion and text embedding caches, which is why the inference app can add new providers without touching route code.

Common Configuration and Failure Modes

Several cross-cutting parameters recur across modules and are worth knowing when troubleshooting:

| Parameter | Where it appears | Effect |
|---|---|---|
| proxy | inference request schemas | Outbound HTTP proxy for the provider call |
| custom_headers | inference request schemas | Up to 16 headers with bounded key/value lengths |
| credentials / encrypted_credentials | inference schemas, Model.encrypted_credentials, Apikey.encrypted_apikey | Either plaintext (transient) or AES-encrypted at rest |
| fallbacks | Model.fallbacks | Ordered list of ModelFallback model IDs for retry |

Common failure modes reported by the community tie directly to these surfaces:

  • Custom-host tool-use incompatibility: issue #366 reports that pairing a custom_host model with a tool (e.g., arxiv_search) produces the error Invalid parameter: 'tool_calls' cannot be used when 'functions' are present. The message indicates that the request mixed the legacy functions field with tool_calls. The allow_function_call() check in model.py, which requires the model to advertise function_call support, decides whether tools are offered to the model in the first place.
  • Path-traversal in plugin image savers: issues #374 and #375 describe vulnerabilities in plugins that write images to disk using unvalidated project_id parameters, which sit inside the Plugin tool execution path of plugin.py.
  • Per-chat configuration: issue #360 requests that the Chat.create call accept a configs parameter; today the only configuration hooks live in Model.configs and the inference configs block.
  • Local-model provider docs: issue #58 highlights confusion around LM Studio and Ollama integration, which the inference cache loader handles via the dynamic providers/ discovery pattern.
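For the path-traversal item, one common mitigation is to validate the identifier against a strict pattern and then confirm that the resolved path still lies under the storage directory. The sketch below is a generic guard, not the patch applied upstream: `safe_project_path`, the allowed `project_id` pattern, and the directory layout are all assumptions.

```python
import re
from pathlib import Path

# Assumption: project IDs are short alphanumeric strings; adjust to the real format.
PROJECT_ID_PATTERN = re.compile(r"[A-Za-z0-9]{1,64}")


def safe_project_path(base_dir: str, project_id: str, filename: str) -> Path:
    """Hypothetical guard: build base_dir/project_id/filename or raise ValueError."""
    if not PROJECT_ID_PATTERN.fullmatch(project_id):
        raise ValueError("invalid project_id")
    base = Path(base_dir).resolve()
    # Path(...).name drops any directory components from the file name.
    target = (base / project_id / Path(filename).name).resolve()
    # Defence in depth: the resolved path must still sit under base_dir.
    if base not in target.parents:
        raise ValueError("path escapes the storage directory")
    return target
```

Checking the resolved path in addition to the pattern means the guard still holds if the pattern is later relaxed.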


Models


Overview

In TaskingAI, a Model is the concrete, user-created instance of an AI model that the platform routes inference calls to. While the README frames the platform as providing "access to hundreds of AI models with unified APIs" (README.md:1-50), the runtime Model object is the bridge between an abstract model schema and a real provider such as OpenAI, Anthropic, Ollama, or a self-hosted endpoint.

The Models subsystem spans two services: the backend (backend/app/models/model/), which stores, persists, and exposes Model, Provider, and ModelSchema entities; and the inference layer (inference/app/models/), which executes the actual calls using a per-provider module hierarchy and cached model classes. Source: backend/app/models/model/model.py:1-15.

Model Entity and Core Properties

The Model entity is the user-facing record. Its fields describe both identity and execution metadata:

| Field | Purpose |
|---|---|
| model_id | Unique identifier of the created model instance. |
| model_schema_id | Pointer to a published schema describing supported properties. |
| provider_id / provider_model_id | Which provider supplies the model and which upstream model to call. |
| name, type | Display name and the model kind (chat_completion, text_embedding, rerank). |
| properties, configs | Vendor-specific properties and unified inference configurations. |
| encrypted_credentials / display_credentials | Secrets are encrypted at rest; the display form is masked. |
| fallbacks | Optional list of fallback models for resilience. |

Source: backend/app/models/model/model.py:1-50.

Model extends the shared ModelEntity and exposes helper methods used throughout the backend. is_chat_completion, is_text_embedding, and is_rerank discriminate by the type field. is_custom_host returns True when provider_id == "custom_host", which is how TaskingAI flags user-defined endpoints that bring their own URL and credentials. Source: backend/app/models/model/model.py:30-50.

Capability flags such as allow_function_call and allow_streaming are derived from properties, ensuring the chat-completion path only enables tool/function calls and streaming when the underlying schema advertises support. Source: backend/app/models/model/model.py:45-55.
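The helper predicates described above can be pictured with a small stand-in class. This is a sketch under stated assumptions, not the entity in model.py: `ModelSketch` is a hypothetical name, it carries only three of the real fields, and the property keys `function_call` and `streaming` are assumed from the capability flags named elsewhere in this manual.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ModelSketch:
    """Hypothetical, trimmed-down stand-in for the backend Model entity."""
    provider_id: str
    type: str
    properties: Dict[str, Any] = field(default_factory=dict)

    def is_chat_completion(self) -> bool:
        return self.type == "chat_completion"

    def is_custom_host(self) -> bool:
        return self.provider_id == "custom_host"

    def allow_function_call(self) -> bool:
        # Assumption: the flag is stored under the "function_call" property key.
        return bool(self.properties.get("function_call", False))

    def allow_streaming(self) -> bool:
        # Assumption: the flag is stored under the "streaming" property key.
        return bool(self.properties.get("streaming", False))
```

Because the capability methods read from `properties`, a model whose schema does not advertise a capability reports False without any provider-specific branching.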

Model Types, Providers, and Schema Resolution

The supported ModelType values are listed in inference/app/models/__init__.py, which re-exports base, model_schema, provider, chat_completion, text_embedding, tokenizer, and model_config. A model whose schema type is WILDCARD must be refined by the caller-supplied model_type, otherwise the inference layer raises REQUEST_VALIDATION_ERROR. Source: inference/app/models/model_schema.py:1-60.
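The wildcard rule can be sketched as a small resolution function. The names here are illustrative: `resolve_model_type` and `RequestValidationError` are hypothetical, and the literal spelling of the wildcard value is assumed.

```python
from typing import Optional

WILDCARD = "wildcard"  # assumed spelling of the wildcard type value


class RequestValidationError(ValueError):
    """Hypothetical stand-in for the REQUEST_VALIDATION_ERROR raised by the service."""


def resolve_model_type(schema_type: str, requested_type: Optional[str]) -> str:
    """Return the concrete model type, refining a wildcard schema with the caller's value."""
    if schema_type != WILDCARD:
        return schema_type
    if not requested_type or requested_type == WILDCARD:
        raise RequestValidationError("model_type is required for wildcard schemas")
    return requested_type
```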

The Provider object describes an upstream vendor — its provider_id, name, credentials_schema, icon_svg_url, and a model_types list. Provider.has_model_type() accepts either a concrete type or the wildcard, so a single provider entry can expose chat completion, embeddings, and rerank models under the same identifier. Source: backend/app/models/model/provider.py:1-55.

In the inference service, providers are loaded lazily. inference/app/cache/rerank.py demonstrates the pattern: it converts a snake_case provider_id (e.g., cohere) into a PascalCase class name (CohereRerankModel), imports providers.<provider_id>.rerank, and instantiates and caches the class on first use. load_all_rerank_models walks the providers/ directory at startup to register every provider that ships a rerank.py module. Source: inference/app/cache/rerank.py:1-55.
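The lazy-loading pattern described above (derive a class name from the provider_id, import the module, cache one instance) can be sketched with `importlib`. This is an approximation of the idea rather than the contents of rerank.py: `rerank_class_name`, `get_rerank_model`, and the in-memory fake provider module are hypothetical and exist only so the example runs without the real providers/ directory.

```python
import importlib
import sys
import types
from typing import Dict

_rerank_cache: Dict[str, object] = {}


def rerank_class_name(provider_id: str) -> str:
    """Turn a snake_case provider_id into a PascalCase class name."""
    return "".join(part.capitalize() for part in provider_id.split("_")) + "RerankModel"


def get_rerank_model(provider_id: str) -> object:
    """Import providers.<provider_id>.rerank on first use and cache one instance."""
    if provider_id not in _rerank_cache:
        module = importlib.import_module(f"providers.{provider_id}.rerank")
        model_class = getattr(module, rerank_class_name(provider_id))
        _rerank_cache[provider_id] = model_class()
    return _rerank_cache[provider_id]


# Demo only: register a fake provider module in memory.
fake = types.ModuleType("providers.cohere.rerank")
fake.CohereRerankModel = type("CohereRerankModel", (), {})
sys.modules["providers"] = types.ModuleType("providers")
sys.modules["providers.cohere"] = types.ModuleType("providers.cohere")
sys.modules["providers.cohere.rerank"] = fake

model = get_rerank_model("cohere")
```

Deriving the class name from the directory name is what lets a new provider be added by dropping in a module, with no registry to update.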

The inference-layer Provider extends the backend concept with execution toggles such as enable_proxy, enable_custom_headers, return_token_usage, return_stream_token_usage, and pass_provider_level_credential_check. Source: inference/app/models/provider.py:1-60.

Configuration, Credentials, and Request Options

Model configuration is presented uniformly regardless of provider. The i18n file lists the canonical fields: max_tokens, stop, temperature, top_k, top_p, frequency_penalty, and presence_penalty — each with a name and description string shown in the console and Playground. Source: inference/app/models/model_config/resources/i18n/en.yml:1-30.

Per-request overrides are accepted on the text embedding route, mirroring the chat completion endpoint. The TextEmbeddingRequest schema accepts proxy (optional HTTP proxy string), custom_headers (up to 16 key-value pairs), credentials or encrypted_credentials (mutually exclusive), properties (provider-specific overrides), configs (a TextEmbeddingModelConfiguration object), and an input_type discriminator (document or query). Source: inference/app/routes/text_embedding/schema.py:1-80.

Credentials are stored encrypted on the Model row; the display form returned to the console is masked. API keys are likewise decrypted only when needed via aes_decrypt on the backend, as shown in Apikey.build. Source: backend/app/models/auth/apikey.py:1-40.

Fallbacks, Function Calls, and Community Considerations

Fallback chains

ModelFallbackConfig wraps a model_list of ModelFallback entries (model_id only). This lets the inference service degrade gracefully when a primary model fails — a recurring community ask for production reliability. Source: backend/app/models/model/model.py:1-25.
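A fallback chain of this shape is typically consumed by trying each model in order until one succeeds. The sketch below illustrates that idea only: the manual does not show how the inference service applies the list, so `call_with_fallbacks`, the `invoke` callable, and the broad exception handling are all assumptions.

```python
from typing import Callable, List, Optional


def call_with_fallbacks(
    primary_model_id: str,
    fallback_model_ids: List[str],
    invoke: Callable[[str], str],
) -> str:
    """Try the primary model, then each fallback in order; raise if all fail."""
    last_error: Optional[Exception] = None
    for model_id in [primary_model_id, *fallback_model_ids]:
        try:
            return invoke(model_id)
        except Exception as error:  # a real service would catch narrower errors
            last_error = error
    raise RuntimeError("all models failed") from last_error


def fake_invoke(model_id: str) -> str:
    """Demo stand-in for a provider call: the primary model is unavailable."""
    if model_id == "primary":
        raise ConnectionError("provider unavailable")
    return f"answered by {model_id}"


result = call_with_fallbacks("primary", ["backup-1", "backup-2"], fake_invoke)
```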

Custom host and function calling

The is_custom_host flag combined with allow_function_call is significant. Community issue #366 reports that some custom-host providers reject requests when both tool_calls (the OpenAI tools parameter) and functions (legacy) are present. Because allow_function_call is derived from schema properties, operators must ensure the underlying schema advertises function-call support and that the request payload uses the modern tools field. Source: backend/app/models/model/model.py:30-55.
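For operators who control the request payload, moving from the legacy fields to the modern ones is a mechanical translation in the OpenAI chat completions format. The sketch below shows that translation under the assumption that the payload follows that format; `to_tools_payload` is a hypothetical helper and is not part of TaskingAI.

```python
from typing import Any, Dict, List


def to_tools_payload(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of an OpenAI-style request that uses tools instead of functions."""
    converted = {
        key: value
        for key, value in payload.items()
        if key not in ("functions", "function_call", "tools")
    }
    tools: List[Dict[str, Any]] = list(payload.get("tools") or [])
    for function in payload.get("functions") or []:
        tools.append({"type": "function", "function": function})
    if tools:
        converted["tools"] = tools
    function_call = payload.get("function_call")
    if isinstance(function_call, dict):
        # {"name": "x"} becomes an explicit tool choice.
        converted["tool_choice"] = {
            "type": "function",
            "function": {"name": function_call["name"]},
        }
    elif function_call in ("auto", "none"):
        converted["tool_choice"] = function_call
    return converted
```

The result never contains `functions` or `function_call`, so a provider that rejects mixed payloads receives only the modern fields.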

Custom model URLs and local providers

Community issue #11 requests the ability to define a custom model URL for OpenAI models. This maps to TaskingAI's custom_host provider: a Model whose provider_id == "custom_host" accepts arbitrary provider_model_id strings and per-model credentials, which is the mechanism used to integrate LM Studio and Ollama endpoints. Issue #58 further highlights that local-server + Dockerized TaskingAI is the common deployment pattern for these providers. Source: backend/app/models/model/model.py:30-50.

Proxy, custom headers, and token usage

Issue #105 requests proxy integration for token consumption and caching. The inference Provider already exposes enable_proxy and enable_custom_headers, and the text embedding schema accepts proxy and custom_headers per request — meaning today the feature is wired in but operated at the request level rather than centrally cached. Source: inference/app/models/provider.py:1-60, inference/app/routes/text_embedding/schema.py:1-80.

Per-Chat model configuration

Issue #360 requests per-Chat overrides for temperature, top_p, etc. The unified configs field on Model and the TextEmbeddingModelConfiguration object already standardize these parameters, so the work is largely on exposing them at Chat creation time. The v0.2.2 release notes confirm that model configuration updates are supported on the Model itself. Source: inference/app/models/model_config/resources/i18n/en.yml:1-30.


Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.


Found 15 structured pitfall items, including 1 high/blocking item. Top priority: security or permission risk, which requires verification.

1. Security or permission risk: Security or permission risk requires verification

  • Severity: high
  • Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/TaskingAI/TaskingAI/issues/105

2. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/TaskingAI/TaskingAI/issues/370

3. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/TaskingAI/TaskingAI/issues/372

4. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/TaskingAI/TaskingAI/issues/363

5. Configuration risk: Configuration risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/TaskingAI/TaskingAI/issues/360

6. Capability evidence risk: Capability evidence risk requires verification

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: capability.assumptions | https://github.com/TaskingAI/TaskingAI

7. Runtime risk: Runtime risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/TaskingAI/TaskingAI/issues/365

8. Runtime risk: Runtime risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/TaskingAI/TaskingAI/issues/366

9. Maintenance risk: Maintenance risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/TaskingAI/TaskingAI

10. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: downstream_validation.risk_items | https://github.com/TaskingAI/TaskingAI

11. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: risks.scoring_risks | https://github.com/TaskingAI/TaskingAI

12. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/TaskingAI/TaskingAI/issues/375

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 12 project-level external discussion links are exposed on this manual page.

Review before install: open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using TaskingAI with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence