LazyLLM Manual - Doramagic.ai

LazyLLM

Easiest and laziest way for building multi-agent LLMs applications.

LazyLLM Overview and System Architecture

Related topics: Components, Modules, and Flows, RAG Pipeline, Document Processing, and Stores, Agents, Tools, Memory, and Online Model Integration

Purpose and Scope

LazyLLM is a low-code development tool for building multi-agent large language model (LLM) applications. It enables developers to assemble complex AI applications from reusable modules, flows, and components without deep knowledge of LLM infrastructure, prompt engineering, or deployment plumbing. Source: README.md.

The project positions itself around four design pillars documented in the README:

Convenient AI Application Assembly — Lego-like composition of agents, data flows, and functional modules.
One-Click Deployment — Lightweight gateway during POC, and one-click image packaging for production.
Cross-Platform Compatibility — Single code path across bare-metal, dev machines, Slurm clusters, and public clouds.
Unified User Experience — A single API surface for online (OpenAI, SenseNova, Kimi, ChatGLM, etc.) and locally deployed models. Source: README.md.

The application development lifecycle follows a prototype → data feedback → iterative optimization loop, where LazyLLM aims to support each stage — from rapid prototyping through fine-tuning and production deployment. Source: README.md.

High-Level System Architecture

LazyLLM is organized as a layered system. At the top, the CLI (lazyllm command) provides entry points for installation, deployment, running, skills, and code review. Source: lazyllm/cli/main.py. Below it, the Module system and Flow system form the core abstractions for building applications, while Agents, RAG infrastructure, and Deployment infrastructure sit on top as higher-level building blocks.

graph TB
    subgraph CLI["CLI Layer (lazyllm/cli/main.py)"]
        Install["install"]
        Deploy["deploy"]
        Run["run"]
        Skills["skills"]
        Review["review / review-local"]
    end

    subgraph Core["Core Abstractions"]
        Modules["Module System<br/>(ModuleBase, TrainableModule,<br/>OnlineChatModule, etc.)"]
        Flows["Flow System<br/>(Pipeline, Parallel,<br/>Loop, IFS, Warp)"]
    end

    subgraph HighLevel["High-Level Components"]
        Agents["Agent System<br/>(ReactAgent, PlanAndSolveAgent,<br/>ReWOOAgent, FunctionCall)"]
        RAG["RAG Infrastructure<br/>(Document, Retriever,<br/>Reranker, Splitter)"]
        DeployInfra["Deployment Infrastructure<br/>(ServerModule, WebModule,<br/>TrainableModule)"]
    end

    subgraph Backends["Backend Integrations"]
        LocalModels["Local Inference<br/>(lightllm, vllm)"]
        OnlineModels["Online Providers<br/>(OpenAI, SiliconFlow,<br/>MiniMax, SenseNova, ...)"]
        Storage["Storage<br/>(Elasticsearch, OceanBase,<br/>Milvus, ChromaDB)"]
    end

    CLI --> Core
    Core --> HighLevel
    HighLevel --> Backends

Core Abstractions: Modules and Flows

Module System

The Module system is the foundational abstraction. LazyLLM provides a structured taxonomy of module types, each combining training, fine-tuning, serving, and deployment capabilities. Source: README.md.

Module Type	Purpose	Training/Fine-tuning	Deployment	Inference	Evaluation
`ModuleBase`	Wrap any callable into a Module	—	—	—	—
`ActionModule`	Trainable & deployable wrapper	✅	✅	✅	✅
`UrlModule`	Wrap external URLs as Modules	❌	❌	✅	✅
`ServerModule`	Wrap any callable as an API service	❌	✅	✅	✅
`TrainableModule`	Base for all supported models	✅	✅	✅	✅
`WebModule`	Multi-round dialogue interface	❌	✅	❌	✅
`OnlineChatModule`	Online chat (training + inference)	✅	✅	✅	✅
`OnlineEmbeddingModule`	Online embedding inference	❌	✅	✅	✅

Source: README.md.

Flow System

Flows describe how data is passed between callable objects. LazyLLM ships with predefined flow primitives: Pipeline, Parallel, Diverter, Warp, IFS, and Loop. These can be composed recursively with Modules, Components, or any Python callable. The flow abstraction makes it simple to add, replace, and reorganize components without rewriting application code. Source: README.md.
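
As a rough illustration of what Pipeline and Parallel mean in terms of data passing, the sketch below composes plain Python callables. The helpers `pipeline` and `parallel` are local stand-ins written for this example, not LazyLLM's implementation, which offers more (named stages, argument binding, result aggregation).

```python
from functools import reduce
from typing import Any, Callable


def pipeline(*stages: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Chain stages so each output feeds the next stage (illustrative only)."""
    def run(value: Any) -> Any:
        return reduce(lambda acc, stage: stage(acc), stages, value)
    return run


def parallel(*branches: Callable[[Any], Any]) -> Callable[[Any], tuple]:
    """Send the same input to every branch and collect the outputs."""
    def run(value: Any) -> tuple:
        return tuple(branch(value) for branch in branches)
    return run


# Flows nest: a parallel step can sit inside a pipeline.
flow = pipeline(
    str.strip,
    parallel(str.upper, len),
    lambda results: f"{results[0]} ({results[1]} chars)",
)

print(flow("  hello  "))
```

Because each flow is itself a callable, replacing or reordering a stage only changes the argument list, which is the property the paragraph above describes.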

Agent Subsystem

The lazyllm/tools/agent/ directory implements LazyLLM's Agent system. Source: lazyllm/tools/agent/AGENTS.md.

Core Agent Types

File	Agent
`base.py`	`LazyLLMAgentBase` — common base for all agents
`functionCall.py`	`FunctionCall` / `FunctionCallAgent` — single-turn tool-call execution
`reactAgent.py`	`ReactAgent` — ReAct loop agent
`planAndSolveAgent.py`	`PlanAndSolveAgent` — plan-then-execute agent
`rewooAgent.py`	`ReWOOAgent` — blueprint + evidence + answer agent
`toolsManager.py`	`ToolManager`, `ModuleTool`, `register` — tool registration
`skill_manager.py`	`SkillManager` — workflow-style skill management

Source: lazyllm/tools/agent/AGENTS.md.

ReAct Loop Flow

ReactAgent wraps FunctionCall in a Loop with a stop condition. On each iteration the LLM is asked to reason and either emit tool calls or a final string answer:

The agent builds history messages and injects them into locals['_lazyllm_agent']['workspace']. Source: lazyllm/tools/agent/AGENTS.md.
LLM output is parsed in _post_action. If tool_calls are present, the ToolManager executes them and returns a dict, continuing the loop. Otherwise a str is returned, triggering the loop's stop condition (isinstance(x, str)). Source: lazyllm/tools/agent/AGENTS.md.
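
The stop condition can be pictured with a small stand-alone loop. This is a minimal sketch under stated assumptions: `run_loop` and `fake_function_call` are hypothetical names invented here, and the fake step only imitates the dict-versus-str contract described above rather than calling a real LLM or ToolManager.

```python
from typing import Callable, Union

StepResult = Union[dict, str]


def run_loop(step: Callable[[StepResult], StepResult], first_input: str,
             max_rounds: int = 10) -> StepResult:
    """Repeat `step` until it returns a str, mirroring the stop condition above."""
    state: StepResult = first_input
    for _ in range(max_rounds):
        state = step(state)
        if isinstance(state, str):  # final answer: stop the loop
            return state
    return state  # still a dict: the round budget ran out


def fake_function_call(state: StepResult) -> StepResult:
    """Hypothetical stand-in for one reason-plus-tool round."""
    if isinstance(state, str):
        # First round: the "LLM" asks for a tool and the tool result is attached.
        return {"tool_calls": [{"name": "add", "args": (2, 3)}], "observation": 2 + 3}
    # Later round: enough information, so answer in plain text.
    return f"The sum is {state['observation']}."


print(run_loop(fake_function_call, "What is 2 + 3?"))
```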

The base agent LazyLLMAgentBase accepts parameters for LLM, tools, max retries, streaming, return-trace, skills, memory, sandbox, and file-system access. Source: lazyllm/tools/agent/base.py.

Tool Registration

Tools can be registered by inheriting ModuleTool (recommended for complex tools) — the class reads the apply method's docstring, type hints, and signature to construct an LLM-callable schema. Source: lazyllm/tools/agent/toolsManager.py.
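
To make the idea of deriving a schema from a signature concrete, here is a minimal sketch using only the standard library. It is not ModuleTool's implementation: `build_tool_schema`, the type mapping, and the output layout are assumptions for illustration, and it inspects a plain function rather than an `apply` method.

```python
import inspect
from typing import Any, get_type_hints

_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def build_tool_schema(func) -> dict:
    """Derive a function-calling style schema from a callable (illustrative)."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)
    doc = inspect.getdoc(func) or ""
    properties: dict[str, Any] = {}
    required = []
    for name, param in signature.parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": doc.splitlines()[0] if doc else "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def multiply(a: int, b: int = 2) -> int:
    """Multiply two integers.

    Args:
        a (int): first factor.
        b (int): second factor.

    Returns:
        int: the product.
    """
    return a * b


schema = build_tool_schema(multiply)
print(schema)
```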

Skill Management

SkillManager manages reusable workflows ("skills") with the get_skill / read_reference / run_script tool trio. The skill system enforces strict rules: agents must call get_skill first to retrieve SKILL.md, and reference/script paths must be copied verbatim from the skill's documentation — fabricated paths are forbidden. Source: lazyllm/tools/agent/skill_manager.py.

Command-Line Interface

The lazyllm CLI exposes six subcommands routed in lazyllm/cli/main.py:

lazyllm install [...] — install dependencies for a model/project. Source: lazyllm/cli/main.py.
lazyllm deploy <model> [...] — deploy an LLM service (e.g., VLLM deployments support restricted parameters; bypass via LAZYLLM_VLLM_SKIP_CHECK_KW=True). Source: lazyllm/cli/README.md.
lazyllm run [...] — run a project.
lazyllm skills <list|info|delete|add|import|install> [...] — manage skills, including installing them into a project or agent. Source: lazyllm/cli/skills.py.
lazyllm review --pr <number> [...] and lazyllm review-local [...] — multi-round AI code review for GitHub PRs or local git branches; the local variant diffs against a base branch via git merge-base and writes a JSON report. Source: lazyllm/cli/review.py.

RAG and Data Subsystems

LazyLLM integrates a complete RAG stack that includes:

Engineering: Horizontal scaling of RAG modules, multi-knowledge-base Q&A, and LazyRAG integration (V0.7). Source: README.md.
Data Capabilities: Table parsing, CAD image parsing, and pretrain data processing. Source: README.md.
Algorithm Capabilities: Structured-text processing (CSV), multi-hop retrieval, information-conflict handling, and agentic-RL problem solving. Source: README.md.

A typical RAG pipeline uses Document with Retriever, Reranker, and SentenceSplitter components wired together through pipeline and parallel flows. The README demonstrates online deployments combining OnlineEmbeddingModule with cosine/BM25 retrievers and a ModuleReranker. Source: README.md.

Prompt Templates and Data Lineage

LazyLLM ships a curated set of prompt templates in lazyllm/prompt_templates/prompts_actor/. The project tracks data lineage and licensing for these resources:

awesome-chatgpt-prompts-zh.json (124 Chinese prompts) — MIT licensed, sourced from PlexPt/awesome-chatgpt-prompts-zh, lightly reformatted. Source: lazyllm/prompt_templates/prompts_actor/README.md.
prompts.chat.json (1192 English prompts) — CC0-1.0 licensed, sourced from f/prompts.chat, with normalization and duplicate removal. Source: lazyllm/prompt_templates/prompts_actor/README.md.

"Lightly modified" in this context means key-name normalization, whitespace fixes, and minor wording adjustments — no wholesale rewriting of original content. Source: lazyllm/prompt_templates/prompts_actor/README.md.

Roadmap and Recent Milestones

Per the v0.7.1 release notes (the latest stable release referenced in the community context), recent milestones include:

Agent Module Refactor — major rewrite for maintainability.
New storage providers — Elasticsearch, OceanBase.
New online model providers — SiliconFlow, MiniMax.
Comprehensive caching system for performance gains.
Document parsing service and startup system refactors.

Source: Community release notes.

Open community feature requests (e.g., interleaved text+image content for OnlineModule(type='image_editing') in issue #1035) indicate ongoing evolution of online module capabilities, while documentation build issues (e.g., issue #655) reflect active investment in tutorial and learning material quality.

Components, Modules, and Flows

Related topics: LazyLLM Overview and System Architecture, RAG Pipeline, Document Processing, and Stores, Agents, Tools, Memory, and Online Model Integration

Overview

LazyLLM is a framework for building AI applications by composing reusable units. The project organizes its building blocks into three primary abstractions: Components (low-level utilities such as model downloaders and prompt templates), Modules (high-level wrappers that encapsulate models, services, and callable logic), and Flows (data-stream primitives that connect Modules and Components into executable graphs). Together they let developers "wrap functions, modules, flows, etc., into a Module" and assemble multi-agent applications with a Lego-like experience (README.md).

The framework emphasizes four goals that shape its design (README.md):

Convenient assembly — pipelines can be expressed declaratively with Flows.
One-click deployment — Modules can be promoted to services without rewriting.
Cross-platform compatibility — the same code runs on bare-metal, Slurm, and public clouds.
Unified experience — online and local model providers share a single interface.

Module Hierarchy

Modules in LazyLLM are typed wrappers. The README documents the canonical set and the capabilities each one offers (README.md):

Module	Purpose	Training/Fine-tuning	Deployment	Inference
UrlModule	Wraps any URL into a Module to access external services	❌	❌	✅
ServerModule	Wraps any function, flow, or Module into an API service	❌	✅	✅
TrainableModule	Trainable Module; all supported models are TrainableModules	✅	✅	✅
WebModule	Launches a multi-round dialogue interface service	❌	✅	❌
OnlineChatModule	Integrates online model fine-tuning and inference services	✅	✅	✅
OnlineEmbeddingModule	Integrates online Embedding model inference services	❌	✅	✅

These Modules build on lower-level Components. One example is model_mapping.py, which maps model identifiers to Hugging Face / ModelScope namespaces and to model-specific prompt keys (sos, soh, soa, stop_words, etc.) used for chat-template construction (lazyllm/components/utils/downloader/model_mapping.py). For example, the deepseek entry defines sos: '<｜begin▁of▁sentence｜>' and stop_words: ['<｜end▁of▁sentence｜>'] so that prompts and stop tokens are produced automatically (lazyllm/components/utils/downloader/model_mapping.py).
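
The role of such prompt keys can be sketched as follows. Only the `sos` and `stop_words` values come from the deepseek entry quoted above; the `soh` and `soa` values, the concatenation order, and the helper names are assumptions made for this example, not the library's actual template logic.

```python
from typing import TypedDict


class PromptKeys(TypedDict):
    sos: str          # start-of-sequence marker
    soh: str          # start of the human turn
    soa: str          # start of the assistant turn
    stop_words: list[str]


# sos and stop_words follow the deepseek entry quoted above;
# soh and soa are made-up placeholders for this example.
deepseek_like: PromptKeys = {
    "sos": "<｜begin▁of▁sentence｜>",
    "soh": "User: ",
    "soa": "Assistant: ",
    "stop_words": ["<｜end▁of▁sentence｜>"],
}


def build_prompt(keys: PromptKeys, user_text: str) -> str:
    """Concatenate the markers around the user's text (assumed ordering)."""
    return f"{keys['sos']}{keys['soh']}{user_text}\n{keys['soa']}"


def truncate_at_stop(keys: PromptKeys, generated: str) -> str:
    """Cut generated text at the earliest stop word, if any appears."""
    cut = len(generated)
    for word in keys["stop_words"]:
        index = generated.find(word)
        if index != -1:
            cut = min(cut, index)
    return generated[:cut]


print(build_prompt(deepseek_like, "Hi"))
print(truncate_at_stop(deepseek_like, "Hello!" + deepseek_like["stop_words"][0] + "ignored"))
```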

Flow System

A Flow is a data-stream primitive: it describes how a value is passed from one callable object to another. According to the project README, LazyLLM ships with Pipeline, Parallel, Diverter, Warp, IFS, and Loop flows, which together "can cover almost all application scenarios" (README.md). Flows are the mechanism by which complex graphs are assembled from Modules and Components without manual plumbing.

The Loop primitive is also the workhorse of the agent system: ReactAgent wraps FunctionCall inside a Loop, with a stop condition that fires when FunctionCall returns a str (final answer) instead of a dict (tool calls) (lazyllm/tools/agent/AGENTS.md). This means the same Loop abstraction is reused for both data-flow graphs and agent reasoning loops.

Agent Subsystem

Agents are first-class Modules that combine an LLM with a tool registry. The framework ships four agent implementations, each suited to a different reasoning style (lazyllm/tools/agent/AGENTS.md):

ReactAgent — Reason→Act→Observe loop; the default for general multi-step tool use.
PlanAndSolveAgent — Planner decomposes a task; Solver executes the plan.
ReWOOAgent — Planner emits a blueprint; Workers collect evidence in parallel; Solver returns the answer.
FunctionCallAgent — Deprecated single-shot tool caller; superseded by ReactAgent.

All four share FunctionCall as their inner execution unit. A single round follows this pattern (lazyllm/tools/agent/AGENTS.md):

flowchart TD
    A[Input] --> B[_build_history]
    B --> C[LLM reasoning]
    C --> D{tool_calls?}
    D -- yes --> E[ToolManager.execute]
    E --> F[dict: continue Loop]
    D -- no --> G[str: stop Loop]

The ReactAgent prompt template encodes the same loop explicitly: "Reason → Act → Observe → Reflect", with a hard rule of "at most one tool per action step" and a final-answer rule that breaks out of the loop (lazyllm/tools/agent/reactAgent.py). A _FORCE_SUMMARIZE_MSG is injected when the agent exhausts max_retries, telling the LLM to "Stop calling tools now and provide your final answer immediately" (lazyllm/tools/agent/reactAgent.py).
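
The retry budget can be sketched in isolation. In this hedged example the message text is the one quoted above, while `run_with_budget`, `stubborn_llm`, and the way tool results are appended are hypothetical and only illustrate the control flow.

```python
from typing import Callable, Union

# Wording taken from the message quoted above; the surrounding logic is assumed.
FORCE_SUMMARIZE_MSG = "Stop calling tools now and provide your final answer immediately"


def run_with_budget(llm: Callable[[list], Union[dict, str]], question: str,
                    max_retries: int = 3) -> str:
    """Run tool rounds up to max_retries, then demand a final answer."""
    messages = [question]
    for _ in range(max_retries):
        reply = llm(messages)
        if isinstance(reply, str):
            return reply
        messages.append(f"tool result for {reply['tool_calls']}")
    # Budget exhausted: append the summarize instruction and ask once more.
    messages.append(FORCE_SUMMARIZE_MSG)
    final = llm(messages)
    return final if isinstance(final, str) else "No final answer produced."


def stubborn_llm(messages: list) -> Union[dict, str]:
    """Hypothetical model that keeps calling tools until told to stop."""
    if messages[-1] == FORCE_SUMMARIZE_MSG:
        return f"Summary after {len(messages) - 2} tool rounds."
    return {"tool_calls": ["search"]}


print(run_with_budget(stubborn_llm, "Find the answer", max_retries=3))
```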

Tools are registered through ModuleTool and ToolManager. ModuleTool parses the function's docstring and type hints to build a Pydantic schema for the LLM, raising an error if the docstring return type and the Python return annotation disagree (lazyllm/tools/agent/toolsManager.py). When variable-argument functions are used, the schema falls back to the docstring types rather than the runtime signature (lazyllm/tools/agent/toolsManager.py). Built-in tools such as write_file are registered through the @register('builtin_tools', ...) decorator (lazyllm/tools/agent/file_tool.py).

Beyond built-in tools, users can install external Skills from GitHub with install_skill (lazyllm/tools/agent/skill_hub.py). The skill hub fetches the repository file tree via the Git Trees API, locates a SKILL.md, and exposes the skill's workflow to the agent. The skill manager's prompt enforces a strict prerequisite: read_reference and run_script may only be called after the agent has fetched the skill's SKILL.md, and rel_path values must be copied verbatim from that file (lazyllm/tools/agent/skill_manager.py).
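
The prerequisite rule can be modeled as a small guard. The sketch below is an assumption-laden illustration: in LazyLLM the rule is stated in the skill manager's prompt, whereas `SkillSession`, its methods' signatures, and the backtick-based path extraction are invented here to show the idea in code.

```python
import re


class SkillSession:
    """Hypothetical guard that mimics the prerequisite rules described above."""

    def __init__(self, skill_docs: dict[str, str]):
        self._skill_docs = skill_docs          # skill name -> SKILL.md text
        self._allowed_paths: dict[str, set[str]] = {}

    def get_skill(self, name: str) -> str:
        """Return SKILL.md and remember every path it mentions in backticks."""
        text = self._skill_docs[name]
        self._allowed_paths[name] = set(re.findall(r"`([^`]+)`", text))
        return text

    def read_reference(self, name: str, rel_path: str) -> str:
        """Serve a reference only after get_skill, and only for listed paths."""
        if name not in self._allowed_paths:
            raise PermissionError("call get_skill before read_reference")
        if rel_path not in self._allowed_paths[name]:
            raise ValueError(f"path not listed in SKILL.md: {rel_path}")
        return f"<contents of {rel_path}>"


session = SkillSession({"report": "Read `references/style.md` before writing."})
try:
    session.read_reference("report", "references/style.md")
except PermissionError as error:
    print("rejected:", error)

session.get_skill("report")
print(session.read_reference("report", "references/style.md"))
```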

CLI Surface

The framework exposes a unified CLI for the full lifecycle. lazyllm deploy starts a model service (e.g. lazyllm deploy llama2 --tp=2) (lazyllm/cli/README.md), and the top-level dispatcher routes the install, deploy, run, skills, review, and review-local subcommands (lazyllm/cli/main.py). The review subcommand performs multi-round AI code review on GitHub PRs, while review-local reviews a local repository, diffing the current branch against a base branch via git merge-base and writing the result to JSON (lazyllm/cli/review.py).

Community Notes

Feature parity for image editing: Issue #1035 reports that OnlineModule(type='image_editing') lacks interleaved text+image content support, a capability gap in the online modules that the Module table above does not capture.
Documentation rendering bugs: Issue #655 notes that several tutorial page headings fail to compile, which directly reduces the discoverability of the Flow and Module APIs documented here.
Release v0.7.1: The release notes flag a major Agent-module refactor, a relevant heads-up for anyone tracking the ReactAgent / FunctionCall code paths cited above.

RAG Pipeline, Document Processing, and Stores

Related topics: LazyLLM Overview and System Architecture, Components, Modules, and Flows, Agents, Tools, Memory, and Online Model Integration

Overview

LazyLLM provides a first-class Retrieval-Augmented Generation (RAG) stack that combines a Document index, pluggable node groups (splits), Retrievers, and Rerankers into a Flow-compatible pipeline. The RAG subsystem targets three goals: (1) support 20+ splitting strategies and many document types, (2) horizontally scale across multiple knowledge bases and machines, and (3) integrate at least one open-source knowledge-graph framework. Source: README.md

The v0.7.1 release expanded the storage ecosystem with Elasticsearch and OceanBase backends, added SiliconFlow and MiniMax as online providers, and refactored the document parsing service and launcher systems for better maintainability. The release also introduced a comprehensive caching layer that accelerates repeated RAG queries. Source: README.md

Architecture and Data Flow

A RAG application in LazyLLM is composed of four cooperating layers:

flowchart LR
    A[Raw files / URL] --> B[Document Parser]
    B --> C[Node Groups<br/>Sentences / CoarseChunk / KB]
    C --> D[Embedding / BM25 Index]
    D --> E[Retriever]
    E --> F[Reranker]
    F --> G[LLM Prompt + Answer]

Document owns the dataset, parsers, and one or more node groups. Source: lazyllm/tools/rag/document.py
Node groups are transformed views of the document (e.g. Sentences, CoarseChunk, knowledge-graph triples). Source: lazyllm/tools/rag/doc_node.py
Retriever is a callable that queries a node group with a similarity function. Source: lazyllm/tools/rag/retriever.py
Reranker reorders retrieved nodes before they are passed to the LLM. Source: lazyllm/tools/rag/rerank.py

A canonical end-to-end pipeline (from the project README) wires these layers with pipeline and parallel Flows:

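import lazyllm
from lazyllm import pipeline, parallel, bind, SentenceSplitter, Document, Retriever, Reranker

# Placeholder system prompt (an assumption here); the README defines its own wording.
prompt = "Answer the question using only the given context."

# Index a local directory and embed it with the default online embedding model.
documents = Document(
    dataset_path="your data path",
    embed=lazyllm.OnlineEmbeddingModule(),
    manager=False,
)
# Add a sentence-level view of the documents for dense retrieval.
documents.create_node_group(
    name="sentences",
    transform=SentenceSplitter,
    chunk_size=1024,
    chunk_overlap=100,
)

with pipeline() as ppl:
    # Run a dense and a BM25 retriever side by side and sum (concatenate) their hits.
    # The sub-flow is bound as ppl.prl, so its stages are assigned through that name.
    with parallel().sum as ppl.prl:
        ppl.prl.retriever1 = Retriever(documents, group_name="sentences",
                                       similarity="cosine", topk=3)
        ppl.prl.retriever2 = Retriever(documents, "CoarseChunk",
                                       "bm25_chinese", 0.003, topk=3)
    # Rerank the merged hits against the original query and keep the best one.
    ppl.reranker = Reranker("ModuleReranker", model="bge-reranker-large", topk=1) \
                   | bind(query=ppl.input)
    # Pack the surviving node text and the query into the prompt variables.
    ppl.formatter = (lambda nodes, query: dict(
        context_str="".join([node.get_content() for node in nodes]), query=query)) \
        | bind(query=ppl.input)
    # Answer with an online chat model, filling context_str into the prompt.
    ppl.llm = lazyllm.OnlineChatModule(stream=False).prompt(
        lazyllm.ChatPrompter(prompt, extra_keys=["context_str"]))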

Source: README.md

Document Processing

Document is the central entry point. It accepts a dataset_path (a local directory or a URL when used in client mode), an embed module, and an optional manager flag. The manager flag controls whether a built-in DocServer and UI are spawned. Source: examples/rag_with_parsing_service/README.md

Node groups and splitting strategies

LazyLLM exposes splitting strategies through transform callables. The default SentenceSplitter accepts chunk_size and chunk_overlap. Beyond sentence-level splits the system supports structured strategies such as CoarseChunk (used for BM25 retrieval in the demo) and a knowledge-graph extractor, with the stated goal of supporting "no less than 20 types" of splitters across the v0.6–v0.8 roadmap. Source: README.md
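
To show how chunk_size and chunk_overlap interact, the following minimal sketch splits text with a sliding character window. This is an assumption for illustration only: SentenceSplitter's real algorithm is not reproduced here, and `split_with_overlap` is a hypothetical helper.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Slide a window of chunk_size characters, stepping by size minus overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)
```

Each chunk repeats the last `chunk_overlap` characters of the previous one, so content that straddles a boundary still appears whole in at least one chunk when it is short enough.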

Standalone parsing service

For high-throughput or multi-process deployments, the parser can be detached into a service. DocumentProcessor(url=...) points a Document at a remote parser, disables local file-change monitoring, and requires a persistent store_conf. A pure in-memory map store cannot be shared across processes, so use OpenSearch, Milvus, Elasticsearch, OceanBase, etc. The example ships four scripts:

Script	Purpose
`server_with_worker.py`	Run parser server + worker in one process
`server_and_separate_workers.py`	Run parser server; start workers separately via `DocumentProcessorWorker`
`document.py`	Register a `Document` with the parsing service
`retriever_using_url.py`	Query the document remotely via its URL

Source: examples/rag_with_parsing_service/README.md

Embedding and online modules

Document accepts any callable that conforms to the embedding contract. OnlineEmbeddingModule is the zero-setup choice, and additional online providers (SiliconFlow, etc.) were added in v0.7.1. Source: README.md

Stores and Indexes

Stores hold both raw segments and indexed vectors. The default indexer is implemented in default_index.py and supports vector similarity, BM25 keyword search, and knowledge-graph lookups. Source: lazyllm/tools/rag/default_index.py

Backend	Type	Use case
Map (in-memory)	Vector / segment	Single-process demos; not for shared deployments
Milvus	Vector	Production vector search
OpenSearch	Vector + keyword	Hybrid search in distributed setups
Elasticsearch	Vector + keyword	Added in v0.7.1; horizontal scaling
OceanBase	Vector + keyword	Added in v0.7.1; SQL-compatible hybrid store

Source: README.md, examples/rag_with_parsing_service/README.md

A common pitfall: when manager=False is combined with a remote parser, store_conf must not be a pure map store, because map stores have no persistence and cannot be shared across processes. Source: examples/rag_with_parsing_service/README.md

Retrievers and Rerankers

Retriever(documents, group_name, similarity, topk) queries a single node group. The similarity argument selects the algorithm: "cosine" for dense vectors, or "bm25_chinese" (or "bm25") for keyword search. The README example also passes a similarity threshold (0.003) as a separate positional argument to the BM25 retriever. Multiple retrievers can be combined in parallel().sum to merge their hits. Source: lazyllm/tools/rag/retriever.py, README.md
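
What a dense retriever does with topk and a threshold can be sketched without any library. The code below is a toy under stated assumptions: `cosine`, `retrieve`, and the `cutoff` parameter are hypothetical names, and the 2-D vectors stand in for real embeddings.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, nodes, topk=3, cutoff=float("-inf")):
    """Score every node, drop scores below cutoff, return the topk best texts."""
    scored = [(cosine(query_vec, vec), text) for text, vec in nodes]
    kept = [pair for pair in scored if pair[0] >= cutoff]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in kept[:topk]]


# Toy 2-D "embeddings"; real ones come from an embedding model.
nodes = [("cats purr", [1.0, 0.0]), ("dogs bark", [0.0, 1.0]), ("kittens purr", [0.9, 0.1])]
print(retrieve([1.0, 0.0], nodes, topk=2, cutoff=0.5))
```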

Reranker(name, model, topk) wraps a model-based reranker. ModuleReranker uses a HuggingFace-compatible model such as bge-reranker-large; other registered backends plug in custom scorers. Because Rerankers accept and return node lists, they slot directly into a pipeline and can be bind-ed to the user query. Source: lazyllm/tools/rag/rerank.py

The v0.7.1 release also extended the RAG module with multi-hop retrieval (following links and references inside documents), information-conflict handling, AI Writer, and AI Review capabilities — these are exposed as additional retriever/reasoning components on top of the core pipeline. Source: README.md

Common Failure Modes and Gotchas

Map store in distributed mode — causes silent data loss across workers; switch to Milvus, OpenSearch, Elasticsearch, or OceanBase. Source: examples/rag_with_parsing_service/README.md
Parser URL unreachable — when DocumentProcessor(url=...) cannot reach the parser, registration and dataset_path monitoring are disabled; verify the URL and that the worker has started. Source: examples/rag_with_parsing_service/README.md
Splitter mismatch — calling Retriever with a group_name that does not exist on the Document raises immediately; always create the node group with create_node_group first. Source: lazyllm/tools/rag/document.py
Top-K tuning — dense and BM25 retrievers typically return overlapping but non-identical hits; merging via parallel().sum improves recall but inflates tokens, so set topk on the reranker to keep the prompt bounded. Source: README.md
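
The Top-K trade-off above can be illustrated with a short sketch: two hit lists are merged, and a reranker then caps how many nodes reach the prompt. Everything here is hypothetical, including the de-duplication step and the word-overlap scorer, which stands in for a model-based reranker.

```python
def merge_hits(*hit_lists: list[str]) -> list[str]:
    """Concatenate retriever outputs, keeping the first copy of each node."""
    seen, merged = set(), []
    for hits in hit_lists:
        for node in hits:
            if node not in seen:
                seen.add(node)
                merged.append(node)
    return merged


def rerank(nodes: list[str], query: str, topk: int) -> list[str]:
    """Toy scorer: rank by how many query words a node shares (assumption)."""
    words = set(query.lower().split())
    ranked = sorted(nodes, key=lambda node: len(words & set(node.lower().split())),
                    reverse=True)
    return ranked[:topk]


dense_hits = ["lazyllm supports rag", "flows connect modules"]
bm25_hits = ["flows connect modules", "rag needs a retriever"]
candidates = merge_hits(dense_hits, bm25_hits)
print(len(candidates), rerank(candidates, "how do flows connect modules", topk=1))
```

Merging raises the candidate count, while the reranker's `topk` is what finally bounds the context length.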

Agents, Tools, Memory, and Online Model Integration

Related topics: LazyLLM Overview and System Architecture, Components, Modules, and Flows, RAG Pipeline, Document Processing, and Stores

Overview

LazyLLM exposes a unified Agent surface that combines a set of reusable reasoning loops, a registry-based tool system, persistent memory and skills, and a pluggable online model layer. According to the project README, the framework targets "convenient AI application assembly" with one-click deployment and a consistent user experience across locally deployed and online models. Source: README.md:8-22.

The release notes for v0.7.1 highlight a "major change: Agent module refactor" and additions of new online model providers such as SiliconFlow and MiniMax, together with a comprehensive caching system. Source: README.md. Community issue #1035 reports that OnlineModule(type='image_editing') does not yet support interleaved text+image content, illustrating a known limitation of the online model integration layer.

Agent System

The Agent subsystem is implemented under lazyllm/tools/agent/. Per the directory's AGENTS guide, every concrete Agent inherits from LazyLLMAgentBase and delegates a single "reason + tool call" round to FunctionCall. Source: lazyllm/tools/agent/AGENTS.md:30-40.

Four Agent classes ship out of the box:

Agent	Working method	Typical use case
`ReactAgent`	Reason → Act → Observe loop until final answer	Multi-step tasks with tool use
`PlanAndSolveAgent`	Planner decomposes subtasks; Solver executes	Tasks needing upfront planning
`ReWOOAgent`	Planner generates a blueprint; Worker gathers evidence; Solver answers	Parallelizable evidence collection
`FunctionCallAgent`	Direct tool selection (deprecated, prefer `ReactAgent`)	Simple tool calls

Source: lazyllm/tools/agent/AGENTS.md:50-62.

ReactAgent wraps FunctionCall in a Loop, stopping when the output becomes a str (the final answer) and continuing while it remains a dict containing tool_calls. Source: lazyllm/tools/agent/AGENTS.md:18-30. The class prompt explicitly enforces "use at most one tool per action step" and "do not call any tools after you already have enough information to answer." Source: lazyllm/tools/agent/reactAgent.py:1-60.

The execution flow for one round is:

flowchart TD
    A[input] --> B[_build_history<br/>injects workspace locals]
    B --> C[LLM reasoning]
    C --> D{has tool_calls?}
    D -- yes --> E[ToolManager._execute_tool]
    E --> F[returns dict<br/>continue Loop]
    D -- no --> G[returns str<br/>stop Loop]

Conversation history is stored in locals['_lazyllm_agent']['workspace'] rather than instance attributes so that concurrent requests do not leak history across users. Source: lazyllm/tools/agent/AGENTS.md:30-46.
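
The isolation argument can be demonstrated with the standard library. This sketch uses `threading.local` as a stand-in and does not touch LazyLLM's own `locals` object; `request_locals`, `handle_request`, and the history format are assumptions for illustration.

```python
import threading

# Hypothetical per-request store; LazyLLM's own `locals` object is not used here.
request_locals = threading.local()
results = {}


def handle_request(user: str, turns: list[str]) -> None:
    """Keep history in thread-local storage so requests cannot see each other."""
    request_locals.workspace = {"history": []}
    for turn in turns:
        request_locals.workspace["history"].append(f"{user}: {turn}")
    results[user] = list(request_locals.workspace["history"])


threads = [
    threading.Thread(target=handle_request, args=("alice", ["hi", "book a flight"])),
    threading.Thread(target=handle_request, args=("bob", ["hello"])),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(results["alice"], results["bob"])
```

Had the history lived on a shared agent instance instead, both threads would have appended to the same list and each user could receive the other's turns.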

Tools and Skills

The ToolManager owns tool registration, schema generation, and execution. It wraps user tools in ModuleTool and generates an OpenAI function-calling tools_description from each tool's docstring. Source: lazyllm/tools/agent/AGENTS.md:96-118.

A tool's docstring must follow a strict format — first-line short description, an Args: block, type annotations, and a Returns: block — or the LLM cannot generate a valid schema. Source: lazyllm/tools/agent/AGENTS.md:78-94. Tools can be registered either by inheriting ModuleTool or by passing plain callables, and they live in a temporary group (tmp_tool) that is discarded after the call.

Complementing transient tools, SkillManager provides persistent, named skills that an Agent can recall mid-conversation. Source: lazyllm/tools/agent/skill_manager.py:1-30. The skill_manager prompt mandates a strict prerequisite: an Agent must call get_skill to load SKILL.md *before* using read_reference or run_script, and the rel_path argument must be copied verbatim from that document — fabrication is explicitly forbidden. Source: lazyllm/tools/agent/skill_manager.py:14-42.

The CLI exposes skill operations through lazyllm skills ..., supporting init, list, info, add, delete, import, and install --agent. Source: lazyllm/cli/skills.py:1-40. Top-level commands such as install, deploy, run, skills, review, and review-local are dispatched in lazyllm/cli/main.py:18-32. The deploy subcommand can launch local model servers (for example via vLLM with tensor parallelism). vLLM parameters are checked against an allow-list, which can be bypassed by setting LAZYLLM_VLLM_SKIP_CHECK_KW=True. Source: lazyllm/cli/README.md:1-40.

Online Model Integration

LazyLLM unifies locally deployed and online-hosted models behind the same Module API. The README documents OnlineChatModule (integrates online model fine-tuning and inference) and OnlineEmbeddingModule (online embedding inference only, with no training support per the Module table). Both are called through the same interface as their local counterparts. Source: README.md:58-72.

Per-model prompt tokens are stored in model_mapping.py, which defines prompt_keys (such as sos, soh, soa, eoa, stop_words, and system) for families including internlm, internlm2, chatglm3, glm-4, baichuan2, deepseek, and Llama-3. Source: lazyllm/components/utils/downloader/model_mapping.py:1-40.

Online provider configuration is sourced from ~/.lazyllm/config.json or environment variables such as LAZYLLM_OPENAI_API_KEY, as shown in the chatbot example in the README. Source: README.md:30-50. Memory itself is delivered as a built-in functional module that "supports memory capabilities," listed under Feature Modules. Source: README.md:96-108.
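
One way to picture the two configuration sources is a small lookup helper. This is a sketch under stated assumptions: the precedence (environment first, file second), the JSON key name `openai_api_key`, and the function `resolve_api_key` are all hypothetical and are not taken from LazyLLM's configuration code.

```python
import json
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(env: Mapping[str, str], config_path: Path,
                    env_name: str = "LAZYLLM_OPENAI_API_KEY",
                    config_key: str = "openai_api_key") -> Optional[str]:
    """Return the key from the environment, else from the JSON file, else None."""
    if env.get(env_name):
        return env[env_name]
    if config_path.is_file():
        # config_key is a made-up field name for this example.
        return json.loads(config_path.read_text(encoding="utf-8")).get(config_key)
    return None


# Demo with an in-memory environment so no real credentials are touched.
print(resolve_api_key({"LAZYLLM_OPENAI_API_KEY": "sk-demo"}, Path("missing.json")))
```

In real use the first argument would be `os.environ` and the second a path such as `Path.home() / ".lazyllm" / "config.json"`.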

The prompt library shipped at lazyllm/prompt_templates/prompts_actor/ aggregates 124 Chinese prompts from awesome-chatgpt-prompts-zh (MIT) and 1192 English prompts from prompts.chat (CC0-1.0), lightly normalized to fit the project's schema. Source: lazyllm/prompt_templates/prompts_actor/README.md:1-26.

Common Pitfalls

Bad tool docstrings. Tools without a properly structured Args: block cannot be invoked correctly by the LLM. Source: lazyllm/tools/agent/AGENTS.md:90-94.
Fabricated skill paths. read_reference and run_script must use paths copied verbatim from SKILL.md; any fabricated path violates the skill protocol. Source: lazyllm/tools/agent/skill_manager.py:18-30.
Online module limitations. OnlineModule(type='image_editing') does not yet accept interleaved text+image content — see community issue #1035.
vLLM parameter gating. Custom vLLM flags are rejected unless LAZYLLM_VLLM_SKIP_CHECK_KW=True is exported. Source: lazyllm/cli/README.md:14-30.

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 6 structured pitfall items, none of them high or blocking. Top priority: Capability evidence risk (requires verification).

1. Capability evidence risk: Capability evidence risk requires verification

Severity: medium
Finding: README/documentation is current enough for a first validation pass.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.assumptions | https://github.com/LazyAGI/LazyLLM

2. Maintenance risk: Maintenance risk requires verification

Severity: medium
Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/LazyAGI/LazyLLM

3. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: downstream_validation.risk_items | https://github.com/LazyAGI/LazyLLM

4. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: risks.scoring_risks | https://github.com/LazyAGI/LazyLLM

5. Maintenance risk: Maintenance risk requires verification

Severity: low
Finding: issue_or_pr_quality=unknown.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/LazyAGI/LazyLLM

6. Maintenance risk: Maintenance risk requires verification

Severity: low
Finding: release_recency=unknown.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/LazyAGI/LazyLLM

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using LazyLLM with real data or production workflows.

v0.7.1 - github / github_release
Capability evidence risk requires verification - GitHub / issue

Source: Project Pack community evidence and pitfall evidence
