Doramagic Project Pack · Human Manual
LazyLLM
Easiest and laziest way for building multi-agent LLMs applications.
LazyLLM Overview and System Architecture
Related topics: Components, Modules, and Flows, RAG Pipeline, Document Processing, and Stores, Agents, Tools, Memory, and Online Model Integration
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: Components, Modules, and Flows, RAG Pipeline, Document Processing, and Stores, Agents, Tools, Memory, and Online Model Integration
LazyLLM Overview and System Architecture
Purpose and Scope
LazyLLM is a low-code development tool for building multi-agent large language model (LLM) applications. It enables developers to assemble complex AI applications from reusable modules, flows, and components without deep knowledge of LLM infrastructure, prompt engineering, or deployment plumbing. Source: README.md.
The project positions itself around four design pillars documented in the README:
- Convenient AI Application Assembly — Lego-like composition of agents, data flows, and functional modules.
- One-Click Deployment — Lightweight gateway during POC, and one-click image packaging for production.
- Cross-Platform Compatibility — Single code path across bare-metal, dev machines, Slurm clusters, and public clouds.
- Unified User Experience — A single API surface for online (OpenAI, SenseNova, Kimi, ChatGLM, etc.) and locally deployed models. Source: README.md.
The application development lifecycle follows a prototype → data feedback → iterative optimization loop, where LazyLLM aims to support each stage — from rapid prototyping through fine-tuning and production deployment. Source: README.md.
High-Level System Architecture
LazyLLM is organized as a layered system. At the top, the CLI (lazyllm command) provides entry points for installation, deployment, running, skills, and code review. Source: lazyllm/cli/main.py. Below it, the Module system and Flow system form the core abstractions for building applications, while Agents, RAG infrastructure, and Deployment infrastructure sit on top as higher-level building blocks.
graph TB
subgraph CLI["CLI Layer (lazyllm cli/main.py)"]
Install["install"]
Deploy["deploy"]
Run["run"]
Skills["skills"]
Review["review / review-local"]
end
subgraph Core["Core Abstractions"]
Modules["Module System<br/>(ModuleBase, TrainableModule,<br/>OnlineChatModule, etc.)"]
Flows["Flow System<br/>(Pipeline, Parallel,<br/>Loop, IFS, Warp)"]
end
subgraph HighLevel["High-Level Components"]
Agents["Agent System<br/>(ReactAgent, PlanAndSolveAgent,<br/>ReWOOAgent, FunctionCall)"]
RAG["RAG Infrastructure<br/>(Document, Retriever,<br/>Reranker, Splitter)"]
DeployInfra["Deployment Infrastructure<br/>(ServerModule, WebModule,<br/>TrainableModule)"]
end
subgraph Backends["Backend Integrations"]
LocalModels["Local Inference<br/>(lightllm, vllm)"]
OnlineModels["Online Providers<br/>(OpenAI, SiliconFlow,<br/>MiniMax, SenseNova, ...)"]
Storage["Storage<br/>(Elasticsearch, OceanBase,<br/>Milvus, ChromaDB)"]
end
CLI --> Core
Core --> HighLevel
HighLevel --> BackendsCore Abstractions: Modules and Flows
Module System
The Module system is the foundational abstraction. LazyLLM provides a structured taxonomy of module types, each combining training, fine-tuning, serving, and deployment capabilities. Source: README.md.
| Module Type | Purpose | Train | Fine-tune | Serve | Deploy |
|---|---|---|---|---|---|
ModuleBase | Wrap any callable into a Module | — | — | — | — |
ActionModule | Trainable & deployable wrapper | ✅ | ✅ | ✅ | ✅ |
UrlModule | Wrap external URLs as Modules | ❌ | ❌ | ✅ | ✅ |
ServerModule | Wrap any callable as an API service | ❌ | ✅ | ✅ | ✅ |
TrainableModule | Base for all supported models | ✅ | ✅ | ✅ | ✅ |
WebModule | Multi-round dialogue interface | ❌ | ✅ | ❌ | ✅ |
OnlineChatModule | Online chat (training + inference) | ✅ | ✅ | ✅ | ✅ |
OnlineEmbeddingModule | Online embedding inference | ❌ | ✅ | ✅ | ✅ |
Source: README.md.
Flow System
Flows describe how data is passed between callable objects. LazyLLM ships with predefined flow primitives: Pipeline, Parallel, Diverter, Warp, IFS, and Loop. These can be composed recursively with Modules, Components, or any Python callable. The flow abstraction makes it simple to add, replace, and reorganize components without rewriting application code. Source: README.md.
Agent Subsystem
The lazyllm/tools/agent/ directory implements LazyLLM's Agent system. Source: lazyllm/tools/agent/AGENTS.md.
Core Agent Types
| File | Agent |
|---|---|
base.py | LazyLLMAgentBase — common base for all agents |
functionCall.py | FunctionCall / FunctionCallAgent — single-turn tool-call execution |
reactAgent.py | ReactAgent — ReAct loop agent |
planAndSolveAgent.py | PlanAndSolveAgent — plan-then-execute agent |
rewooAgent.py | ReWOOAgent — blueprint + evidence + answer agent |
toolsManager.py | ToolManager, ModuleTool, register — tool registration |
skill_manager.py | SkillManager — workflow-style skill management |
Source: lazyllm/tools/agent/AGENTS.md.
ReAct Loop Flow
ReactAgent wraps FunctionCall in a Loop with a stop condition. On each iteration the LLM is asked to reason and either emit tool calls or a final string answer:
- The agent builds history messages and injects them into
locals['_lazyllm_agent']['workspace']. Source: lazyllm/tools/agent/AGENTS.md. - LLM output is parsed in
_post_action. Iftool_callsare present, theToolManagerexecutes them and returns adict, continuing the loop. Otherwise astris returned, triggering the loop's stop condition (isinstance(x, str)). Source: lazyllm/tools/agent/AGENTS.md.
The base agent LazyLLMAgentBase accepts parameters for LLM, tools, max retries, streaming, return-trace, skills, memory, sandbox, and file-system access. Source: lazyllm/tools/agent/base.py.
Tool Registration
Tools can be registered by inheriting ModuleTool (recommended for complex tools) — the class reads the apply method's docstring, type hints, and signature to construct an LLM-callable schema. Source: lazyllm/tools/agent/toolsManager.py.
Skill Management
SkillManager manages reusable workflows ("skills") with the get_skill / read_reference / run_script tool trio. The skill system enforces strict rules: agents must call get_skill first to retrieve SKILL.md, and reference/script paths must be copied verbatim from the skill's documentation — fabricated paths are forbidden. Source: lazyllm/tools/agent/skill_manager.py.
Command-Line Interface
The lazyllm CLI exposes five subcommands routed in lazyllm/cli/main.py:
lazyllm install [...]— install dependencies for a model/project. Source: lazyllm/cli/main.py.lazyllm deploy <model> [...]— deploy an LLM service (e.g., VLLM deployments support restricted parameters; bypass viaLAZYLLM_VLLM_SKIP_CHECK_KW=True). Source: lazyllm/cli/README.md.lazyllm run [...]— run a project.lazyllm skills <list|info|delete|add|import|install> [...]— manage skills, including installing them into a project or agent. Source: lazyllm/cli/skills.py.lazyllm review --pr <number> [...]andlazyllm review-local [...]— multi-round AI code review for GitHub PRs or local git branches; the local variant diffs against a base branch viagit merge-baseand writes a JSON report. Source: lazyllm/cli/review.py.
RAG and Data Subsystems
LazyLLM integrates a complete RAG stack that includes:
- Engineering: Horizontal scaling of RAG modules, multi-knowledge-base Q&A, and LazyRAG integration (V0.7). Source: README.md.
- Data Capabilities: Table parsing, CAD image parsing, and pretrain data processing. Source: README.md.
- Algorithm Capabilities: Structured-text processing (CSV), multi-hop retrieval, information-conflict handling, and agentic-RL problem solving. Source: README.md.
A typical RAG pipeline uses Document with Retriever, Reranker, and SentenceSplitter components wired together through pipeline and parallel flows. The README demonstrates online deployments combining OnlineEmbeddingModule with cosine/B M25 retrievers and a ModuleReranker. Source: README.md.
Prompt Templates and Data Lineage
LazyLLM ships a curated set of prompt templates in lazyllm/prompt_templates/prompts_actor/. The project tracks data lineage and licensing for these resources:
- awesome-chatgpt-prompts-zh.json (124 Chinese prompts) — MIT licensed, sourced from PlexPt/awesome-chatgpt-prompts-zh, lightly reformatted. Source: lazyllm/prompt_templates/prompts_actor/README.md.
- prompts.chat.json (1192 English prompts) — CC0-1.0 licensed, sourced from f/prompts.chat, with normalization and duplicate removal. Source: lazyllm/prompt_templates/prompts_actor/README.md.
"Lightly modified" in this context means key-name normalization, whitespace fixes, and minor wording adjustments — no wholesale rewriting of original content. Source: lazyllm/prompt_templates/prompts_actor/README.md.
Roadmap and Recent Milestones
Per the v0.7.1 release notes (current latest stable referenced in community context), recent milestones include:
- Agent Module Refactor — major rewrite for maintainability.
- New storage providers — Elasticsearch, OceanBase.
- New online model providers — SiliconFlow, MiniMax.
- Comprehensive caching system for performance gains.
- Document parsing service and startup system refactors.
Source: Community release notes.
Open community feature requests (e.g., interleaved text+image content for OnlineModule(type='image_editing') in issue #1035) indicate ongoing evolution of online module capabilities, while documentation build issues (e.g., issue #655) reflect active investment in tutorial and learning material quality.
See Also
- Agent System and Tool Registration
- CLI Reference
- RAG Pipeline Guide
- Module and Flow Reference
Source: https://github.com/LazyAGI/LazyLLM / Human Manual
Components, Modules, and Flows
Related topics: LazyLLM Overview and System Architecture, RAG Pipeline, Document Processing, and Stores, Agents, Tools, Memory, and Online Model Integration
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: LazyLLM Overview and System Architecture, RAG Pipeline, Document Processing, and Stores, Agents, Tools, Memory, and Online Model Integration
Components, Modules, and Flows
Overview
LazyLLM is a framework for building AI applications by composing reusable units. The project organizes its building blocks into three primary abstractions: Components (low-level utilities such as model downloaders and prompt templates), Modules (high-level wrappers that encapsulate models, services, and callable logic), and Flows (data-stream primitives that connect Modules and Components into executable graphs). Together they let developers "wrap functions, modules, flows, etc., into a Module" and assemble multi-agent applications with a Lego-like experience (README.md).
The framework emphasizes four goals that shape its design (README.md):
- Convenient assembly — pipelines can be expressed declaratively with Flows.
- One-click deployment — Modules can be promoted to services without rewriting.
- Cross-platform compatibility — the same code runs on bare-metal, Slurm, and public clouds.
- Unified experience — online and local model providers share a single interface.
Module Hierarchy
Modules in LazyLLM are typed wrappers. The README documents the canonical set and the capabilities each one offers (README.md):
| Module | Purpose | Training | Fine-tune | Deploy |
|---|---|---|---|---|
| UrlModule | Wraps any URL into a Module to access external services | ❌ | ❌ | ✅ |
| ServerModule | Wraps any function, flow, or Module into an API service | ❌ | ✅ | ✅ |
| TrainableModule | Trainable Module; all supported models are TrainableModules | ✅ | ✅ | ✅ |
| WebModule | Launches a multi-round dialogue interface service | ❌ | ✅ | ❌ |
| OnlineChatModule | Integrates online model fine-tuning and inference services | ✅ | ✅ | ✅ |
| OnlineEmbeddingModule | Integrates online Embedding model inference services | ❌ | ✅ | ✅ |
These Modules are composed of lower-level Components, such as model_mapping.py, which maps model identifiers to Hugging Face / ModelScope namespaces and to model-specific prompt keys (sos, soh, soa, stop_words, etc.) for chat-template construction (lazyllm/components/utils/downloader/model_mapping.py). For example, the deepseek entry defines sos: '<|begin▁of▁sentence|>' and stop_words: ['<|end▁of▁sentence|>'] so that prompts and stop tokens are produced automatically (lazyllm/components/utils/downloader/model_mapping.py).
Flow System
A Flow is a data-stream primitive: it describes how a value is passed from one callable object to another. According to the project README, LazyLLM ships with Pipeline, Parallel, Diverter, Warp, IFS, and Loop flows, which together "can cover almost all application scenarios" (README.md). Flows are the mechanism by which complex graphs are assembled from Modules and Components without manual plumbing.
The Loop primitive is also the workhorse of the agent system: ReactAgent wraps FunctionCall inside a Loop, with a stop condition that fires when FunctionCall returns a str (final answer) instead of a dict (tool calls) (lazyllm/tools/agent/AGENTS.md). This means the same Loop abstraction is reused for both data-flow graphs and agent reasoning loops.
Agent Subsystem
Agents are first-class Modules that combine an LLM with a tool registry. The framework ships four agent implementations, each suited to a different reasoning style (lazyllm/tools/agent/AGENTS.md):
ReactAgent— Reason→Act→Observe loop; the default for general multi-step tool use.PlanAndSolveAgent— Planner decomposes a task; Solver executes the plan.ReWOOAgent— Planner emits a blueprint; Workers collect evidence in parallel; Solver returns the answer.FunctionCallAgent— Deprecated single-shot tool caller; superseded byReactAgent.
All four share FunctionCall as their inner execution unit. A single round follows this pattern (lazyllm/tools/agent/AGENTS.md):
flowchart TD
A[Input] --> B[_build_history]
B --> C[LLM reasoning]
C --> D{tool_calls?}
D -- yes --> E[ToolManager.execute]
E --> F[dict: continue Loop]
D -- no --> G[str: stop Loop]The ReactAgent prompt template encodes the same loop explicitly: "Reason → Act → Observe → Reflect", with a hard rule of "at most one tool per action step" and a final-answer rule that breaks out of the loop (lazyllm/tools/agent/reactAgent.py). A _FORCE_SUMMARIZE_MSG is injected when the agent exhausts max_retries, telling the LLM to "Stop calling tools now and provide your final answer immediately" (lazyllm/tools/agent/reactAgent.py).
Tools are registered through ModuleTool and ToolManager. ModuleTool parses the function's docstring and type hints to build a Pydantic schema for the LLM, raising an error if the docstring return type and the Python return annotation disagree (lazyllm/tools/agent/toolsManager.py). When variable-argument functions are used, the schema falls back to the docstring types rather than the runtime signature (lazyllm/tools/agent/toolsManager.py). Built-in tools such as write_file are registered through the @register('builtin_tools', ...) decorator (lazyllm/tools/agent/file_tool.py).
Beyond built-in tools, users can install external Skills from GitHub with install_skill (lazyllm/tools/agent/skill_hub.py). The skill hub fetches the repository file tree via the Git Trees API, locates a SKILL.md, and exposes the skill's workflow to the agent. The skill manager's prompt enforces a strict prerequisite: read_reference and run_script may only be called after the agent has fetched the skill's SKILL.md, and rel_path values must be copied verbatim from that file (lazyllm/tools/agent/skill_manager.py).
CLI Surface
The framework exposes a unified CLI for the full lifecycle. lazyllm deploy starts a model service (e.g. lazyllm deploy llama2 --tp=2) (lazyllm/cli/README.md), and the top-level dispatcher routes install, deploy, run, skills, review, and review-local subcommands (lazyllm/cli/main.py). The review subcommand performs multi-round AI code review on a local repository, diffing the current branch against a base using git merge-base and writing the result to JSON (lazyllm/cli/review.py).
Community Notes
- Feature parity for image editing — Issue #1035 reports that
OnlineModule(type='image_editing')lacks interleaved text+image content support, an example of the kind of capability gap that flows through the OnlineChatModule/OnlineEmbeddingModule table above. - Documentation rendering bugs — Issue #655 notes that several tutorial page headings fail to compile, which has a direct impact on discoverability of the Flow and Module APIs documented here.
- Release v0.7.1 — The release notes flag a major Agent-module refactor — a relevant heads-up for anyone tracking the
ReactAgent/FunctionCallcode paths cited above.
See Also
- README.md — top-level project overview.
- lazyllm/tools/agent/AGENTS.md — agent internals.
Source: https://github.com/LazyAGI/LazyLLM / Human Manual
RAG Pipeline, Document Processing, and Stores
Related topics: LazyLLM Overview and System Architecture, Components, Modules, and Flows, Agents, Tools, Memory, and Online Model Integration
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: LazyLLM Overview and System Architecture, Components, Modules, and Flows, Agents, Tools, Memory, and Online Model Integration
RAG Pipeline, Document Processing, and Stores
Overview
LazyLLM provides a first-class Retrieval-Augmented Generation (RAG) stack that combines a Document index, pluggable node groups (splits), Retrievers, and Rerankers into a Flow-compatible pipeline. The RAG subsystem targets three goals: (1) support 20+ splitting strategies and many document types, (2) horizontally scale across multiple knowledge bases and machines, and (3) integrate at least one open-source knowledge-graph framework. Source: README.md
The v0.7.1 release expanded the storage ecosystem with Elasticsearch and OceanBase backends, added SiliconFlow and additional online providers, and refactored the document parsing service and launcher systems for better maintainability. The release also introduced a comprehensive caching layer that accelerates repeated RAG queries. Source: README.md
Architecture and Data Flow
A RAG application in LazyLLM is composed of four cooperating layers:
flowchart LR
A[Raw files / URL] --> B[Document Parser]
B --> C[Node Groups<br/>Sentences / CoarseChunk / KB]
C --> D[Embedding / BM25 Index]
D --> E[Retriever]
E --> F[Reranker]
F --> G[LLM Prompt + Answer]- Document owns the dataset, parsers, and one or more node groups Source: lazyllm/tools/rag/document.py
- Node groups are transformed views of the document (e.g.
Sentences,CoarseChunk, knowledge-graph triples) Source: lazyllm/tools/rag/doc_node.py - Retriever is a callable that queries a node group with a similarity function Source: lazyllm/tools/rag/retriever.py
- Reranker reorders retrieved nodes before they are passed to the LLM Source: lazyllm/tools/rag/rerank.py
A canonical end-to-end pipeline (from the project README) wires these layers with pipeline and parallel Flows:
import lazyllm
from lazyllm import pipeline, parallel, bind, SentenceSplitter, Document, Retriever, Reranker
documents = Document(
dataset_path="your data path",
embed=lazyllm.OnlineEmbeddingModule(),
manager=False,
)
documents.create_node_group(
name="sentences",
transform=SentenceSplitter,
chunk_size=1024,
chunk_overlap=100,
)
with pipeline() as ppl:
with parallel().sum as ppl.prl:
prl.retriever1 = Retriever(documents, group_name="sentences",
similarity="cosine", topk=3)
prl.retriever2 = Retriever(documents, "CoarseChunk",
"bm25_chinese", 0.003, topk=3)
ppl.reranker = Reranker("ModuleReranker", model="bge-reranker-large", topk=1) \
| bind(query=ppl.input)
ppl.formatter = (lambda nodes, query: dict(
context_str="".join([node.get_content() for node in nodes]), query=query)) \
| bind(query=ppl.input)
ppl.llm = lazyllm.OnlineChatModule(stream=False).prompt(
lazyllm.ChatPrompter(prompt, extra_keys=["context_str"]))
Source: README.md:0-0
Document Processing
Document is the central entry point. It accepts a dataset_path (a local directory or a URL when used in client mode), an embed module, and an optional manager flag. The manager flag controls whether a built-in DocServer and UI are spawned. Source: examples/rag_with_parsing_service/README.md
Node groups and splitting strategies
LazyLLM exposes splitting strategies through transform callables. The default SentenceSplitter accepts chunk_size and chunk_overlap. Beyond sentence-level splits the system supports structured strategies such as CoarseChunk (used for BM25 retrieval in the demo) and a knowledge-graph extractor, with the stated goal of supporting "no less than 20 types" of splitters across the v0.6–v0.8 roadmap. Source: README.md
Standalone parsing service
For high-throughput or multi-process deployments, the parser can be detached into a service. DocumentProcessor(url=...) points a Document at a remote parser, disables local file-change monitoring, and requires a persistent store_conf (a pure in-memory map store cannot be shared across processes — use OpenSearch, Milvus, Elasticsearch, OceanBase, etc.). The example ships three scripts:
| Script | Purpose |
|---|---|
server_with_worker.py | Run parser server + worker in one process |
server_and_separate_workers.py | Run parser server; start workers separately via DocumentProcessorWorker |
document.py | Register a Document with the parsing service |
retriever_using_url.py | Query the document remotely via its URL |
Source: examples/rag_with_parsing_service/README.md
Embedding and online modules
Document accepts any callable that conforms to the embedding contract. OnlineEmbeddingModule is the zero-setup choice, and additional online providers (SiliconFlow, etc.) were added in v0.7.1. Source: README.md
Stores and Indexes
Stores hold both raw segments and indexed vectors. The default indexer is implemented in default_index.py and supports vector similarity, BM25 keyword search, and knowledge-graph lookups. Source: lazyllm/tools/rag/default_index.py
| Backend | Type | Use case |
|---|---|---|
| Map (in-memory) | Vector / segment | Single-process demos; not for shared deployments |
| Milvus | Vector | Production vector search |
| OpenSearch | Vector + keyword | Hybrid search in distributed setups |
| Elasticsearch | Vector + keyword | Added in v0.7.1; horizontal scaling |
| OceanBase | Vector + keyword | Added in v0.7.1; SQL-compatible hybrid store |
Source: README.md, examples/rag_with_parsing_service/README.md
A common pitfall: when manager=False is combined with a remote parser, store_conf must not be a pure map store, because map stores have no persistence and cannot be shared across processes. Source: examples/rag_with_parsing_service/README.md
Retrievers and Rerankers
Retriever(documents, group_name, similarity, topk) queries a single node group. The similarity argument selects the algorithm — "cosine" for dense vectors, "bm25_chinese" (or "bm25") for keyword search, plus a similarity threshold such as 0.003. Multiple retrievers can be combined in parallel().sum to merge their hits. Source: lazyllm/tools/rag/retriever.py, README.md
Reranker(name, model, topk) wraps a model-based reranker. ModuleReranker uses a HuggingFace-compatible model such as bge-reranker-large; other registered backends plug in custom scorers. Because Rerankers accept and return node lists, they slot directly into a pipeline and can be bind-ed to the user query. Source: lazyllm/tools/rag/rerank.py
The v0.7.1 release also extended the RAG module with multi-hop retrieval (following links and references inside documents), information-conflict handling, AI Writer, and AI Review capabilities — these are exposed as additional retriever/reasoning components on top of the core pipeline. Source: README.md
Common Failure Modes and Gotchas
- Map store in distributed mode — causes silent data loss across workers; switch to Milvus, OpenSearch, Elasticsearch, or OceanBase. Source: examples/rag_with_parsing_service/README.md
- Parser URL unreachable — when
DocumentProcessor(url=...)cannot reach the parser, registration anddataset_pathmonitoring are disabled; verify the URL and that the worker has started. Source: examples/rag_with_parsing_service/README.md - Splitter mismatch — calling
Retrieverwith agroup_namethat does not exist on theDocumentraises immediately; always create the node group withcreate_node_groupfirst. Source: lazyllm/tools/rag/document.py - Top-K tuning — dense and BM25 retrievers typically return overlapping but non-identical hits; merging via
parallel().sumimproves recall but inflates tokens, so settopkon the reranker to keep the prompt bounded. Source: README.md
See Also
- Agents, Tools, and Skills — the agent layer that often consumes RAG retrievers as tools.
- CLI and Deployment —
lazyllm deployand theinstall/run/skillscommands for packaging RAG services. - Flows and Modules —
pipeline,parallel,bind, and theModuletable that definesTrainableModule,OnlineChatModule, etc.
Source: https://github.com/LazyAGI/LazyLLM / Human Manual
Agents, Tools, Memory, and Online Model Integration
Related topics: LazyLLM Overview and System Architecture, Components, Modules, and Flows, RAG Pipeline, Document Processing, and Stores
Continue reading this section for the full explanation and source context.
Related Pages
Related topics: LazyLLM Overview and System Architecture, Components, Modules, and Flows, RAG Pipeline, Document Processing, and Stores
Agents, Tools, Memory, and Online Model Integration
Overview
LazyLLM exposes a unified Agent surface that combines a set of reusable reasoning loops, a registry-based tool system, persistent memory and skills, and a pluggable online model layer. According to the project README, the framework targets "convenient AI application assembly" with one-click deployment and a consistent user experience across locally deployed and online models. Source: README.md:8-22.
The release notes for v0.7.1 highlight a "major change: Agent module refactor" and additions of new online model providers such as SiliconFlow and MiniMax, together with a comprehensive caching system. Source: README.md. Community issue #1035 reports that OnlineModule(type='image_editing') does not yet support interleaved text+image content, illustrating a known limitation of the online model integration layer.
Agent System
The Agent subsystem is implemented under lazyllm/tools/agent/. Per the directory's AGENTS guide, every concrete Agent inherits from LazyLLMAgentBase and delegates a single "reason + tool call" round to FunctionCall. Source: lazyllm/tools/agent/AGENTS.md:30-40.
Four Agent classes ship out of the box:
| Agent | Working method | Typical use case |
|---|---|---|
ReactAgent | Reason → Act → Observe loop until final answer | Multi-step tasks with tool use |
PlanAndSolveAgent | Planner decomposes subtasks; Solver executes | Tasks needing upfront planning |
ReWOOAgent | Planner generates a blueprint; Worker gathers evidence; Solver answers | Parallelizable evidence collection |
FunctionCallAgent | Direct tool selection (deprecated, prefer ReactAgent) | Simple tool calls |
Source: lazyllm/tools/agent/AGENTS.md:50-62.
ReactAgent wraps FunctionCall in a Loop, stopping when the output becomes a str (the final answer) and continuing while it remains a dict containing tool_calls. Source: lazyllm/tools/agent/AGENTS.md:18-30. The class prompt explicitly enforces "use at most one tool per action step" and "do not call any tools after you already have enough information to answer." Source: lazyllm/tools/agent/reactAgent.py:1-60.
The execution flow for one round is:
flowchart TD
A[input] --> B[_build_history<br/>injects workspace locals]
B --> C[LLM reasoning]
C --> D{has tool_calls?}
D -- yes --> E[ToolManager._execute_tool]
E --> F[returns dict<br/>continue Loop]
D -- no --> G[returns str<br/>stop Loop]Conversation history is stored in locals['_lazyllm_agent']['workspace'] rather than instance attributes so that concurrent requests do not leak history across users. Source: lazyllm/tools/agent/AGENTS.md:30-46.
Tools and Skills
The ToolManager owns tool registration, schema generation, and execution. It wraps user tools in ModuleTool and generates an OpenAI function-calling tools_description from each tool's docstring. Source: lazyllm/tools/agent/AGENTS.md:96-118.
A tool's docstring must follow a strict format — first-line short description, an Args: block, type annotations, and a Returns: block — or the LLM cannot generate a valid schema. Source: lazyllm/tools/agent/AGENTS.md:78-94. Tools can be registered either by inheriting ModuleTool or by passing plain callables, and they live in a temporary group (tmp_tool) that is discarded after the call.
Complementing transient tools, SkillManager provides persistent, named skills that an Agent can recall mid-conversation. Source: lazyllm/tools/agent/skill_manager.py:1-30. The skill_manager prompt mandates a strict prerequisite: an Agent must call get_skill to load SKILL.md *before* using read_reference or run_script, and the rel_path argument must be copied verbatim from that document — fabrication is explicitly forbidden. Source: lazyllm/tools/agent/skill_manager.py:14-42.
The CLI exposes skill operations through lazyllm skills ..., supporting init, list, info, add, delete, import, and install --agent. Source: lazyllm/cli/skills.py:1-40. Top-level commands such as install, deploy, run, skills, review, and review-local are dispatched in lazyllm/cli/main.py:18-32. The deploy subcommand can launch local model servers (for example via vLLM with tensor parallelism), and is governed by an allow-list governed by LAZYLLM_VLLM_SKIP_CHECK_KW. Source: lazyllm/cli/README.md:1-40.
Online Model Integration
LazyLLM unifies locally trained and hosted models behind the same Module API. The README documents OnlineChatModule (integrates online model fine-tuning and inference) and OnlineEmbeddingModule (online embedding inference), both of which support training, inference, deployment, and serving in the same way as their local counterparts. Source: README.md:58-72.
Per-model prompt tokens are stored in model_mapping.py, which defines prompt_keys (such as sos, soh, soa, eoa, stop_words, and system) for families including internlm, internlm2, chatglm3, glm-4, baichuan2, deepseek, and Llama-3. Source: lazyllm/components/utils/downloader/model_mapping.py:1-40.
Online provider configuration is sourced from ~/.lazyllm/config.json or environment variables such as LAZYLLM_OPENAI_API_KEY, as shown in the chatbot example in the README. Source: README.md:30-50. Memory itself is delivered as a built-in functional module that "supports memory capabilities," listed under Feature Modules. Source: README.md:96-108.
The prompt library shipped at lazyllm/prompt_templates/prompts_actor/ aggregates 124 Chinese prompts from awesome-chatgpt-prompts-zh (MIT) and 1192 English prompts from prompts.chat (CC0-1.0), lightly normalized to fit the project's schema. Source: lazyllm/prompt_templates/prompts_actor/README.md:1-26.
Common Pitfalls
- Bad tool docstrings. Tools without a properly structured
Args:block cannot be invoked correctly by the LLM. Source: lazyllm/tools/agent/AGENTS.md:90-94. - Fabricated skill paths.
read_referenceandrun_scriptmust use paths copied verbatim fromSKILL.md; any fabricated path violates the skill protocol. Source: lazyllm/tools/agent/skill_manager.py:18-30. - Online module limitations.
OnlineModule(type='image_editing')does not yet accept interleaved text+image content — see community issue #1035. - vLLM parameter gating. Custom vLLM flags are rejected unless
LAZYLLM_VLLM_SKIP_CHECK_KW=Trueis exported. Source: lazyllm/cli/README.md:14-30.
See Also
Source: https://github.com/LazyAGI/LazyLLM / Human Manual
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
May increase setup, validation, or first-run risk for the user.
May increase setup, validation, or first-run risk for the user.
May increase setup, validation, or first-run risk for the user.
May increase setup, validation, or first-run risk for the user.
Doramagic Pitfall Log
Found 6 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Capability evidence risk - Capability evidence risk requires verification.
1. Capability evidence risk: Capability evidence risk requires verification
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.assumptions | https://github.com/LazyAGI/LazyLLM
2. Maintenance risk: Maintenance risk requires verification
- Severity: medium
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/LazyAGI/LazyLLM
3. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: downstream_validation.risk_items | https://github.com/LazyAGI/LazyLLM
4. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: risks.scoring_risks | https://github.com/LazyAGI/LazyLLM
5. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: issue_or_pr_quality=unknown。
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/LazyAGI/LazyLLM
6. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: release_recency=unknown。
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | https://github.com/LazyAGI/LazyLLM
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Count of project-level external discussion links exposed on this manual page.
Open the linked issues or discussions before treating the pack as ready for your environment.
Community Discussion Evidence
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using LazyLLM with real data or production workflows.
- v0.7.1 - github / github_release
- Capability evidence risk requires verification - GitHub / issue
Source: Project Pack community evidence and pitfall evidence