# https://github.com/SylphAI-Inc/AdalFlow Project Manual

Generated at: 2026-06-26 23:31:14 UTC

## Table of Contents

- [Overview & Core Architecture](#page-1)
- [Agent, Runner & Model Integration](#page-2)
- [Auto-Optimization & Training](#page-3)
- [Retrieval, Tracing & Evaluation](#page-4)

<a id='page-1'></a>

## Overview & Core Architecture

### Related Pages

Related topics: [Agent, Runner & Model Integration](#page-2), [Auto-Optimization & Training](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/README.md)
- [adalflow/README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/README.md)
- [adalflow/adalflow/optim/README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/optim/README.md)
- [adalflow/adalflow/components/agent/README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/README.md)
- [adalflow/adalflow/components/agent/agent.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/agent.py)
- [adalflow/adalflow/components/agent/react.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/react.py)
- [adalflow/adalflow/components/agent/prompts.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/prompts.py)
</details>

AdalFlow is a PyTorch-like library for building and auto-optimizing any large language model (LLM) workflow, from chatbots and RAG pipelines to autonomous agents. As stated in the top-level [README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/README.md), the framework unifies textual gradient optimization, few-shot bootstrapping, and instruction tuning into a single composable stack.

## Purpose and Scope

AdalFlow provides developers with two fundamental base classes — `Component` for the pipeline and `DataClass` for structured data interaction with LLMs — yielding minimal abstraction and maximum customizability. The project draws explicit inspiration from PyTorch, Micrograd, TextGrad, DSPy, OPRO, and PyTorch Lightning ([README.md: Acknowledgements](https://github.com/SylphAI-Inc/AdalFlow/blob/main/README.md)).

The library targets three primary use cases:

| Use Case | Description |
|----------|-------------|
| **Task Pipelines** | Chatbots, translation, summarization, code generation, classification, NER |
| **RAG Workflows** | Retrieval-augmented generation with custom retrievers and embedders |
| **Autonomous Agents** | Tool-using agents with planning, reflection, and multi-step reasoning |

Source: [adalflow/README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/README.md)

## High-Level Architecture

AdalFlow's architecture is organized in three layers: a Core layer (Component, DataClass, Parameter, Generator), an Optimization layer (AdalComponent, Trainer, Textual Gradient Descent optimizers), and an Agent layer (ReAct planner + ToolManager). Beneath the Core layer, the `Generator` reaches interchangeable model providers through a `ModelClient`.

```mermaid
graph TB
    subgraph Agent["Agent Layer"]
        A[Agent / ReAct]
        TM[ToolManager]
        A --> TM
    end
    
    subgraph Optimization["Optimization Layer"]
        T[Trainer]
        AC[AdalComponent]
        GD[Textual Gradient Descent]
        T --> AC
        AC --> GD
    end
    
    subgraph Core["Core Layer"]
        G[Generator]
        C[Component]
        DC[DataClass]
        P[Parameter]
        G --> C
        G --> P
        G --> DC
    end
    
    subgraph Providers["Model Providers"]
        MC[ModelClient]
        OAI[OpenAI]
        GRQ[Groq]
        AWS[AWS Bedrock]
        MC --> OAI
        MC --> GRQ
        MC --> AWS
    end
    
    Agent --> Optimization
    Optimization --> Core
    G --> MC
```

Source: [adalflow/adalflow/components/agent/agent.py:23-42](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/agent.py), [adalflow/adalflow/optim/README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/optim/README.md)

## Core Layer: Component, DataClass, Parameter, and Generator

The core layer mirrors PyTorch's design patterns:

- **`Component`**: the foundational building block for any pipeline node. It supports serialization, visualization, and composition (analogous to `nn.Module`).
- **`DataClass`**: provides typed schemas (`__input_fields__` and `__output_fields__`) for structured I/O between the pipeline and LLMs, inspired by DSPy ([README.md: Acknowledgements](https://github.com/SylphAI-Inc/AdalFlow/blob/main/README.md)).
- **`Parameter`**: wraps prompts, demonstrations, and other optimizable artifacts with a `param_type` (e.g., `ParameterType.PROMPT`, `ParameterType.DEMOS`) and a `requires_opt` flag. Prompt templates are rendered through Jinja and surfaced via `generator.print_prompt(...)` ([adalflow/README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/README.md)).
- **`Generator`**: combines a `model_client`, a Jinja `Prompt` template, and `output_processors` (typically `JsonOutputParser`) to produce typed outputs. The same `Generator` can swap providers without changing pipeline logic.

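```python
from adalflow.core import Generator
from adalflow.components.model_client import OpenAIClient

# Excerpt from a Component's __init__: `qa_template` (a Jinja template string)
# and `parser` (an output parser) are assumed to be defined earlier.
self.generator = Generator(
    model_client=OpenAIClient(),
    model_kwargs={"model": "gpt-3.5-turbo"},
    template=qa_template,
    prompt_kwargs={"output_format_str": parser.format_instructions()},
    output_processors=parser,
)
```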

Source: [adalflow/README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/README.md)
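
To make the `nn.Module` analogy concrete, the following is a minimal sketch (not AdalFlow's implementation) of a base class that registers child components on assignment so a pipeline can be walked and serialized. `MiniComponent`, `named_components`, and `to_dict` are hypothetical names chosen for this illustration.

```python
from typing import Any, Dict, Iterator, Tuple


class MiniComponent:
    """Hypothetical stand-in for a Component-like base class (not the AdalFlow API)."""

    def __init__(self) -> None:
        # Bypass our own __setattr__ hook so the registry exists before it runs.
        object.__setattr__(self, "_children", {})

    def __setattr__(self, name: str, value: Any) -> None:
        # Register any attribute that is itself a component.
        if isinstance(value, MiniComponent):
            self._children[name] = value
        object.__setattr__(self, name, value)

    def named_components(self, prefix: str = "") -> Iterator[Tuple[str, "MiniComponent"]]:
        # Depth-first walk over the component tree.
        for name, child in self._children.items():
            path = f"{prefix}{name}"
            yield path, child
            yield from child.named_components(prefix=f"{path}.")

    def to_dict(self) -> Dict[str, Any]:
        # Serialize the tree structure by class name.
        return {
            "type": type(self).__name__,
            "children": {n: c.to_dict() for n, c in self._children.items()},
        }


class Retriever(MiniComponent):
    pass


class Answerer(MiniComponent):
    pass


class RAG(MiniComponent):
    def __init__(self) -> None:
        super().__init__()
        self.retriever = Retriever()
        self.answerer = Answerer()


pipeline = RAG()
print([name for name, _ in pipeline.named_components()])
print(pipeline.to_dict())
```

Registering children at assignment time is what lets a framework discover nested parameters and print a pipeline's structure without extra bookkeeping from the user.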

## Optimization Layer: AdalComponent, Trainer, and Textual Gradients

The optimization layer borrows directly from PyTorch Lightning. Any class inheriting from `GradComponent` acts like a differentiable layer with `forward` and `backward` functions; it can be used as a loss function or as an optimizable node ([adalflow/adalflow/optim/README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/optim/README.md)).

The `Trainer` coordinates:

1. A **teacher LLM** that produces candidate prompts.
2. An **eval metric** (accuracy, BLEU, etc.) that becomes the loss signal.
3. One of several **optimizers**, including `TGD` (Textual Gradient Descent), `TGD with Past Instructions` (à la OPRO), `BootstrapFewShot`, and `TSGD-M` (Textual Gradient Descent with Momentum). The loss is typically converted to textual feedback before being passed to the optimizer ([adalflow/adalflow/optim/README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/optim/README.md)).

## Agent Layer

The agent module ships two complementary implementations:

- **`Agent`** ([adalflow/adalflow/components/agent/agent.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/agent.py)) — a ReAct-style planner composed of a `Generator` (planner) plus a `ToolManager`. It tracks step history and supports thinking-model LLMs by toggling `include_fields` between `["thought", "name", "kwargs", "_is_answer_final", "_answer"]` and the thinking-model subset.
- **ReAct** ([adalflow/adalflow/components/agent/react.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/react.py)) — a slim variant that always emits `name`/`kwargs` and ends each run with a `finish` action, suitable for multi-hop retrieval.

Both consume a common prompt template ([adalflow/adalflow/components/agent/prompts.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/prompts.py)) that interleaves the role description, tool catalog, output schema, chat history, context variables, and step history.

## Known Failure Modes and Community Issues

Several community-reported issues align with the architecture's boundaries:

- **Provider-specific kwargs** — issue [#481](https://github.com/SylphAI-Inc/AdalFlow/issues/481) reports the quickstart Colab failing because OpenAI's Responses API (`o3-mini`) rejects `frequency_penalty`; this surfaces the need to filter `model_kwargs` per provider.
- **Optional `Function.args`** — issue [#479](https://github.com/SylphAI-Inc/AdalFlow/issues/479) describes a `TypeError: argument after * must be an iterable, not NoneType` raised when an LLM returns only `kwargs`; downstream code unpacks `*func.args` unconditionally.
- **MCP integration** — issue [#386](https://github.com/SylphAI-Inc/AdalFlow/issues/386) requests first-class Model Context Protocol support inside `ToolManager`, which is currently a manually-managed registry.
- **AWS Bedrock coverage** — issues [#201](https://github.com/SylphAI-Inc/AdalFlow/issues/201) and [#283](https://github.com/SylphAI-Inc/AdalFlow/issues/283) highlight gaps in credential setup and `modelId` mapping for the Bedrock client.
- **Exposed secrets** — issue [#489](https://github.com/SylphAI-Inc/AdalFlow/issues/489) flags credential-handling risk; users should prefer environment variables over in-repo secrets.

Source: [adalflow/CHANGELOG.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/CHANGELOG.md) (latest tagged release: v1.1.3).
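
One way to handle the provider-specific kwargs problem is to strip unsupported keys before a request is built. The sketch below assumes a hand-maintained deny-list; `UNSUPPORTED_KWARGS`, the `"openai_responses"` key, and `filter_model_kwargs` are hypothetical names for illustration and are not part of AdalFlow.

```python
from typing import Any, Dict, FrozenSet

# Hypothetical deny-list for illustration; consult the provider's documentation
# for the parameters a given endpoint actually rejects.
UNSUPPORTED_KWARGS: Dict[str, FrozenSet[str]] = {
    "openai_responses": frozenset({"frequency_penalty"}),
}


def filter_model_kwargs(provider: str, model_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of model_kwargs without keys the provider is known to reject."""
    blocked = UNSUPPORTED_KWARGS.get(provider, frozenset())
    return {k: v for k, v in model_kwargs.items() if k not in blocked}


raw = {"model": "o3-mini", "frequency_penalty": 0.0}
print(filter_model_kwargs("openai_responses", raw))
```

Returning a filtered copy leaves the caller's dictionary untouched, so the same `model_kwargs` can be reused with a provider that does accept the removed keys.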

## See Also

- [Tutorials](https://adalflow.sylph.ai/tutorials/index.html)
- [Class Hierarchy](https://adalflow.sylph.ai/tutorials/class_hierarchy.html)
- [Supported Model Clients](https://adalflow.sylph.ai/apis/components/components.model_client.html)
- [Supported Retrievers](https://adalflow.sylph.ai/apis/components/components.retriever.html)
- [API Reference](https://adalflow.sylph.ai/apis/index.html)

---

<a id='page-2'></a>

## Agent, Runner & Model Integration

### Related Pages

Related topics: [Overview & Core Architecture](#page-1), [Auto-Optimization & Training](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [adalflow/adalflow/components/agent/agent.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/agent.py)
- [adalflow/adalflow/components/agent/runner.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/runner.py)
- [adalflow/adalflow/components/agent/react.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/react.py)
- [adalflow/adalflow/components/agent/prompts.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/prompts.py)
- [adalflow/adalflow/components/agent/README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/README.md)
- [README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/README.md)
- [adalflow/adalflow/optim/README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/optim/README.md)
</details>

AdalFlow's agent subsystem provides a ReAct-style autonomous agent, an orchestrating runner, and a model-agnostic integration layer that ties tool execution to LLM-driven planning. Together, these three pieces let developers build chatbots, RAG pipelines, and multi-step tool-using agents from a single `Component`-based interface. Source: [README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/README.md) and [adalflow/adalflow/components/agent/README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/README.md).

## Overview and Scope

The agent module implements the four canonical agent design patterns catalogued by the maintainers (Reflection, Tool use, Planning, and Multi-agent collaboration), but its runtime ships a ReAct (Reasoning + Acting) planner as the default. Source: [adalflow/adalflow/components/agent/README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/README.md). The release notes confirm that `v1.0.5a1` introduced the agent, runner, MCP tools, and an updated generator output contract. Source: [Release v1.0.5a1](https://github.com/SylphAI-Inc/AdalFlow/releases/tag/v1.0.5a1).

The three primary abstractions are:

| Abstraction | Class | Responsibility |
|--------------|-------|----------------|
| Agent | `Agent` in `agent.py` | High-level ReAct loop, owns planner + tool_manager |
| Runner | `Runner` in `runner.py` | Drives one or more agents through a user query, manages streaming and results |
| Model Integration | `Generator` + `ModelClient` | Vendor-agnostic LLM/reasoning call interface |

## Agent Architecture

The `Agent` class is built from two collaborating components: a `Generator`-based planner and a `ToolManager`. The planner produces a structured `Function` call (name + kwargs + optional `_answer` / `_is_answer_final` flag) and the tool manager executes it. Source: [adalflow/adalflow/components/agent/agent.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/agent.py).

```mermaid
flowchart LR
    User[User Query] --> Runner
    Runner -->|step_history| Agent
    Agent --> Planner[Planner / Generator]
    Planner -->|Function JSON| Parser[JsonOutputParser]
    Parser -->|Function| Agent
    Agent -->|name + kwargs| TM[ToolManager]
    TM -->|Observation| Agent
    Agent -->|_is_answer_final?| Runner
    Runner -->|FinalAnswer| User
```

`create_default_planner` wires up the planner with a `JsonOutputParser` whose `data_class` is `Function`. When `is_thinking_model=True`, the parser skips the `thought` field and emits only `["name", "kwargs", "_is_answer_final", "_answer"]`; otherwise `thought` is also retained. Source: [adalflow/adalflow/components/agent/agent.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/agent.py).
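
The field-selection rule described above can be sketched with the standard library alone. `PlannedCall`, `include_fields`, and `parse_planner_output` are hypothetical, simplified stand-ins; the real `Function` dataclass and `JsonOutputParser` are likely to differ in detail.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

BASE_FIELDS = ["name", "kwargs", "_is_answer_final", "_answer"]


def include_fields(is_thinking_model: bool) -> List[str]:
    # Thinking models reason internally, so the explicit `thought` field is dropped.
    return BASE_FIELDS if is_thinking_model else ["thought"] + BASE_FIELDS


@dataclass
class PlannedCall:
    """Hypothetical, simplified stand-in for the planner's structured output."""

    name: str = ""
    kwargs: Dict[str, Any] = field(default_factory=dict)
    thought: Optional[str] = None
    is_answer_final: bool = False
    answer: Any = None


def parse_planner_output(raw: str, is_thinking_model: bool) -> PlannedCall:
    data = json.loads(raw)
    # Keep only the fields that belong to the active output schema.
    allowed = set(include_fields(is_thinking_model))
    data = {k: v for k, v in data.items() if k in allowed}
    return PlannedCall(
        name=data.get("name", ""),
        kwargs=data.get("kwargs") or {},
        thought=data.get("thought"),
        is_answer_final=bool(data.get("_is_answer_final", False)),
        answer=data.get("_answer"),
    )


raw = '{"thought": "add first", "name": "add", "kwargs": {"a": 1, "b": 2}, "_is_answer_final": false}'
print(parse_planner_output(raw, is_thinking_model=False))
print(parse_planner_output(raw, is_thinking_model=True))
```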

The prompt template is defined in [adalflow/adalflow/components/agent/prompts.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/prompts.py), which exports `default_role_desc` ("You are an excellent task planner.") and `adalflow_agent_task_desc`. The task description enforces two contracts: every intermediate step must call a tool from `<START_OF_TOOLS>...<END_OF_TOOLS>`, and termination is signalled by setting `_is_answer_final=True` together with a typed `_answer`.

## Runner and Execution Flow

`Runner` is a `Component` that wraps one or more agents and exposes both synchronous and streaming entry points. It imports tracing primitives (`runner_span`, `tool_span`, `step_span`, `response_span`) and event types such as `RunItemStreamEvent`, `ToolCallRunItem`, `FinalOutputItem`, and `RunnerStreamingResult`. Source: [adalflow/adalflow/components/agent/runner.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/runner.py).

A typical call drives the loop: feed the user query, capture the `Function` emitted by the planner, route it through the `ToolManager`, append the observation to `step_history`, and either continue or terminate. A `PermissionManager` mediates tool access, and `ConversationMemory` retains chat history between turns. Source: [adalflow/adalflow/components/agent/runner.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/runner.py).
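
The loop described above can be sketched as follows. This is an illustration under stated assumptions, not the `Runner` implementation: `TOOLS` stands in for a `ToolManager`, `scripted_planner` replaces the LLM planner with a fixed script, and permission checks, memory, and streaming are omitted.

```python
from typing import Any, Callable, Dict, List

# Hypothetical tool registry standing in for a ToolManager.
TOOLS: Dict[str, Callable[..., Any]] = {
    "add": lambda a, b: a + b,
    "multiply": lambda a, b: a * b,
}


def scripted_planner(query: str, step_history: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Stub planner: a real one would prompt an LLM with the query and history."""
    if len(step_history) == 0:
        return {"name": "add", "kwargs": {"a": 2, "b": 3}, "_is_answer_final": False}
    if len(step_history) == 1:
        prev = step_history[-1]["observation"]
        return {"name": "multiply", "kwargs": {"a": prev, "b": 4}, "_is_answer_final": False}
    return {"_is_answer_final": True, "_answer": step_history[-1]["observation"]}


def run(query: str, max_steps: int = 5) -> Any:
    step_history: List[Dict[str, Any]] = []
    for _ in range(max_steps):
        call = scripted_planner(query, step_history)
        if call.get("_is_answer_final"):
            # Termination: the planner marked its answer as final.
            return call.get("_answer")
        # Route the planned call to the matching tool and record the observation.
        observation = TOOLS[call["name"]](**call["kwargs"])
        step_history.append({"call": call, "observation": observation})
    raise RuntimeError("max_steps reached without a final answer")


print(run("What is (2 + 3) * 4?"))
```

The `max_steps` bound is a design choice worth keeping in any real loop, since a planner that never sets the final flag would otherwise run indefinitely.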

The legacy `ReActAgent` lives in [adalflow/adalflow/components/agent/react.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/react.py) and follows the same loop but exposes a coarser `include_fields = ["name", "kwargs", "thought"]` contract, which is useful as a reference for users migrating custom planners.

## Model Integration

AdalFlow is intentionally model-agnostic: every `Generator` accepts any `ModelClient` (OpenAI, Anthropic, Google, AWS Bedrock, Cohere, etc.) plus a `ModelType` enum that distinguishes chat LLMs from reasoning models. Source: [README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/README.md) and the documentation link "Supported Models" referenced in [adalflow/README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/README.md).

The agent layer inherits this flexibility through three parameters: `model_client`, `model_kwargs`, and `model_type`. Reasoning/thinking models (e.g., `o3-mini`) are flagged via `is_thinking_model=True`, which strips `thought` from the output schema and avoids redundant chain-of-thought fields. Source: [adalflow/adalflow/components/agent/agent.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/agent.py).

Caching is plumbed through `cache_path` and `use_cache` on the planner `Generator`, and prompt-level optimization is opt-in via the `Parameter(role_desc=..., requires_opt=True)` wrapper around `task_desc`, which makes the agent prompts first-class targets of textual gradient descent optimizers such as `TGD` or `TSGD-M`. Source: [adalflow/adalflow/optim/README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/optim/README.md).

## Known Issues and Limitations

Several open community issues map directly to this subsystem and are worth noting:

- **OpenAI Responses API incompatibility (#481)** — The quickstart Colab fails when the teacher generator targets `o3-mini`, because `Responses.create()` rejects `frequency_penalty`. Users on reasoning models must filter unsupported kwargs.
- **Function dataclass `args=None` (#479)** — When an LLM emits only kwargs, the deserialised `Function.args` is `None`, and downstream `*func.args` unpacking raises `TypeError`. Treat `args` as `Optional` everywhere it is consumed.
- **AWS Bedrock coverage (#201, #283)** — Bedrock model and embedding support is incomplete; maintainers note that credential setup and `modelId` mapping need polish.
- **MCP integration request (#386)** — Users have requested first-class Model Context Protocol support to expose MCP-server tools directly to the agent.
- **Timeouts (#474)** — Generic `TimeoutError: Request failed` errors are reported on macOS, typically downstream of provider-side latency rather than the agent loop.
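
The `args=None` hazard from #479 can be avoided by normalizing missing arguments before unpacking. The sketch below uses a hypothetical `ToolCall` dataclass and `execute` helper rather than AdalFlow's own `Function` type.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Sequence


@dataclass
class ToolCall:
    """Hypothetical stand-in for a deserialized tool call; args may be None."""

    name: str
    args: Optional[Sequence[Any]] = None
    kwargs: Optional[Dict[str, Any]] = None


def execute(call: ToolCall, fn: Callable[..., Any]) -> Any:
    # Normalize missing positional/keyword arguments before unpacking,
    # so `*None` never reaches the call site.
    args = call.args or ()
    kwargs = call.kwargs or {}
    return fn(*args, **kwargs)


def add(a: int, b: int) -> int:
    return a + b


# The LLM supplied only kwargs; args stays None and is treated as empty.
print(execute(ToolCall(name="add", kwargs={"a": 1, "b": 2}), add))
```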

## See Also

- [Core Components: Generator & ModelClient](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/core/generator.py)
- [Tool Manager](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/core/tool_manager.py)
- [Optimizers (TGD, TSGD-M, BootstrapFewShot)](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/optim/README.md)
- [Releases](https://github.com/SylphAI-Inc/AdalFlow/releases)

---

<a id='page-3'></a>

## Auto-Optimization & Training

### Related Pages

Related topics: [Overview & Core Architecture](#page-1), [Retrieval, Tracing & Evaluation](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [adalflow/adalflow/optim/README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/optim/README.md)
- [adalflow/adalflow/optim/parameter.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/optim/parameter.py)
- [adalflow/adalflow/optim/optimizer.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/optim/optimizer.py)
- [adalflow/adalflow/optim/gradient.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/optim/gradient.py)
- [adalflow/adalflow/optim/text_grad/tgd_optimizer.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/optim/text_grad/tgd_optimizer.py)
- [adalflow/adalflow/optim/text_grad/ops.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/optim/text_grad/ops.py)
- [README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/README.md)
- [adalflow/README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/README.md)
- [adalflow/adalflow/components/agent/agent.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/agent.py)
- [adalflow/adalflow/components/agent/react.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/react.py)
- [adalflow/adalflow/components/agent/prompts.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/prompts.py)
- [adalflow/adalflow/components/agent/README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/README.md)
</details>

## Overview and Purpose

Auto-optimization is a defining capability of AdalFlow: rather than hand-tuning prompts and demonstrations, developers declare task components and let the framework improve them automatically using textual gradients, few-shot bootstrapping, and instruction history. The repository describes itself as a PyTorch-like library to "build and auto-optimize any LM workflows, from Chatbots, RAG, to Agents," where the training loop mirrors PyTorch's `forward`/`backward` pattern but operates on text rather than tensors [Source: [README.md:1-50](https://github.com/SylphAI-Inc/AdalFlow/blob/main/README.md)].

The auto-optimization subsystem is rooted in three foundational abstractions:

- **Parameter** — trainable text variables (prompts, instructions, few-shot demos) wrapped for backpropagation.
- **Generator / AdalComponent** — task pipeline layers that expose `forward` and `backward` semantics.
- **Trainer** — orchestrator that drives validation, loss evaluation, and optimizer steps.

The optim module explicitly acknowledges its inspiration: PyTorch for design patterns, Micrograd for the auto-differentiative architecture, TextGrad for textual gradient descent, DSPy for bootstrap few-shot optimization, and OPRO for past-instruction history [Source: [README.md:300-360](https://github.com/SylphAI-Inc/AdalFlow/blob/main/README.md)].

## Core Abstractions: Parameter and GradComponent

The `Parameter` class encapsulates any optimizable text element (system prompts, role descriptions, or demonstrations) and tags it with metadata indicating whether it should be optimized. Within the ReAct agent, for example, the task description is wrapped as a `Parameter` with `requires_opt=True` so that the optimizer can mutate it during training [Source: [adalflow/adalflow/components/agent/react.py:80-110](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/react.py)].
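
As a rough illustration of how a `requires_opt` flag separates trainable text from fixed text, consider the sketch below. `TextParam`, `ParamType`, and `trainable` are hypothetical, simplified names and do not reproduce the real `Parameter` class.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class ParamType(Enum):
    PROMPT = "prompt"
    DEMOS = "demos"


@dataclass
class TextParam:
    """Hypothetical, simplified stand-in for an optimizable text parameter."""

    name: str
    data: str
    param_type: ParamType
    requires_opt: bool = True


def trainable(params: List[TextParam]) -> List[TextParam]:
    # Only parameters flagged for optimization are handed to the optimizer.
    return [p for p in params if p.requires_opt]


params = [
    TextParam("task_desc", "Answer the question step by step.", ParamType.PROMPT),
    TextParam("output_format", "Return JSON.", ParamType.PROMPT, requires_opt=False),
    TextParam("few_shot", "", ParamType.DEMOS),
]
print([p.name for p in trainable(params)])
```

Freezing `output_format` in this example reflects a common choice: format instructions usually have to stay in sync with the parser, so only the task wording and demonstrations are left open to change.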

Any class inheriting from `GradComponent` is treated like a PyTorch layer: it has `forward` and `backward` methods and can participate in textual backpropagation. The optim module README clarifies the conceptual mapping: "In LLM applications, component will be used (1) layers to optimize with the usage of its parameters (2) used for its serialization and visualization ability" [Source: [adalflow/adalflow/optim/README.md:40-55](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/optim/README.md)]. The optimizers themselves are also `GradComponent` subclasses, meaning the entire training loop is composable and inspectable.

## Optimizer Variants

AdalFlow ships multiple optimizer strategies, each addressing a different optimization regime. The optim README enumerates the supported family:

| Optimizer | Strategy | Use Case |
|-----------|----------|----------|
| **BootstrapFewShot** | Generates and validates demonstrations from successful training traces | Few-shot in-context learning when labeled data exists |
| **TGD (Textual Gradient Descent)** | Asks an LLM to propose prompt edits conditioned on a textual gradient | Prompt tuning without demonstrations |
| **TSGD-M (TGD with Momentum)** | Samples and evaluates K prompts from historical cache, then generates new prompts from the best historical prompt + gradients | Stable optimization over long training runs |

The loss function follows a textual-diff paradigm: rather than computing a numeric distance, AdalFlow collects feedback from a batched `run`, then asks an LLM to either (1) act as the loss function or (2) convert raw accuracy into text feedback that can be backpropagated [Source: [adalflow/adalflow/optim/README.md:20-38](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/optim/README.md)]. This makes the framework's notion of "differentiable" fundamentally textual.
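
Option (2), turning raw accuracy into text feedback, can be approximated without an LLM. The helper below is a hypothetical sketch (`textual_feedback` is not an AdalFlow function) that summarizes a batch as sentences an optimizer prompt could consume; in practice an LLM could write richer critiques.

```python
from typing import List, Tuple


def textual_feedback(examples: List[Tuple[str, str, str]]) -> str:
    """Turn (question, prediction, expected) triples into feedback text."""
    if not examples:
        raise ValueError("examples must not be empty")
    failures = [(q, p, e) for q, p, e in examples if p.strip() != e.strip()]
    accuracy = 1 - len(failures) / len(examples)
    lines = [f"Accuracy: {accuracy:.2f} on {len(examples)} examples."]
    for q, p, e in failures:
        # One sentence per failure gives the optimizer something concrete to act on.
        lines.append(f"For '{q}' the pipeline answered '{p}' but '{e}' was expected.")
    return "\n".join(lines)


batch = [
    ("2+2?", "4", "4"),
    ("Capital of France?", "Lyon", "Paris"),
]
print(textual_feedback(batch))
```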

## Training Workflow

A typical AdalFlow training session follows the forward-backward-step pattern familiar from PyTorch Lightning:

```mermaid
flowchart LR
    A[Task Pipeline<br/>AdalComponent] --> B[forward<br/>batch_run]
    B --> C[Loss / Eval<br/>LLM-as-judge]
    C --> D[backward<br/>textual gradients]
    D --> E[Optimizer step<br/>TGD / TSGD-M / BootstrapFewShot]
    E --> F[Parameter updated<br/>prompt / demos / history]
    F --> A
```

During the **forward pass**, the pipeline executes `Generator` calls against the dataset. During the **backward pass**, gradients are textual strings describing why the output failed to match expectations. The optimizer then mutates the relevant `Parameter` objects. This loop is wrapped by `Trainer`, which AdalFlow borrows conceptually from PyTorch Lightning [Source: [README.md:310-320](https://github.com/SylphAI-Inc/AdalFlow/blob/main/README.md)].
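
The forward, backward, and step phases can be sketched as a small loop. Everything here is an assumption made for illustration: `toy_task` and `toy_propose` are deterministic stubs standing in for LLM calls, and `train` is a hypothetical function, not the `Trainer` API.

```python
from typing import Callable, List, Tuple

Dataset = List[Tuple[str, str]]


def evaluate(prompt: str, task: Callable[[str, str], str], data: Dataset) -> float:
    # Forward pass: run the task on every example and score exact matches.
    correct = sum(task(prompt, x) == y for x, y in data)
    return correct / len(data)


def train(
    prompt: str,
    task: Callable[[str, str], str],
    propose: Callable[[str, str], str],
    data: Dataset,
    steps: int = 3,
) -> Tuple[str, float]:
    best_prompt, best_score = prompt, evaluate(prompt, task, data)
    for _ in range(steps):
        # Backward pass: describe the failure in text (the "gradient").
        gradient = f"Score was {best_score:.2f}; outputs did not match expected labels."
        # Optimizer step: propose a new prompt conditioned on the gradient.
        candidate = propose(best_prompt, gradient)
        score = evaluate(candidate, task, data)
        if score > best_score:  # keep a candidate only if the score improves
            best_prompt, best_score = candidate, score
    return best_prompt, best_score


def toy_task(prompt: str, x: str) -> str:
    # Stub "pipeline": obeys the prompt only if it mentions uppercase.
    return x.upper() if "uppercase" in prompt else x


def toy_propose(prompt: str, gradient: str) -> str:
    # Stub "optimizer LLM": always suggests the same edit.
    return prompt + " Reply in uppercase."


data = [("yes", "YES"), ("no", "NO")]
print(train("Answer the question.", toy_task, toy_propose, data))
```

Accepting a candidate only when the score improves keeps the loop from drifting to worse prompts, at the cost of possibly stalling in a local optimum.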

Community-reported issues illustrate both the power and fragility of this loop. Issue #382 documents a `ValueError` traceback originating inside `train()` on the Question Answering tutorial notebook, showing that optimizer-callable exceptions are surfaced directly to users [Source: [community issue #382](https://github.com/SylphAI-Inc/AdalFlow/issues/382)]. Issue #479 describes a related deserialization hazard: when an LLM emits only keyword arguments for a tool call, the `Function.args` field is `None`, and any downstream `*func.args` unpacking raises a `TypeError` — a reminder that training-time validation should tolerate partial tool-call payloads [Source: [community issue #479](https://github.com/SylphAI-Inc/AdalFlow/issues/479)]. Issue #481 reports that the quickstart Colab's `o3-mini` teacher generator fails because the OpenAI Responses API rejects `frequency_penalty`; this is propagated through the optimizer because the loss function is itself an LLM call [Source: [community issue #481](https://github.com/SylphAI-Inc/AdalFlow/issues/481)].

## Integration with Agents and Components

Auto-optimization extends to agents, not just single-call pipelines. The `Agent` class decomposes into a `Planner` (a `Generator`) and a `ToolManager`, with the planner's task description registered as a `Parameter` ready for optimization [Source: [adalflow/adalflow/components/agent/agent.py:30-90](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/agent.py)]. The ReAct variant similarly registers `react_agent_task_desc` as an optimizable `Parameter`, enabling automatic prompt tuning of multi-step reasoning agents [Source: [adalflow/adalflow/components/agent/react.py:95-130](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/react.py)]. This design treats agents as optimizable task pipelines rather than monolithic black boxes, aligning with the README's vision of unifying chatbots, RAG, and agents under one auto-differentiative framework [Source: [adalflow/README.md:60-110](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/README.md)].

## Common Failure Modes

When working with auto-optimization, expect these recurring failure categories:

- **Provider incompatibilities**: The Responses API rejects parameters such as `frequency_penalty` that older endpoints accepted. Always pin `model_kwargs` to provider-supported fields [Source: [community issue #481](https://github.com/SylphAI-Inc/AdalFlow/issues/481)].
- **Network timeouts**: Long training runs occasionally surface `TimeoutError: Request failed` when the loss-LLM endpoint stalls [Source: [community issue #474](https://github.com/SylphAI-Inc/AdalFlow/issues/474)].
- **Tool-call deserialization**: Missing positional `args` (`None`) break unpacking; validate that `Function.args` is non-`None` before spreading [Source: [community issue #479](https://github.com/SylphAI-Inc/AdalFlow/issues/479)].
- **Tutorial drift**: Notebooks can fall behind the released API; the latest tutorial revisions ship in the `docs/source/tutorials/` directory of the matching release tag.
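
For the timeout category, one common mitigation is to retry with exponential backoff. The sketch below is generic Python rather than an AdalFlow feature; `call_with_retry` and the `flaky` stub are hypothetical, and the delays are kept tiny so the example runs quickly.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(fn: Callable[[], T], max_attempts: int = 3, base_delay: float = 0.01) -> T:
    """Retry fn on TimeoutError with exponential backoff; re-raise after the last attempt."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TimeoutError:
            if attempt == max_attempts:
                raise
            # Delay doubles after each failed attempt: base, 2*base, 4*base, ...
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")  # keeps type checkers satisfied


attempts = {"n": 0}


def flaky() -> str:
    # Hypothetical stand-in for a loss-LLM request that times out twice.
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TimeoutError("Request failed")
    return "ok"


print(call_with_retry(flaky))
```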

## See Also

- [Design Philosophy](https://adalflow.sylph.ai/tutorials/lightrag_design_philosophy.html)
- [Class Hierarchy](https://adalflow.sylph.ai/tutorials/class_hierarchy.html)
- [Generator API](https://adalflow.sylph.ai/apis/components/components.generator.html)
- [Trainer & Optimizers Use Case](https://adalflow.sylph.ai/use_cases/question_answering.html)
- [Agent Component](https://adalflow.sylph.ai/apis/components/components.agent.html)
- [CHANGELOG.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/CHANGELOG.md)

---

<a id='page-4'></a>

## Retrieval, Tracing & Evaluation

### Related Pages

Related topics: [Overview & Core Architecture](#page-1), [Agent, Runner & Model Integration](#page-2), [Auto-Optimization & Training](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [adalflow/adalflow/components/retriever/bm25_retriever.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/retriever/bm25_retriever.py)
- [adalflow/adalflow/components/retriever/faiss_retriever.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/retriever/faiss_retriever.py)
- [adalflow/adalflow/components/retriever/lancedb_retriver.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/retriever/lancedb_retriver.py)
- [adalflow/adalflow/components/retriever/postgres_retriever.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/retriever/postgres_retriever.py)
- [adalflow/adalflow/components/retriever/qdrant_retriever.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/retriever/qdrant_retriever.py)
- [adalflow/adalflow/components/retriever/llm_retriever.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/retriever/llm_retriever.py)
- [README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/README.md)
- [adalflow/README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/README.md)
</details>

AdalFlow provides a unified retrieval layer that plugs into its PyTorch-like task pipeline. Retrieval is the bridge between raw document corpora and LLM prompts: a `Retriever` consumes a query, returns ranked documents, and the surrounding pipeline attaches them to the prompt template before the `Generator` invokes a model client. Tracing and evaluation sit on top of retrieval so that developers can audit which passages the model saw and score pipeline accuracy end-to-end.

## Purpose and Scope

The retrieval subsystem is part of the broader task pipeline described in the [README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/README.md): *"Light, Modular, and Model-Agnostic Task Pipeline"*. Each retriever is implemented as a subclass of `Component`, which means it inherits serialization, tracing, and parameter-registration for free. This makes a retriever pluggable into the `Trainer`/`AdalComponent` optimization loop in the same way as a `Generator`.

The supported retrievers documented at [Supported Retrievers](https://adalflow.sylph.ai/apis/components/components.retriever.html) include BM25 (sparse lexical), FAISS (in-memory dense), LanceDB (columnar vector store), PostgreSQL (relational + `pgvector`), Qdrant (managed vector DB), and `LLMRetriever` (LLM-mediated retrieval). All of them are surfaced through a common `retrieve(query, top_k)` interface so that swapping backends is a one-line change in pipeline configuration.
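
The value of a shared interface can be shown with a structural type. The sketch below mirrors the `retrieve(query, top_k)` shape named above, but `SupportsRetrieve`, `KeywordRetriever`, and `build_context` are hypothetical and the exact method signature in AdalFlow may differ.

```python
from typing import List, Protocol


class SupportsRetrieve(Protocol):
    def retrieve(self, query: str, top_k: int) -> List[str]: ...


class KeywordRetriever:
    """Toy backend: ranks documents by the number of shared lowercase tokens."""

    def __init__(self, documents: List[str]) -> None:
        self.documents = documents

    def retrieve(self, query: str, top_k: int) -> List[str]:
        terms = set(query.lower().split())
        scored = [(len(terms & set(d.lower().split())), d) for d in self.documents]
        # sorted() is stable, so ties keep their original document order.
        ranked = sorted(scored, key=lambda s: s[0], reverse=True)
        return [d for score, d in ranked[:top_k] if score > 0]


def build_context(retriever: SupportsRetrieve, query: str, top_k: int = 2) -> str:
    # The pipeline depends only on the protocol, so backends are interchangeable.
    return "\n".join(retriever.retrieve(query, top_k))


docs = ["Paris is the capital of France", "Berlin is in Germany", "France borders Spain"]
print(build_context(KeywordRetriever(docs), "capital of France"))
```

Because `build_context` is typed against the protocol, any other class with a matching `retrieve` method could replace `KeywordRetriever` without touching the pipeline code.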

## Retriever Architecture

Every retriever in AdalFlow follows the same `Component` contract: a `forward`/`__call__` method, optional `Parameter` slots for tuning, and a deterministic `serialize` output. Internally each implementation differs in two dimensions — index type (sparse vs. dense) and storage locality (in-process vs. server).

```mermaid
flowchart LR
    Q[User Query] --> R{Retriever}
    R --> B[BM25Retriever]
    R --> F[FAISSRetriever]
    R --> L[LanceDBRetriever]
    R --> P[PostgresRetriever]
    R --> Qd[QdrantRetriever]
    R --> LR[LLMRetriever]
    B --> D[Ranked Documents]
    F --> D
    L --> D
    P --> D
    Qd --> D
    LR --> D
    D --> G[Generator Prompt]
    G --> M[Model Client]
    M --> O[Parsed Output]
    O --> T[Trainer / Eval]
```

The diagram above mirrors the high-level pipeline shown in [README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/README.md) (`AdalFlow_task_pipeline.png`). The retriever step is the only place where ground-truth external knowledge is injected, which is why tracing is tightly coupled to it: a faulty retriever silently degrades accuracy even when the prompt and model are optimal.

## Per-Backend Behavior

| Backend | File | Index Strategy | Typical Use |
|---|---|---|---|
| BM25 | `bm25_retriever.py` | Sparse lexical ranking | Small corpora, no embedding cost |
| FAISS | `faiss_retriever.py` | In-memory dense vectors | Latency-sensitive prototyping |
| LanceDB | `lancedb_retriver.py` | Disk-backed columnar vectors | Large corpora with cheap persistence |
| PostgreSQL | `postgres_retriever.py` | Relational + optional `pgvector` | Existing SQL data, transactional RAG |
| Qdrant | `qdrant_retriever.py` | Managed vector service | Production multi-tenant deployments |
| LLM | `llm_retriever.py` | LLM-mediated retrieval | Hybrid/agentic pipelines |
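
To show what "sparse lexical ranking" means for the BM25 row, here is a compact sketch of the Okapi BM25 scoring formula using whitespace tokenization. It is a teaching example, not the code in `bm25_retriever.py`; `bm25_scores` is a hypothetical name, and `k1=1.5`, `b=0.75` are conventional defaults rather than AdalFlow's settings.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score each document against the query with the Okapi BM25 formula."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency weight.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Length normalization penalizes long documents.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = ["the cat sat on the mat", "dogs chase cats", "the stock market fell"]
scores = bm25_scores("cat mat", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

Note that "cats" in the second document does not match the query term "cat": purely lexical ranking has no notion of stemming or synonyms unless the tokenizer adds it, which is the main trade-off against dense retrievers.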

The `LLMRetriever` is distinct from the others — it delegates the ranking decision to an LLM, which makes it suitable for the ReAct-style agent loop documented in [adalflow/adalflow/components/agent/agent.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/agent/agent.py). In that context the retriever becomes one of several `tools` registered with the `ToolManager`, and the planner invokes it conditionally based on the current step.

## Tracing and Evaluation

AdalFlow treats tracing as a first-class concern of the `Component` base class. Every `Retriever.__call__` records the query, the top-k documents returned, latency, and any embedding/model calls made. When wrapped in an `AdalComponent` and passed to a `Trainer`, these traces feed both the textual-gradient optimizers (e.g., `TextualGradientDescent` with momentum, described in [adalflow/adalflow/optim/README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/optim/README.md)) and the evaluation loop.
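
The kind of record described above can be sketched with a plain wrapper. This is an illustration only, not AdalFlow's tracing implementation: `RetrievalTrace`, `traced`, and `first_two` are hypothetical names, and only the query, the returned indices, and the latency are captured.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RetrievalTrace:
    """Hypothetical trace record for one retriever call."""
    query: str
    doc_indices: List[int]
    latency_s: float


def traced(
    retrieve: Callable[[str], List[int]], log: List[RetrievalTrace]
) -> Callable[[str], List[int]]:
    """Wrap `retrieve` so each call appends a RetrievalTrace to `log`."""

    def wrapper(query: str) -> List[int]:
        start = time.perf_counter()
        indices = retrieve(query)
        elapsed = time.perf_counter() - start
        log.append(RetrievalTrace(query, list(indices), elapsed))
        return indices

    return wrapper


def first_two(query: str) -> List[int]:
    """Stand-in retriever that always returns the first two document ids."""
    return [0, 1]


trace_log: List[RetrievalTrace] = []
retrieve = traced(first_two, trace_log)
retrieve("what is adalflow?")
```

Keeping the trace as structured data, rather than log text, is what makes it usable by both an optimizer and an evaluation loop.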

Evaluation in AdalFlow is split between:

1. **Per-step loss / feedback** — implemented via `GradComponent`, which exposes `forward` and `backward` so the same retriever output can be scored and back-propagated as text gradients.
2. **End-to-end metric scoring** — accuracy, exact-match, or LLM-as-judge, executed after `Trainer.fit` returns the optimized prompt + retriever configuration.
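
For the end-to-end side, an exact-match metric can be as small as the sketch below. The function names and the normalization rule (lowercasing and collapsing whitespace) are assumptions chosen for illustration and are not taken from AdalFlow's evaluators.

```python
from typing import List


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def exact_match_accuracy(predictions: List[str], references: List[str]) -> float:
    """Fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


accuracy = exact_match_accuracy(
    ["Paris", " berlin ", "Rome"],
    ["paris", "Berlin", "Madrid"],
)
```

A metric like this yields a single number per run, whereas per-step feedback attaches to individual component outputs.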

Because retrievers are also `Parameter`-aware, the optimizer can choose to tune the `top_k` value, the embedding model, or the system prompt that frames retrieved context — all without leaving the single `train()` entry point shown in the [adalflow/README.md](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/README.md) Quickstart.

## Common Failure Modes and Community-Reported Issues

Several issues reported by the community directly touch retrieval, tracing, or evaluation:

- **Notebook error on `train()` (#382, "Question Answer tutorial notebook error")**: a `ValueError` raised inside the training loop. Likely causes include a mismatched `output_format_str` or a retriever that returns `None` for missing documents. Verify that the retriever is constructed *before* `train()` is called and that `top_k` does not exceed the number of indexed documents. Source: [adalflow/adalflow/components/retriever/bm25_retriever.py](https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/components/retriever/bm25_retriever.py).
- **`frequency_penalty` error on Responses API (#481)**: when the teacher model is the OpenAI `o3-mini` Responses endpoint, the optimizer traces fail because that endpoint does not accept the parameter. Remove or remap the parameter in `model_kwargs` before invoking the optimizer.
- **`Function.args` unpack bug (#479)**: when an agent's tool call is reconstructed from retriever-traced JSON, `Function.args` may be `None` if only kwargs were returned, which makes `*func.args` fail. Guard every unpack site, for example with `*(func.args or [])`.
- **Caching optimization (#389)**: a community request to cache retrieval results across prompt variants. Today this must be implemented manually by memoizing the retriever; a future optimizer could let `top_k` and the cache key become tunable `Parameter`s.
- **AWS Bedrock integration (#201, #283)**: these issues ask for a first-class Bedrock model client; without one, pipelines that use Bedrock models cannot trace correctly through the evaluator. Check the Supported Model Clients reference for your installed version, and fall back to OpenAI- or Groq-compatible endpoints if no Bedrock client is available.
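
Three of the workarounds above can be sketched in plain Python. The helpers are hypothetical and independent of AdalFlow's classes; in particular, `call_tool` takes raw `args` and `kwargs` values rather than a `Function` object, and `slow_retrieve` is a stand-in for a real retriever.

```python
from typing import Any, Callable, Dict, List, Optional, Sequence, Tuple


def drop_unsupported(model_kwargs: Dict[str, Any], unsupported: Sequence[str]) -> Dict[str, Any]:
    """Return a copy of `model_kwargs` without keys the endpoint rejects."""
    return {k: v for k, v in model_kwargs.items() if k not in unsupported}


def call_tool(
    fn: Callable[..., Any],
    args: Optional[Sequence[Any]],
    kwargs: Optional[Dict[str, Any]],
) -> Any:
    """Invoke `fn`, treating a missing args or kwargs value as empty."""
    return fn(*(args or ()), **(kwargs or {}))


def memoize_retriever(
    retrieve: Callable[[str, int], List[int]]
) -> Callable[[str, int], List[int]]:
    """Cache results keyed on (query, top_k) so prompt variants reuse them."""
    cache: Dict[Tuple[str, int], List[int]] = {}

    def wrapper(query: str, top_k: int) -> List[int]:
        key = (query, top_k)
        if key not in cache:
            cache[key] = retrieve(query, top_k)
        return list(cache[key])  # copy so callers cannot mutate the cache

    return wrapper


calls: List[str] = []


def slow_retrieve(query: str, top_k: int) -> List[int]:
    """Stand-in retriever that records each real invocation."""
    calls.append(query)
    return list(range(top_k))


cached = memoize_retriever(slow_retrieve)
first = cached("q", 3)
second = cached("q", 3)  # served from the cache, no second retrieval
cleaned = drop_unsupported({"model": "o3-mini", "frequency_penalty": 0.0}, ["frequency_penalty"])
total = call_tool(lambda a=1, b=2: a + b, None, {"b": 5})
```

Keying the cache on both the query and `top_k` matters because the same query with a different `top_k` is a different retrieval.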

## Usage Pattern

A minimal retrieval-augmented pipeline looks like:

```python
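from adalflow.core import Generator
from adalflow.components.model_client import OpenAIClient
from adalflow.components.retriever import FAISSRetriever

# Assumed to exist already: `embedder`, `doc_vectors` (one embedding per text in
# `docs`), `qa_template` (with context/query/output_format_str slots), `parser`, `query`.
retriever = FAISSRetriever(embedder=embedder, top_k=5, documents=doc_vectors)
generator = Generator(
    model_client=OpenAIClient(),  # reads OPENAI_API_KEY
    model_kwargs={"model": "gpt-4o-mini"},  # placeholder model name
    template=qa_template,
    prompt_kwargs={"output_format_str": parser.format_instructions()},
    output_processors=parser,
)
# Retrieve first, then pass the retrieved text (not the retriever object) as context.
hits = retriever(input=query)[0]
context = "\n".join(docs[i] for i in hits.doc_indices)
answer = generator(prompt_kwargs={"context": context, "query": query})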
```

In `AdalComponent`-based training, the retriever is registered as a sub-component and the trainer traces every `retrieve()` call alongside the prompt that consumed it. This trace is what the textual-gradient optimizers operate on, so retriever misconfiguration is surfaced as a failing gradient rather than a silent accuracy drop.

## See Also

- [Quickstart tutorial (adalflow_quick_start.ipynb)](https://colab.research.google.com/drive/1_YnD4H4shzPRARvishoU4IA-qQuX9jHrT)
- [Design philosophy](https://adalflow.sylph.ai/tutorials/lightrag_design_philosophy.html)
- [Class hierarchy](https://adalflow.sylph.ai/tutorials/class_hierarchy.html)
- [Supported Retrievers API reference](https://adalflow.sylph.ai/apis/components/components.retriever.html)
- [Supported Model Clients](https://adalflow.sylph.ai/apis/components/components.model_client.html)

---


## Pitfall Log

Project: SylphAI-Inc/AdalFlow

Summary: Found 11 structured pitfall item(s), including 1 high/blocking item(s). Top priority: Security or permission risk - Security or permission risk requires verification.

## 1. Security or permission risk - Security or permission risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/SylphAI-Inc/AdalFlow/issues/489

## 2. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/SylphAI-Inc/AdalFlow/issues/481

## 3. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/SylphAI-Inc/AdalFlow/issues/474

## 4. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/SylphAI-Inc/AdalFlow/issues/479

## 5. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | https://github.com/SylphAI-Inc/AdalFlow

## 6. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/SylphAI-Inc/AdalFlow

## 7. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: No runnable demo was found (`no_demo`).
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | https://github.com/SylphAI-Inc/AdalFlow

## 8. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: No runnable demo was found (`no_demo`).
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | https://github.com/SylphAI-Inc/AdalFlow

## 9. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/SylphAI-Inc/AdalFlow/issues/475

## 10. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: `issue_or_pr_quality=unknown`.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/SylphAI-Inc/AdalFlow

## 11. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: `release_recency=unknown`.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/SylphAI-Inc/AdalFlow
