# https://github.com/microsoft/magentic-ui Project Documentation

Generated: 2026-05-18 02:19:23 UTC

## Table of Contents

- [Getting Started with Magentic-UI](#getting-started)
- [Key Concepts](#key-concepts)
- [High-Level Architecture](#high-level-architecture)
- [Agent System](#agent-system)
- [Team Orchestration](#team-orchestration)
- [Backend API](#backend-api)
- [Frontend UI](#frontend-ui)
- [Browser Automation](#browser-automation)
- [Docker Containers](#docker-containers)
- [Configuration](#configuration)

<a id='getting-started'></a>

## Getting Started with Magentic-UI

### Related Pages

Related topics: [Configuration](#configuration), [Docker Containers](#docker-containers)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/microsoft/magentic-ui/blob/main/README.md)
- [TROUBLESHOOTING.md](https://github.com/microsoft/magentic-ui/blob/main/TROUBLESHOOTING.md)
- [CONTRIBUTING.md](https://github.com/microsoft/magentic-ui/blob/main/CONTRIBUTING.md)
- [frontend/README.md](https://github.com/microsoft/magentic-ui/blob/main/frontend/README.md)
- [pyproject.toml](https://github.com/microsoft/magentic-ui/blob/main/pyproject.toml)
</details>


Magentic-UI is an open-source project from Microsoft: a web-based chat interface backed by a multi-agent system. Its agents can browse the web, execute plans, and work with uploaded files, while the user reviews and steers their work from the browser.

## Prerequisites

Before getting started with Magentic-UI, ensure your system meets the following requirements:

| Requirement | Version/Details |
|-------------|-----------------|
| Python | 3.10 or higher |
| Docker | Latest stable version |
| Node.js | For frontend development |
| pip | Latest version |

Source: [README.md:1-20](https://github.com/microsoft/magentic-ui/blob/main/README.md)

### System Dependencies

- **Docker**: Required for running agent containers that provide browser automation capabilities
- **Node.js**: Needed only if you plan to modify the frontend code

## Installation

### Standard Installation

The simplest way to install Magentic-UI is via pip:

```bash
pip install magentic-ui
```

Source: [README.md:25-30](https://github.com/microsoft/magentic-ui/blob/main/README.md)

### Installation with Fara-7B Support

To use the Fara-7B model locally, install with the fara extras:

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install magentic-ui[fara]
```

Source: [README.md:55-60](https://github.com/microsoft/magentic-ui/blob/main/README.md)

## Quick Start with Docker

After installation, make sure Docker is running, then start Magentic-UI:

```bash
magentic-ui --port 8081
```

> **Note**: Running this command for the first time will pull two Docker images required for the Magentic-UI agents. If you encounter problems, you can build them directly with:

```bash
cd docker
sh build-all.sh
```

Once the server is running, access the UI at [http://localhost:8081](http://localhost:8081).

Source: [README.md:30-45](https://github.com/microsoft/magentic-ui/blob/main/README.md)

## Local Development Setup

### Backend Development

For local backend development, clone the repository and set up the environment:

```bash
git clone https://github.com/microsoft/magentic-ui.git
cd magentic-ui
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
```

Run the development server:

```bash
magentic ui --port 8081
```

Source: [TROUBLESHOOTING.md:1-20](https://github.com/microsoft/magentic-ui/blob/main/TROUBLESHOOTING.md)

### Frontend Development

The frontend is located in the `frontend/` directory of the repository. To set up for development:

1. Navigate to the frontend directory:

```bash
cd frontend
```

2. Create environment configuration:

```bash
cp .env.default .env.development
```

3. Configure the API URL:

Edit `.env.development` and set:

```
GATSBY_API_URL=http://localhost:8081/api
```

Source: [frontend/README.md:1-15](https://github.com/microsoft/magentic-ui/blob/main/frontend/README.md)

### Connecting Frontend to Backend

The frontend makes requests to the backend API expecting responses at `http://localhost:8081/api`. Ensure `GATSBY_API_URL` is correctly set in your environment configuration.

Source: [frontend/README.md:10-12](https://github.com/microsoft/magentic-ui/blob/main/frontend/README.md)

## Using Fara-7B Locally

To run Magentic-UI with a local Fara-7B model:

### Step 1: Serve the Model

In a separate process, serve the Fara-7B model using vLLM:

```bash
vllm serve "microsoft/Fara-7B" --port 5000 --dtype auto 
```
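Before moving on, it can help to confirm that the model server is reachable. The following is a minimal sketch, not part of Magentic-UI: the helper name `openai_endpoint_is_up` is hypothetical, and it assumes the server started by `vllm serve` exposes an OpenAI-compatible `GET /v1/models` route on the port chosen above.

```python
import urllib.request


def openai_endpoint_is_up(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/models answers with HTTP 200.

    Hypothetical helper; assumes an OpenAI-compatible server.
    """
    url = base_url.rstrip("/") + "/models"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, timeouts, and refused connections
        # are all OSError subclasses.
        return False
```

Calling `openai_endpoint_is_up("http://localhost:5000/v1")` should return `True` once the model has finished loading; a `False` result usually means the server is still starting or is listening on a different port.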

### Step 2: Create Configuration

Create a `fara_config.yaml` file with the following content:

```yaml
model_config_local_surfer: &client_surfer
  provider: OpenAIChatCompletionClient
  config:
    model: "microsoft/Fara-7B"
    base_url: http://localhost:5000/v1
    api_key: not-needed
    model_info:
      vision: true
      function_calling: true
      json_output: false
      family: "unknown" 
      structured_output: false
      multiple_system_messages: false

orchestrator_client: *client_surfer
```

Source: [README.md:60-80](https://github.com/microsoft/magentic-ui/blob/main/README.md)

## Core Features

### Web Surfer Agent

The web surfer agent enables browsing and interacting with web pages. Available actions include:

| Action | Description |
|--------|-------------|
| `key` | Performs key presses (Enter, Alt, Shift, Tab, etc.) |
| `type` | Types text into input fields |
| `mouse_move` | Moves cursor to specified pixel coordinates |
| `left_click` | Clicks the left mouse button |
| `scroll` | Scrolls the mouse wheel |
| `visit_url` | Navigates to a specified URL |
| `web_search` | Performs a web search |
| `history_back` | Goes back in browser history |
| `pause_and_memorize_fact` | Stores information for future reference |
| `wait` | Waits for specified seconds |
| `terminate` | Ends the current task |

Source: [src/magentic_ui/agents/web_surfer/fara/_prompts.py:1-60](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/agents/web_surfer/fara/_prompts.py)

### Plans System

Magentic-UI supports creating and managing reusable plans:

- **Create Plans**: Build step-by-step task plans through the UI
- **Attach Plans**: Attach saved plans to queries for execution
- **Learn Plans**: Save successful conversation patterns as reusable plans
- **Import/Export**: Import existing plans or export your library

Source: [frontend/src/components/features/Plans/PlanCard.tsx:1-50](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/Plans/PlanCard.tsx)

### File Handling

The system supports:

- Arbitrary file uploads
- File preview and download
- Multiple file type support (images, documents, etc.)

Source: [frontend/src/components/common/filerenderer.tsx:1-80](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/common/filerenderer.tsx)

### MCP Server Integration

Magentic-UI supports Model Context Protocol (MCP) servers with multiple connection types:

| Connection Type | Description |
|-----------------|-------------|
| SSE | Server-Sent Events connection |
| Stdio | Standard input/output connection |
| JSON Config | JSON-based configuration file |

Source: [frontend/src/components/features/McpServersConfig/McpConfigModal.tsx:1-60](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/McpServersConfig/McpConfigModal.tsx)

## Project Structure

```
magentic-ui/
├── frontend/                    # React frontend application
│   ├── src/
│   │   ├── components/         # Reusable UI components
│   │   │   ├── features/       # Feature-specific components
│   │   │   ├── views/          # View components
│   │   │   └── common/         # Common utilities
│   │   └── pages/              # Page components
│   └── README.md               # Frontend development guide
├── src/
│   └── magentic_ui/            # Main Python package
│       └── agents/             # Agent implementations
├── docker/                      # Docker configuration
├── README.md                    # Main documentation
├── CONTRIBUTING.md              # Contribution guidelines
└── TROUBLESHOOTING.md           # Issue resolution guide
```

## Troubleshooting

### Common Issues

#### Port Already in Use

If port 8081 is occupied, either stop the existing service or use a different port:

```bash
magentic ui --port 8082
```
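To find out whether a port is actually taken before picking another one, a small probe like the following can be used. This is an illustrative sketch using only the Python standard library; the helper names `port_in_use` and `first_free_port` are hypothetical and not part of Magentic-UI.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1.0)
        # connect_ex returns 0 when the TCP connection succeeds.
        return probe.connect_ex((host, port)) == 0


def first_free_port(start: int = 8081, attempts: int = 20) -> int:
    """Return the first port in [start, start + attempts) with no listener."""
    for port in range(start, start + attempts):
        if not port_in_use(port):
            return port
    raise RuntimeError(f"no free port found in {start}..{start + attempts - 1}")
```

The value returned by `first_free_port()` can then be passed to `--port`.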

#### Virtual Environment Activation

If you installed Magentic-UI in a virtual environment, make sure that environment is active before running it:

```bash
deactivate
source .venv/bin/activate
magentic ui --port 8081
```

#### Wrong Package Installed

Ensure you installed `magentic-ui` (not the unrelated `magentic` package):

```bash
pip install magentic-ui
```
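To see which of the two distributions is present in the active environment, the standard library's `importlib.metadata` can be queried. This is a small diagnostic sketch; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Report on the package you want and the one that is easy to install by mistake.
for name in ("magentic-ui", "magentic"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

If only `magentic` is reported as installed, the wrong package was installed and `pip install magentic-ui` is still needed.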

### Getting Help

If issues persist:

1. Double-check all prerequisites in the README
2. Search [GitHub Issues](https://github.com/microsoft/magentic-ui/issues) for similar problems
3. Open a new issue with:
   - Detailed problem description
   - System information (OS, Docker version)
   - Steps to reproduce

Source: [TROUBLESHOOTING.md:1-50](https://github.com/microsoft/magentic-ui/blob/main/TROUBLESHOOTING.md)

## Contributing

We welcome community contributions:

1. **Find an Issue**: Browse [All Issues](https://github.com/microsoft/magentic-ui/issues) and look for `help-wanted` labeled items
2. **Fork and Clone**: Fork the repository and clone locally
3. **Create a Branch**: Use descriptive names (e.g., `fix/session-bug` or `feature/file-upload`)
4. **Write Code and Tests**: Include tests for new features
5. **Run Checks**: Before submitting a PR, run:

```bash
poe check
```

6. **Submit PR**: Open it against the `main` branch and reference the issue number

### Top "Help Wanted" Issues

| Issue | Description |
|-------|-------------|
| [#132](https://github.com/microsoft/magentic-ui/issues/132) | Allow MAGUI to understand video and audio |
| [#128](https://github.com/microsoft/magentic-ui/issues/128) | Enable arbitrary file upload in UI |
| [#126](https://github.com/microsoft/magentic-ui/issues/126) | Add streaming of final answer and coder messages |
| [#123](https://github.com/microsoft/magentic-ui/issues/123) | Add unit tests |
| [#124](https://github.com/microsoft/magentic-ui/issues/124) | Allow websurfer to scroll inside containers |

Source: [CONTRIBUTING.md:1-40](https://github.com/microsoft/magentic-ui/blob/main/CONTRIBUTING.md)

## Architecture Overview

```mermaid
graph TD
    A[User Interface] --> B[Frontend React App]
    B --> C[Backend API]
    C --> D[Magentic-UI Agents]
    D --> E[Web Surfer Agent]
    D --> F[Plans Executor]
    D --> G[MCP Servers]
    E --> H[Docker Containers]
    H --> I[Browser Automation]
    F --> J[Task Execution]
    G --> K[External Tools]
```

## Configuration Reference

### Environment Variables (Frontend)

| Variable | Description | Default |
|----------|-------------|---------|
| `GATSBY_API_URL` | Backend API endpoint | `http://localhost:8081/api` |

### Ports

| Port | Service |
|------|---------|
| 8081 | Main application (configurable) |
| 5000 | vLLM model server (Fara-7B) |

---

<a id='key-concepts'></a>

## Key Concepts

### Related Pages

Related topics: [High-Level Architecture](#high-level-architecture), [Agent System](#agent-system)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [src/magentic_ui/approval_guard.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/approval_guard.py)
- [src/magentic_ui/guarded_action.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/guarded_action.py)
- [src/magentic_ui/learning/learner.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/learning/learner.py)
- [frontend/src/components/views/chat/runview.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/runview.tsx)
- [frontend/src/components/features/Plans/PlanCard.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/Plans/PlanCard.tsx)
- [frontend/src/components/features/Plans/LearnPlanButton.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/Plans/LearnPlanButton.tsx)
- [frontend/src/components/views/chat/chatinput.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/chatinput.tsx)
- [frontend/src/hooks/store.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/hooks/store.tsx)
</details>


Magentic-UI is an interactive UI framework that enables AI agents to execute tasks with human oversight. This document explains the fundamental concepts that power the system's architecture, including approval workflows, guarded actions, plan management, and the learning system.

---

## 1. Approval Guard System

The Approval Guard System is a security mechanism that ensures AI agents require explicit human authorization before executing sensitive or irreversible actions. It acts as a gatekeeper between agent decision-making and action execution.

### 1.1 Purpose and Scope

The approval guard intercepts actions proposed by agents and pauses execution until a human user provides approval or denial. This allows users to:

- Review actions before they are executed
- Modify parameters before approval
- Deny dangerous or unintended operations
- Maintain control over agent behavior in automated workflows

### 1.2 Architecture Overview

```mermaid
graph TD
    A[Agent Decision] --> B{Approval Required?}
    B -->|Yes| C[Approval Guard]
    B -->|No| D[Execute Action]
    C --> E[Pause Execution]
    E --> F[User Notification]
    F --> G{User Decision}
    G -->|Approve| H[Execute Action]
    G -->|Deny| I[Cancel Action]
    G -->|Modify| J[Update Parameters] --> H
```

### 1.3 Run Status States

The system uses a status-based workflow to track the state of agent runs. The following table documents the primary run states:

| Status | Description |
|--------|-------------|
| `created` | Run has been initialized but not started |
| `active` | Run is currently executing |
| `awaiting_input` | Run is paused waiting for user input |
| `paused` | Run has been paused by user or system |
| `completed` | Run finished successfully |
| `failed` | Run encountered an error |

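Source: [frontend/src/components/views/chat/runview.tsx:1-50](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/runview.tsx)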
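The states in the table can be modeled as an enumeration. The sketch below is illustrative only: `RunStatus`, `is_terminal`, and `needs_user` are hypothetical names, and treating `completed` and `failed` as the only terminal states is an assumption rather than something the source confirms.

```python
from enum import Enum


class RunStatus(str, Enum):
    """Run states from the table above (illustrative, not the project's type)."""

    CREATED = "created"
    ACTIVE = "active"
    AWAITING_INPUT = "awaiting_input"
    PAUSED = "paused"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumption: only these two states end a run for good.
TERMINAL = {RunStatus.COMPLETED, RunStatus.FAILED}


def is_terminal(status: RunStatus) -> bool:
    """True when the run has finished and will not change state again."""
    return status in TERMINAL


def needs_user(status: RunStatus) -> bool:
    """True when the run is blocked until the user responds."""
    return status is RunStatus.AWAITING_INPUT
```

Because the enum derives from `str`, a status string received from the backend can be converted directly, for example `RunStatus("awaiting_input")`.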

### 1.4 Approval Guard Configuration

The approval guard supports different policy configurations:

| Policy | Behavior |
|--------|----------|
| `AutoApprove` | All actions are automatically approved |
| `RequireApproval` | All actions require explicit user approval |
| `HybridPolicy` | Some actions auto-approved, others require approval |
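
How the three policies differ can be sketched as a single decision function. This is an illustration under stated assumptions, not the code in `approval_guard.py`: the enum, the function `must_ask_user`, and the `action_flagged` marker are all hypothetical.

```python
from enum import Enum


class ApprovalPolicy(Enum):
    """Policy names taken from the table above."""

    AUTO_APPROVE = "AutoApprove"
    REQUIRE_APPROVAL = "RequireApproval"
    HYBRID = "HybridPolicy"


def must_ask_user(policy: ApprovalPolicy, action_flagged: bool) -> bool:
    """Decide whether execution should pause for the user.

    `action_flagged` is a hypothetical per-action marker meaning
    "this action is sensitive"; only the hybrid policy consults it.
    """
    if policy is ApprovalPolicy.AUTO_APPROVE:
        return False
    if policy is ApprovalPolicy.REQUIRE_APPROVAL:
        return True
    return action_flagged
```

Under this sketch the hybrid policy is the only one whose outcome depends on the action itself, which matches the "some actions auto-approved, others require approval" description.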

---

## 2. Guarded Actions

Guarded Actions represent the atomic operations that agents can perform. Each action has a type, parameters, and metadata about whether approval is required.

### 2.1 Action Structure

Each guarded action contains:

```python
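from datetime import datetime
from typing import Any, TypedDict


class GuardedActionRecord(TypedDict):
    """Illustrative shape only; the name is hypothetical, not the library's class."""

    action: str                 # Action type identifier
    parameters: dict[str, Any]  # Action-specific parameters
    requires_approval: bool     # Whether approval is needed
    timestamp: datetime         # When the action was created
    status: str                 # "pending", "approved", "denied", or "executed"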
```

### 2.2 Action Types

The system supports multiple action categories:

| Category | Examples | Description |
|----------|----------|-------------|
| Navigation | `visit_url`, `history_back` | Browser navigation actions |
| Input | `type`, `key`, `mouse_move` | User input simulation |
| Control | `scroll`, `left_click` | Mouse and scroll interactions |
| Search | `web_search` | Information retrieval |
| System | `wait`, `terminate`, `pause_and_memorize_fact` | System-level operations |

Source: [src/magentic_ui/guarded_action.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/guarded_action.py)

### 2.3 Action Execution Flow

```mermaid
sequenceDiagram
    participant Agent
    participant Guard as Approval Guard
    participant User
    participant Executor
    
    Agent->>Guard: Propose Action
    Guard->>Guard: Check Approval Requirement
    alt Requires Approval
        Guard->>User: Display Action for Review
        User->>Guard: Approve/Deny/Modify
        alt Denied
            Guard->>Agent: Action Cancelled
        else Approved
            Guard->>Executor: Execute Action
        end
    else No Approval Required
        Guard->>Executor: Execute Action
    end
    Executor-->>Agent: Action Result
```
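
The sequence above can be sketched in a few lines of Python. This is a simplified illustration, not the implementation in `guarded_action.py`: `Action`, `Reviewer`, `Executor`, and `run_guarded` are hypothetical names, and the user's decision is modeled as a callback that returns `None` to deny or a (possibly modified) action to approve.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Action:
    name: str
    parameters: dict[str, Any] = field(default_factory=dict)
    requires_approval: bool = False


# A reviewer returns None to deny, or the (possibly modified) action to approve.
Reviewer = Callable[[Action], Optional[Action]]
Executor = Callable[[Action], Any]


def run_guarded(action: Action, review: Reviewer, execute: Executor) -> Optional[Any]:
    """Execute `action`, pausing for review first when it requires approval."""
    if action.requires_approval:
        reviewed = review(action)
        if reviewed is None:
            return None  # denied: the action is cancelled
        action = reviewed  # approved, possibly with modified parameters
    return execute(action)


visit = Action("visit_url", {"url": "https://example.com"}, requires_approval=True)
print(run_guarded(visit, lambda a: a, lambda a: f"executed {a.name}"))
```

Returning `None` for a denied action keeps the sketch short; a real implementation would more likely report the cancellation back to the agent explicitly, as the diagram shows.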

---

## 3. Plans System

The Plans system enables users to create, save, and reuse task workflows. Plans consist of structured steps that guide agent behavior through complex multi-step tasks.

### 3.1 Plan Structure

A Plan consists of:

| Component | Type | Description |
|-----------|------|-------------|
| `id` | string | Unique identifier |
| `task` | string | High-level task description |
| `steps` | array | Ordered list of execution steps |
| `created_at` | datetime | Creation timestamp |
| `metadata` | object | Additional configuration |

Source: [frontend/src/components/features/Plans/PlanCard.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/Plans/PlanCard.tsx)
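
As a rough illustration of the structure in the table, a plan could be modeled and round-tripped through JSON as follows. The field names come from the table; everything else is an assumption made for the sketch, including the `Plan` class itself, representing each step as a plain string, and the JSON layout, which may not match the format the UI actually imports and exports.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass
class Plan:
    id: str
    task: str
    steps: list[str]  # assumption: each step is a plain description string
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    metadata: dict[str, Any] = field(default_factory=dict)

    def to_json(self) -> str:
        """Serialize the plan, storing the timestamp in ISO 8601 form."""
        data = asdict(self)
        data["created_at"] = self.created_at.isoformat()
        return json.dumps(data)

    @classmethod
    def from_json(cls, text: str) -> "Plan":
        """Rebuild a plan from the output of to_json."""
        data = json.loads(text)
        data["created_at"] = datetime.fromisoformat(data["created_at"])
        return cls(**data)
```

A plan created with `Plan("p1", "Book a flight", ["Open the airline site", "Search for flights"])` survives `Plan.from_json(plan.to_json())` unchanged.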

### 3.2 Plan Execution

When a plan is attached to a query:

1. The plan is retrieved from the plan library
2. Steps are displayed to the user for confirmation
3. The agent follows the plan's structured workflow
4. Progress is tracked through the progress bar component

```mermaid
graph LR
    A[Create/Load Plan] --> B[Attach to Query]
    B --> C[Display Plan Steps]
    C --> D{User Approval?}
    D -->|Yes| E[Execute Plan]
    D -->|No| F[Cancel]
    E --> G[Track Progress]
    G --> H[Complete Task]
```

### 3.3 Plan Management

Plans can be managed through the PlanCard component:

- **Creation**: Users can create new plans from scratch
- **Editing**: Existing plans can be modified through a modal interface
- **Deletion**: Plans can be removed from the library
- **Attachment**: Plans can be attached to queries via the chat input

Source: [frontend/src/components/features/Plans/LearnPlanButton.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/Plans/LearnPlanButton.tsx)

---

## 4. Learning System

The Learning System allows Magentic-UI to learn reusable plans from completed conversations. This enables knowledge transfer between sessions and automation of recurring tasks.

### 4.1 Learn Plan Workflow

```mermaid
graph TD
    A[Completed Conversation] --> B[Learn Plan Button]
    B --> C[Extract Task Structure]
    C --> D[Generate Plan Steps]
    D --> E[User Review]
    E --> F[Save to Library]
    F --> G[Available for Reuse]
```

### 4.2 Learning Process

| Step | Description |
|------|-------------|
| 1 | Conversation completes with final answer |
| 2 | User clicks "Learn Plan" button |
| 3 | System analyzes conversation history |
| 4 | Extracts reusable workflow patterns |
| 5 | Presents plan for user confirmation |
| 6 | Saves plan to user's plan library |

Source: [src/magentic_ui/learning/learner.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/learning/learner.py)

### 4.3 Learn Plan Button States

The LearnPlanButton component has multiple states:

| State | Visual Indicator | Behavior |
|-------|------------------|----------|
| Loading | Spinner + "Learning Plan..." | Analyzing conversation |
| Disabled | Opacity 50% | No session or user ID |
| Ready | Blue button + "Learn Plan" | Ready to learn |
| Dark Mode | Blue-400 text | Dark theme variant |

Source: [frontend/src/components/features/Plans/LearnPlanButton.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/Plans/LearnPlanButton.tsx)

---

## 5. Chat Interface Components

The chat interface serves as the primary interaction point between users and the agent system.

### 5.1 Chat Input Component

The ChatInput component handles user input and supports multiple input types:

| Feature | Description |
|---------|-------------|
| Text Input | Free-form text queries |
| File Attachment | Upload files via dropdown menu |
| Plan Attachment | Attach predefined plans |
| MCP Server Selection | Configure Model Context Protocol servers |
| Pause Control | Pause active runs |

Source: [frontend/src/components/views/chat/chatinput.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/chatinput.tsx)

### 5.2 Run View Component

The RunView component displays the current state of an agent run:

- Status indicators (icons for each run state)
- Approval buttons for pending actions
- Detail viewer for visual feedback
- Input response handling

```mermaid
graph TD
    A[User Input] --> B{ChatInput Component}
    B --> C{Run Status}
    C -->|awaiting_input| D[Show Input Request]
    C -->|paused| E[Show Pause Controls]
    C -->|active| F[Show Progress]
    D --> G[Approval Buttons]
    E --> H[Resume/Cancel]
    F --> I[Progress Bar]
```

Source: [frontend/src/components/views/chat/runview.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/runview.tsx)

### 5.3 Progress Bar

The progress bar visualizes task completion:

| Segment | Color | Description |
|---------|-------|-------------|
| Completed | Green (`#22c55e`) | Finished steps |
| Current | Magenta (`#861657`) | Active step |
| Remaining | Gray | Pending steps |

When a final answer is available, the progress bar displays at 100% with a "Task Completed" status indicator.

Source: [frontend/src/components/views/chat/progressbar.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/progressbar.tsx)

---

## 6. State Management

The frontend uses a Zustand-based store for centralized state management.

### 6.1 Store Structure

| State Category | Key Properties |
|----------------|----------------|
| Session | `session`, `sessions`, `connectionId` |
| Messages | `messages`, `setMessages` |
| Configuration | `version`, `setVersion` |
| Header | `title`, `breadcrumbs` |
| Agent Flow | `direction`, `showLabels`, `showGrid`, `showTokens` |

Source: [frontend/src/hooks/store.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/hooks/store.tsx)

### 6.2 Agent Flow Settings

Agent flow visualization settings:

| Setting | Default | Description |
|---------|---------|-------------|
| `direction` | `"TB"` | Flow layout direction (TB=Top-Bottom) |
| `showLabels` | `true` | Display node labels |
| `showGrid` | `true` | Show background grid |
| `showTokens` | `true` | Display token counts |
| `showMessages` | `true` | Show message nodes |
| `showMiniMap` | `false` | Enable minimap navigation |

---

## 7. MCP Server Integration

Magentic-UI supports Model Context Protocol (MCP) servers for extended agent capabilities.

### 7.1 Server Configuration Types

| Type | Use Case |
|------|----------|
| SSE | Server-Sent Events communication |
| Stdio | Standard I/O subprocess communication |
| JSON | Raw JSON configuration |

### 7.2 Server Management

Servers can be added, updated, and removed through the MCPConfigModal component. Each server configuration includes:

- Server name (alphanumeric, max 50 characters)
- Connection type selection
- Type-specific configuration parameters
- Validation for duplicates and required fields

Source: [frontend/src/components/features/McpServersConfig/McpConfigModal.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/McpServersConfig/McpConfigModal.tsx)
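
The server-name rules listed above can be expressed as a small validator. The real checks live in the TypeScript modal; the Python sketch below only mirrors the stated rules. The function name `server_name_error`, the error messages, the ASCII-only reading of "alphanumeric", and the case-sensitive duplicate check are all assumptions.

```python
from typing import Iterable, Optional

MAX_NAME_LENGTH = 50


def server_name_error(name: str, existing: Iterable[str]) -> Optional[str]:
    """Return a message describing why `name` is invalid, or None if it is fine."""
    if not name:
        return "Server name is required"
    if len(name) > MAX_NAME_LENGTH:
        return f"Server name must be at most {MAX_NAME_LENGTH} characters"
    if not (name.isascii() and name.isalnum()):
        return "Server name must be alphanumeric"
    if name in set(existing):
        return "Server name already exists"
    return None
```

Returning a message instead of raising makes it straightforward to show the problem next to the form field.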

---

## 8. File Handling

The system supports file uploads and rendering through the file renderer component.

### 8.1 Supported File Operations

| Operation | Description |
|-----------|-------------|
| Upload | Attach files to messages |
| Preview | Display file thumbnails |
| Download | Retrieve uploaded files |
| Metadata | Extract and display file information |

### 8.2 File Card Component

File cards display uploaded files with:

- Icon representation based on file type
- Truncated filename with full name tooltip
- Download button for file retrieval
- Click handler for file preview

Source: [frontend/src/components/common/filerenderer.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/common/filerenderer.tsx)

---

## 9. Browser Automation

The web surfer module provides browser automation. Its `fara` variant (`src/magentic_ui/agents/web_surfer/fara`) drives the browser with the Fara-7B model, which issues the actions listed below.

### 9.1 Supported Browser Actions

| Action | Parameters | Description |
|--------|------------|-------------|
| `key` | `keys[]` | Perform key press sequences |
| `type` | `text`, `press_enter`, `delete_existing_text` | Type text input |
| `mouse_move` | `coordinate[x,y]` | Move cursor to position |
| `left_click` | `coordinate[x,y]` | Click at position |
| `scroll` | `pixels` | Scroll wheel (positive=up, negative=down) |
| `visit_url` | `url` | Navigate to URL |
| `web_search` | `query` | Execute web search |
| `history_back` | - | Navigate browser back |
| `wait` | `time` (seconds) | Wait for page changes |
| `terminate` | `status` | End task with status |

Source: [src/magentic_ui/agents/web_surfer/fara/_prompts.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/agents/web_surfer/fara/_prompts.py)
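
A model-issued action can be checked against the table before it is executed. The sketch below is a hedged illustration: the flat call shape (`{"action": ..., <parameters>}`), the function `validate_action`, and the choice to treat `press_enter` and `delete_existing_text` as optional are assumptions, not the agent's actual parsing code.

```python
from typing import Any

# Required parameters per action, following the table above. Parameters
# assumed to be optional (press_enter, delete_existing_text) are not enforced.
REQUIRED_PARAMS: dict[str, tuple[str, ...]] = {
    "key": ("keys",),
    "type": ("text",),
    "mouse_move": ("coordinate",),
    "left_click": ("coordinate",),
    "scroll": ("pixels",),
    "visit_url": ("url",),
    "web_search": ("query",),
    "history_back": (),
    "wait": ("time",),
    "terminate": ("status",),
}


def validate_action(call: dict[str, Any]) -> list[str]:
    """Return a list of problems with an action call; empty means it looks valid."""
    action = call.get("action")
    if action not in REQUIRED_PARAMS:
        return [f"unknown action: {action!r}"]
    problems = [
        f"missing parameter: {name}"
        for name in REQUIRED_PARAMS[action]
        if name not in call
    ]
    coordinate = call.get("coordinate")
    if coordinate is not None and not (
        isinstance(coordinate, (list, tuple)) and len(coordinate) == 2
    ):
        problems.append("coordinate must be an [x, y] pair")
    return problems
```

For example, `validate_action({"action": "left_click", "coordinate": [120, 340]})` returns an empty list, while a `scroll` call without `pixels` is reported as missing that parameter.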

### 9.2 Display Configuration

| Parameter | Description |
|-----------|-------------|
| `display_width_px` | Browser viewport width |
| `display_height_px` | Browser viewport height |
| `include_input_text_key_args` | Enable text input key arguments |

---

## 10. User Interface Layout

The application layout follows a structured component hierarchy.

### 10.1 Layout Structure

```mermaid
graph TD
    A[MagenticUILayout] --> B[ConfigProvider]
    A --> C[SessionManager]
    A --> D[Warning Banner]
    B --> C
    C --> E[Chat/RunView]
    C --> F[DetailViewer]
    E --> G[ChatInput]
    F --> H[Browser Modal]
    F --> I[Fullscreen Overlay]
```

### 10.2 Theme Support

The layout supports both light and dark themes through Ant Design's theme configuration system.

### 10.3 Warning Banner

A persistent disclaimer informs users about the system's limitations:

> Magentic-UI can make mistakes. Please monitor its work and intervene if necessary.

Source: [frontend/src/components/layout.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/layout.tsx)

---

<a id='high-level-architecture'></a>

## High-Level Architecture

### Related Pages

Related topics: [Agent System](#agent-system), [Team Orchestration](#team-orchestration), [Backend API](#backend-api)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [frontend/src/components/views/chat/chat.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/chat.tsx)
- [frontend/src/components/views/chat/rendermessage.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/rendermessage.tsx)
- [frontend/src/components/views/chat/runview.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/runview.tsx)
- [frontend/src/components/views/chat/chatinput.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/chatinput.tsx)
- [frontend/src/components/views/chat/detail_viewer.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/detail_viewer.tsx)
- [frontend/src/components/features/Plans/PlanCard.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/Plans/PlanCard.tsx)
- [frontend/src/components/features/Plans/PlanList.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/Plans/PlanList.tsx)
- [frontend/src/components/features/McpServersConfig/McpConfigModal.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/McpServersConfig/McpConfigModal.tsx)
- [src/magentic_ui/agents/web_surfer/fara/_prompts.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/agents/web_surfer/fara/_prompts.py)
</details>


## Overview

Magentic-UI is a web-based interface that enables users to create, manage, and execute AI-driven task automation workflows. The system combines a React-based frontend with a Python backend to provide an interactive chat interface where users can execute multi-step plans, browse the web autonomously, and manage reusable workflow templates.

The architecture follows a **client-server model** where:
- The **frontend** handles UI rendering, user interaction, and real-time display of task execution
- The **backend** processes AI requests, manages agent execution, and coordinates with external tools
- Communication occurs via RESTful API endpoints

## System Components

### Frontend Architecture

The frontend is a React application using TypeScript, organized into a modular component structure. It communicates with the backend API at `http://localhost:8081/api` as specified in the environment configuration.

Source: [frontend/README.md:1-7](https://github.com/microsoft/magentic-ui/blob/main/frontend/README.md)

```mermaid
graph TD
    A[User Browser] --> B[React Frontend]
    B --> C[Components]
    C --> D[Views/Chat]
    C --> E[Features]
    E --> F[Plans]
    E --> G[McpServersConfig]
    C --> H[Layout]
    B --> I[Backend API<br/>localhost:8081/api]
    I --> J[Python Backend]
    J --> K[AI Agents]
    J --> L[Team Manager]
```

### Core Frontend Components

| Component | File Path | Purpose |
|-----------|-----------|---------|
| `Chat` | `frontend/src/components/views/chat/chat.tsx` | Main chat interface container |
| `RunView` | `frontend/src/components/views/chat/runview.tsx` | Manages run status and detail viewer |
| `RenderMessage` | `frontend/src/components/views/chat/rendermessage.tsx` | Renders different message types |
| `ChatInput` | `frontend/src/components/views/chat/chatinput.tsx` | User input with file/plan attachment |
| `DetailViewer` | `frontend/src/components/views/chat/detail_viewer.tsx` | Browser view, screenshots, live tabs |
| `ProgressBar` | `frontend/src/components/views/chat/progressbar.tsx` | Task progress visualization |

Source: [frontend/src/components/views/chat/chat.tsx:1-50](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/chat.tsx)

## Frontend View Architecture

### Chat System

The chat system is the primary user interaction point. It manages:
- Display of conversation messages
- Real-time run status updates
- Progress tracking for multi-step tasks
- Detail viewer integration for visual feedback

```mermaid
graph LR
    A[ChatInput] -->|User Input| B[Chat Container]
    B --> C[RenderMessage]
    C -->|Multi-modal| D[PlanView]
    C -->|File| E[FileRenderer]
    B -->|Active Run| F[RunView]
    F --> G[DetailViewer]
    G --> H[Screenshots]
    G --> I[Live View]
    G --> J[BrowserModal]
```

The chat component handles multiple run states:
- `active` - Task is currently executing
- `awaiting_input` - Waiting for user response
- `paused` - Task temporarily paused
- `pausing` - Pause operation in progress

Source: [frontend/src/components/views/chat/chat.tsx:30-40](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/chat.tsx)

### Plan Management System

Plans are reusable workflow templates that define multi-step task sequences. The plan system consists of:

| Component | Function |
|-----------|----------|
| `PlanList` | Displays all saved plans with search/filter |
| `PlanCard` | Individual plan summary with quick actions |
| `PlanView` | Detailed plan editing and viewing |
| `LearnPlanButton` | Creates new plans from conversation history |

```mermaid
graph TD
    A[User Conversation] --> B[LearnPlanButton]
    B --> C[Plan Created]
    C --> D[PlanList]
    D --> E[PlanCard]
    E --> F[PlanView<br/>Edit/View]
    F --> G[Save Plan]
    G --> D
    A --> H[Attach Plan to Chat]
    H --> I[Execute Plan]
```

Plans can be attached to chat queries using the dropdown menu in `ChatInput`, allowing users to:
1. Create new empty plans
2. Import plans from JSON files
3. Search through existing plans
4. Execute attached plans with the current query

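Sources: [frontend/src/components/features/Plans/PlanCard.tsx:1-80](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/Plans/PlanCard.tsx), [frontend/src/components/features/Plans/PlanList.tsx:1-50](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/Plans/PlanList.tsx)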

### Detail Viewer System

The detail viewer provides visual feedback during task execution through multiple tabs:

| Tab | Purpose |
|-----|---------|
| Screenshots | Static screenshots captured during execution |
| Live | Real-time browser view via noVNC |
| Browser Modal | Full browser view in modal overlay |

The system supports control handover, allowing users to take control during autonomous browsing:

```mermaid
graph TD
    A[Agent Execution] --> B[DetailViewer]
    B --> C[ScreenshotsTab]
    B --> D[LiveTab]
    D --> E[noVNC Connection]
    E --> F[User Control<br/>Handover]
    F --> G[FullscreenOverlay]
    G --> H[User Input Response]
    H --> A
```

Sources: [frontend/src/components/views/chat/detail_viewer.tsx:1-100](), [frontend/src/components/views/chat/runview.tsx:1-50]()

## MCP Server Configuration

Magentic-UI supports the Model Context Protocol (MCP) for extending functionality through external servers. The configuration modal supports three connection types:

| Connection Type | Description |
|-----------------|-------------|
| SSE | Server-Sent Events for streaming responses |
| Stdio | Standard input/output process communication |
| JSON Config | Direct JSON configuration import |

```typescript
interface MCPConfig {
  serverName: string;      // Unique identifier
  connectionType: 'sse' | 'stdio' | 'json';
  // Additional config based on type...
}
```

Source: [frontend/src/components/features/McpServersConfig/McpConfigModal.tsx:1-100]()

## Progress Tracking System

The progress bar component provides real-time feedback on task execution:

```mermaid
graph TD
    A[Progress Update] --> B{Has Final Answer?}
    B -->|Yes| C[100% Complete<br/>Green Bar]
    B -->|No| D[Calculate Percentage]
    D --> E[Current Step Highlight<br/>Magenta]
    E --> F[Remaining Steps<br/>Gray]
    C --> G[Status: Task Completed]
    F --> H[Status: Step X of Y]
```

The progress system tracks:
- `currentStep` - Current execution step index
- `totalSteps` - Total number of steps in the plan
- `plan.steps` - Array of step definitions with titles

Source: [frontend/src/components/views/chat/progressbar.tsx:1-80]()
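
The decision flow in the diagram can be read as a small pure function. The sketch below uses Python only to illustrate the logic; the component itself is TypeScript, and the function name `progress_summary`, the zero-based step index, and the percentage formula are assumptions rather than the actual `progressbar.tsx` code.

```python
def progress_summary(
    current_step: int, total_steps: int, has_final_answer: bool
) -> tuple[int, str]:
    """Return (percent_complete, status_text) for a plan run.

    Assumption: current_step is the zero-based index of the step being executed.
    """
    if has_final_answer:
        # A final answer always means the bar is full.
        return 100, "Task Completed"
    if total_steps <= 0:
        return 0, "No plan steps"
    # Clamp the index so out-of-range values cannot yield <0% or >100%.
    step = min(max(current_step, 0), total_steps - 1)
    percent = round(100 * step / total_steps)
    return percent, f"Step {step + 1} of {total_steps}"
```

Keeping the calculation free of UI state makes the "final answer wins" rule easy to check in isolation.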

## Message Rendering System

Messages are rendered based on their content type:

```mermaid
graph TD
    A[Raw Message] --> B{Parse Content}
    B --> C{Multi-Modal?}
    C -->|Yes| D[Map Each Item]
    C -->|No| E[Render as Text]
    D --> F{Is String?}
    F -->|Yes| G[Parse & Display]
    F -->|No| H[Skip/Empty]
    G --> I{Has Plan?}
    I -->|Yes| J[PlanView Component]
    J --> K[Final Output]
    H --> K
    E --> K
```

Supported content types:
- Plain text with markdown rendering
- Multi-step plans
- File attachments with download capability
- Code blocks

Source: [frontend/src/components/views/chat/rendermessage.tsx:1-100]()
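
The branching in the flowchart can be sketched as follows. This is an illustration in Python, not the `rendermessage.tsx` implementation: the names `classify_item` and `render_content` are hypothetical, and the rule that a plan is a JSON object with a `steps` list is an assumption.

```python
import json
from typing import Any, Union


def classify_item(item: str) -> str:
    """Label a string item as 'plan' or 'text'.

    Assumption: a plan is a JSON object that carries a "steps" list.
    """
    try:
        parsed = json.loads(item)
    except json.JSONDecodeError:
        return "text"
    if isinstance(parsed, dict) and isinstance(parsed.get("steps"), list):
        return "plan"
    return "text"


def render_content(content: Union[str, list[Any]]) -> list[tuple[str, str]]:
    """Return (renderer, payload) pairs in display order."""
    if isinstance(content, str):
        # Not multi-modal: render the whole message as text.
        return [("text", content)]
    rendered = []
    for item in content:
        if not isinstance(item, str):
            continue  # non-string items are skipped, as in the flowchart
        rendered.append((classify_item(item), item))
    return rendered
```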

## Web Surfer Agent

The web surfer agent enables autonomous web browsing. It uses a structured action system:

| Action | Purpose | Required Parameters |
|--------|---------|---------------------|
| `visit_url` | Navigate to URL | `url` |
| `web_search` | Search the web | `query` |
| `scroll` | Scroll page | `pixels` |
| `click` | Click element | `coordinate` |
| `type` | Input text | `text`, `coordinate` |
| `pause_and_memorize_fact` | Store information | `fact` |
| `wait` | Wait for page | `time` |
| `terminate` | End task | `status` |

```python
parameters = {
    "action": {
        "type": "string",
        "enum": ["visit_url", "web_search", "scroll", "click", ...]
    },
    "coordinate": {
        "description": "(x, y): The x and y coordinates for mouse actions",
        "type": "array"
    }
}
```

Source: [src/magentic_ui/agents/web_surfer/fara/_prompts.py:1-80]()
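
A caller could check an action payload against the table above before dispatching it. The following is a minimal sketch under that assumption; `REQUIRED_PARAMS` and `missing_params` are hypothetical names, and the mapping simply mirrors the table rather than the actual schema in `_prompts.py`.

```python
# Required parameters per action, copied from the table above.
REQUIRED_PARAMS = {
    "visit_url": ["url"],
    "web_search": ["query"],
    "scroll": ["pixels"],
    "click": ["coordinate"],
    "type": ["text", "coordinate"],
    "pause_and_memorize_fact": ["fact"],
    "wait": ["time"],
    "terminate": ["status"],
}


def missing_params(call: dict) -> list[str]:
    """Return the required parameters that are absent from an action call."""
    action = call.get("action")
    if action not in REQUIRED_PARAMS:
        raise ValueError(f"unknown action: {action!r}")
    return [name for name in REQUIRED_PARAMS[action] if name not in call]
```

For example, `missing_params({"action": "type", "text": "hello"})` reports that `coordinate` is missing.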

## Data Flow

### User Query Flow

```mermaid
sequenceDiagram
    participant User
    participant ChatInput
    participant Backend
    participant Agent
    participant DetailViewer
    
    User->>ChatInput: Submit query
    ChatInput->>Backend: POST /api/run
    Backend->>Agent: Execute task
    Agent->>DetailViewer: Stream screenshots
    Agent->>Backend: Progress updates
    Backend->>ChatInput: Status updates
    Agent->>Backend: Final response
    Backend->>User: Display result
```

### Plan Attachment Flow

```mermaid
graph TD
    A[User clicks attach] --> B[Dropdown shows plans]
    B --> C[Select plan]
    C --> D[PlanView Modal opens]
    D --> E[Confirm attachment]
    E --> F[Plan attached to input]
    F --> G[Submit with plan context]
```

## State Management

The frontend manages several key state objects:

```typescript
interface ChatState {
  currentRun: Run | null;
  runStatus: 'idle' | 'active' | 'paused' | 'awaiting_input';
  progress: {
    currentStep: number;
    totalSteps: number;
    plan?: Plan;
  };
  hasFinalAnswer: boolean;
}

interface PlanState {
  task: string;
  steps: Step[];
  created_at?: string;
}
```

## Security Considerations

The layout includes a disclaimer for user awareness:

> Magentic-UI can make mistakes. Please monitor its work and intervene if necessary.

Control handover features allow users to take control from autonomous agents during execution, ensuring human oversight of automated tasks.

Source: [frontend/src/components/layout.tsx:1-50]()

## Configuration

### Environment Variables

| Variable | Default | Purpose |
|----------|---------|---------|
| `GATSBY_API_URL` | `http://localhost:8081/api` | Backend API endpoint used for all frontend-to-backend requests |

### Theme Support

The application supports light and dark themes using Ant Design's theme algorithm system, dynamically switching between `darkAlgorithm` and `defaultAlgorithm` based on user preference.

Sources: [frontend/README.md:1-10](), [frontend/src/components/layout.tsx:1-30]()

---

<a id='agent-system'></a>

## Agent System

### Related Pages

Related topics: [High-Level Architecture](#high-level-architecture), [Team Orchestration](#team-orchestration), [Browser Automation](#browser-automation)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [src/magentic_ui/agents/web_surfer/fara/_prompts.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/agents/web_surfer/fara/_prompts.py)
- [frontend/src/components/features/Plans/PlanCard.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/Plans/PlanCard.tsx)
- [frontend/src/components/views/chat/chat.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/chat.tsx)
- [frontend/src/components/views/chat/runview.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/runview.tsx)
- [frontend/src/components/views/chat/chatinput.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/chatinput.tsx)
- [frontend/src/components/common/markdownrender.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/common/markdownrender.tsx)
</details>

# Agent System

## Overview

The Agent System in Magentic-UI is a multi-agent orchestration framework that enables autonomous task execution through specialized agents. The system coordinates various agent types including web surfers, file surfers, coders, and user proxies to accomplish complex tasks requested by users.

## Architecture Overview

```mermaid
graph TD
    User[User Request] --> Orchestrator[Orchestrator Agent]
    Orchestrator --> WebSurfer[Web Surfer Agent]
    Orchestrator --> FileSurfer[File Surfer Agent]
    Orchestrator --> Coder[Coder Agent]
    Orchestrator --> UserProxy[User Proxy Agent]
    
    WebSurfer --> Browser[Browser Control]
    FileSurfer --> FileSystem[File System]
    Coder --> CodeExecution[Code Executor]
    UserProxy --> UserApproval[User Approval]
    
    Browser --> StateUpdate[State Update]
    FileSystem --> StateUpdate
    CodeExecution --> StateUpdate
    UserApproval --> StateUpdate
    
    StateUpdate --> Orchestrator
```

## Agent Types

### Web Surfer Agent

The Web Surfer Agent enables autonomous web browsing by controlling a browser instance. It handles various browsing actions including navigation, interaction, and information retrieval.

#### Supported Actions

| Action | Description | Required Parameters |
|--------|-------------|---------------------|
| `visit_url` | Navigate to a specific URL | `url` |
| `web_search` | Perform a web search | `query` |
| `left_click` | Click at coordinates | `coordinate` |
| `right_click` | Right-click at coordinates | `coordinate` |
| `mouse_move` | Move mouse to coordinates | `coordinate` |
| `type` | Type text into an element | `text`, `coordinate` |
| `scroll` | Scroll the page | `pixels` |
| `key` | Press keyboard keys | `keys` |
| `pause_and_memorize_fact` | Store information for later use | `fact` |
| `wait` | Wait for specified duration | `time` |
| `history_back` | Navigate back in browser history | - |
| `terminate` | End the browsing session | `status` |

#### Configuration Parameters

The Web Surfer Agent supports the following configuration options:

```python
{
    "display_width_px": int,      # Browser viewport width
    "display_height_px": int,    # Browser viewport height
    "include_input_text_key_args": bool  # Include type-specific arguments
}
```

Source: [src/magentic_ui/agents/web_surfer/fara/_prompts.py:1-80]()

### File Surfer Agent

The File Surfer Agent provides file system navigation and file content interaction capabilities. It allows agents to read, write, and manage files within the project workspace.

### Coder Agent

The Coder Agent handles code generation, analysis, and execution tasks. It works in conjunction with the orchestrator to implement requested functionality.

### User Proxy Agent

The User Proxy Agent acts as an intermediary between the autonomous agent system and human users. It handles:

- Requesting user confirmation for sensitive operations
- Presenting information that requires human judgment
- Managing user input during interactive sessions

## Agent Communication Flow

```mermaid
sequenceDiagram
    participant User
    participant Frontend
    participant Orchestrator
    participant Agent
    
    User->>Frontend: Submit Task
    Frontend->>Orchestrator: Send Request
    Orchestrator->>Agent: Delegate Subtask
    Agent->>Agent: Execute Action
    Agent-->>Orchestrator: Return Result
    
    alt Requires Approval
        Orchestrator->>User: Request Approval
        User-->>Orchestrator: Approval/Denial
    end
    
    Orchestrator-->>Frontend: Final Response
    Frontend-->>User: Display Result
```

## Task Execution Workflow

When a user submits a task through the chat interface, the system follows this execution model:

1. **Task Submission**: User enters a query via `ChatInput` component
2. **Agent Selection**: The orchestrator determines which agent(s) to invoke
3. **Execution**: Selected agents perform their designated actions
4. **Progress Tracking**: The system displays execution progress via `ProgressBar`
5. **State Updates**: Real-time updates are rendered via `RenderMessage`
6. **Completion**: Final results are presented with option to save plans

Source: [frontend/src/components/views/chat/chat.tsx:50-120]()

## Plan System Integration

The Agent System integrates with a Plan System that breaks down complex tasks into executable steps:

```mermaid
graph LR
    Task[User Task] --> Plan[Generated Plan]
    Plan --> Step1[Step 1]
    Plan --> Step2[Step 2]
    Plan --> Step3[Step 3]
    
    Step1 --> Execute1[Execute]
    Step2 --> Execute2[Execute]
    Step3 --> Execute3[Execute]
    
    Execute1 --> Result1[Result]
    Execute2 --> Result2[Result]
    Execute3 --> Result3[Result]
    
    Result1 --> Aggregate[Aggregate Results]
    Result2 --> Aggregate
    Result3 --> Aggregate
```

### Plan Components

| Component | File | Purpose |
|-----------|------|---------|
| `PlanCard` | `PlanCard.tsx` | Displays individual plan summary |
| `PlanList` | `PlanList.tsx` | Lists all available plans |
| `PlanView` | `PlanView.tsx` | Interactive plan editor/viewer |

Source: [frontend/src/components/features/Plans/PlanCard.tsx:1-100]()

## Message Rendering System

The Agent System communicates results through a structured message rendering system:

```mermaid
graph TD
    Message[Agent Message] --> Parse[Parse Content]
    Parse --> Type{Message Type}
    
    Type -->|Text| TextRender[Text Renderer]
    Type -->|Plan| PlanRender[Plan View]
    Type -->|File| FileRender[File Renderer]
    Type -->|Image| ImageRender[Image Renderer]
    
    TextRender --> Display[UI Display]
    PlanRender --> Display
    FileRender --> Display
    ImageRender --> Display
```

The `RenderMessage` component handles the display of agent outputs, supporting:

- Multi-modal content rendering
- Plan visualization
- File previews and downloads
- Image galleries

Source: [frontend/src/components/views/chat/rendermessage.tsx:1-100]()

## Browser Control Details

The Web Surfer Agent uses coordinate-based browser control:

### Coordinate System

- **X-axis**: Pixels from the left edge of the viewport
- **Y-axis**: Pixels from the top edge of the viewport
- **Scroll**: Positive values scroll up, negative values scroll down

### Type Action Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `text` | string | Text to type |
| `coordinate` | [x, y] | Target element position |
| `press_enter` | boolean | Submit after typing |
| `delete_existing_text` | boolean | Clear before typing |
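
As an illustration, a `type` action payload could be assembled from these parameters as shown below. The helper name `build_type_action` and the `False` defaults for the two boolean flags are assumptions; the source table does not state default values.

```python
def build_type_action(
    text: str,
    x: int,
    y: int,
    press_enter: bool = False,           # assumed default
    delete_existing_text: bool = False,  # assumed default
) -> dict:
    """Build a `type` action using the parameters from the table above."""
    if x < 0 or y < 0:
        # Coordinates are measured from the top-left corner of the viewport.
        raise ValueError("coordinates must be non-negative")
    return {
        "action": "type",
        "text": text,
        "coordinate": [x, y],
        "press_enter": press_enter,
        "delete_existing_text": delete_existing_text,
    }
```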

## Run Status States

The agent execution maintains the following status states:

| Status | Description |
|--------|-------------|
| `active` | Agent is currently executing |
| `paused` | Execution paused, awaiting resume |
| `pausing` | Pause is in progress |
| `awaiting_input` | Waiting for user input or approval |
| `completed` | Task finished successfully |
| `failed` | Task execution failed |

Source: [frontend/src/components/views/chat/chat.tsx:30-60]()
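
The status values in the table map naturally onto an enumeration. The sketch below is illustrative only: the `RunStatus` class and the grouping of `completed` and `failed` as terminal states are assumptions, not code from the repository.

```python
from enum import Enum


class RunStatus(str, Enum):
    ACTIVE = "active"
    PAUSED = "paused"
    PAUSING = "pausing"
    AWAITING_INPUT = "awaiting_input"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumption: a run in one of these states will not change state again.
TERMINAL_STATES = frozenset({RunStatus.COMPLETED, RunStatus.FAILED})


def is_terminal(status: str) -> bool:
    """True if the run has finished, successfully or not."""
    return RunStatus(status) in TERMINAL_STATES


def needs_user(status: str) -> bool:
    """True if the run is blocked on user input or approval."""
    return RunStatus(status) is RunStatus.AWAITING_INPUT
```

Constructing `RunStatus` from an unrecognized string raises `ValueError`, which surfaces unexpected status values early.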

## Configuration Management

The Agent System configuration is managed through the frontend store:

```typescript
interface IAgentFlowSettings {
  direction: "TB" | "LR";  // Flow chart orientation
  showLabels: boolean;      // Display edge labels
  showGrid: boolean;        // Show background grid
  showTokens: boolean;     // Display token counts
  showMessages: boolean;   // Show message nodes
  showMiniMap: boolean;    // Show navigation minimap
}
```

Source: [frontend/src/hooks/store.tsx:1-80]()

## MCP Server Integration

The system supports Model Context Protocol (MCP) servers for extended agent capabilities:

- SSE-based server connections
- Stdio-based server connections
- JSON configuration import

MCP servers are configured via the `McpConfigModal` component and integrated into the agent selection process during task execution.

Source: [frontend/src/components/features/McpServersConfig/McpConfigModal.tsx:1-100]()

## Detail Viewer

The `DetailViewer` component provides real-time visualization of agent activities:

- **Screenshots Tab**: Periodic screenshots of browser state
- **Live Tab**: Real-time browser view via noVNC
- **Control Mode**: Fullscreen overlay for manual intervention

Sources: [frontend/src/components/views/chat/detail_viewer.tsx:1-100](), [frontend/src/components/views/chat/runview.tsx:1-80]()

## Summary

The Agent System in Magentic-UI provides a comprehensive framework for autonomous task execution through:

- **Specialized Agents**: Web Surfer, File Surfer, Coder, User Proxy
- **Orchestration Layer**: Coordinates multi-agent collaboration
- **Plan System**: Breaks tasks into executable steps
- **Real-time Visualization**: Browser screenshots and live view
- **Human-in-the-Loop**: User proxy for approval and input
- **MCP Integration**: Extensible server architecture

This architecture enables complex, multi-step task automation while maintaining user oversight and control throughout the execution process.

---

<a id='team-orchestration'></a>

## Team Orchestration

### Related Pages

Related topics: [Agent System](#agent-system), [High-Level Architecture](#high-level-architecture)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [src/magentic_ui/_cli.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/_cli.py)
- [src/magentic_ui/agents/_coder.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/agents/_coder.py)
- [src/magentic_ui/agents/web_surfer/fara/_prompts.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/agents/web_surfer/fara/_prompts.py)
- [src/magentic_ui/agents/web_surfer/fara/qwen_helpers/fncall_prompt.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/agents/web_surfer/fara/qwen_helpers/fncall_prompt.py)
- [frontend/src/components/features/McpServersConfig/McpConfigModal.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/McpServersConfig/McpConfigModal.tsx)
</details>

# Team Orchestration

## Overview

Team Orchestration is a core system in Magentic-UI that coordinates multiple AI agents to work together on complex tasks. It provides the infrastructure for orchestrating agent teams, managing communication between agents, and handling task distribution and execution flow.

The orchestration system enables Magentic-UI to:

- Coordinate multiple specialized agents (coders, web surfers, planners)
- Manage agent collaboration through structured message passing
- Handle approval policies for sensitive actions
- Support dynamic task execution with planning and reflection capabilities

## Architecture

### Core Components

| Component | File Path | Purpose |
|-----------|-----------|---------|
| Orchestrator | `src/magentic_ui/teams/orchestrator/_orchestrator.py` | Central coordinator for agent teams |
| Group Chat | `src/magentic_ui/teams/orchestrator/_group_chat.py` | Manages multi-agent message passing |
| Prompts | `src/magentic_ui/teams/orchestrator/_prompts.py` | Prompt templates for orchestration |
| Sentinel Prompts | `src/magentic_ui/teams/orchestrator/_sentinel_prompts.py` | Safety and monitoring prompts |

### Agent Types

Magentic-UI supports several specialized agent types that can be orchestrated:

| Agent Type | Purpose | Key Actions |
|------------|---------|-------------|
| Coder Agent | Execute Python/code tasks | Write, debug, execute code |
| Web Surfer Agent | Browse and interact with web content | scroll, visit_url, web_search, wait |
| Planner Agent | Create and manage execution plans | Task decomposition, step planning |
| MCP Server Agents | External tool integrations | Configurable via SSE/stdio protocols |

## Web Surfer Agent Actions

The web surfer agent supports a comprehensive set of actions for web interaction:

```python
parameters = {
    "properties": {
        "pixels": {
            "description": "The amount of scrolling to perform. Positive values scroll up, negative values scroll down. Required only by `action=scroll`.",
            "type": "number",
        },
        "url": {
            "description": "The URL to visit. Required only by `action=visit_url`.",
            "type": "string",
        },
        "query": {
            "description": "The query to search for. Required only by `action=web_search`.",
            "type": "string",
        },
        "fact": {
            "description": "The fact to remember for the future. Required only by `action=pause_and_memorize_fact`.",
            "type": "string",
        },
        "time": {
            "description": "The seconds to wait. Required only by `action=wait`.",
            "type": "number",
        },
        "status": {
            "description": "The status of the task. Required only by `action=terminate`.",
            "type": "string",
            "enum": ["success", "failure"],
        },
    },
    "required": ["action"],
    "type": "object",
}
```

Source: [src/magentic_ui/agents/web_surfer/fara/_prompts.py:1-50](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/agents/web_surfer/fara/_prompts.py)

### Action Types

| Action | Parameters | Description |
|--------|------------|-------------|
| `scroll` | `pixels` | Scrolls the viewport |
| `visit_url` | `url` | Navigate to a URL |
| `web_search` | `query` | Search the web |
| `pause_and_memorize_fact` | `fact` | Store information for context |
| `wait` | `time` (seconds) | Wait before continuing |
| `terminate` | `status` (success/failure) | End the task |

## Function Call Handling

The system uses a specialized function call prompt system for agent communication:

```python
tool_descs = [{"type": "function", "function": f} for f in functions]
tool_names = [
    function.get("name_for_model", function.get("name", ""))
    for function in functions
]
tool_descs = "\n".join([json.dumps(f, ensure_ascii=False) for f in tool_descs])
```

Source: [src/magentic_ui/agents/web_surfer/fara/qwen_helpers/fncall_prompt.py:1-60](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/agents/web_surfer/fara/qwen_helpers/fncall_prompt.py)
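
The excerpt above depends on a `functions` list defined elsewhere. A self-contained version of the same transformation might look like the sketch below, where the wrapper `describe_tools` and the two sample function specs are made up for illustration.

```python
import json

# Hypothetical function specs; only the "name" / "name_for_model" keys matter here.
functions = [
    {
        "name": "computer_use",
        "description": "Control a browser.",
        "parameters": {
            "type": "object",
            "properties": {"action": {"type": "string"}},
            "required": ["action"],
        },
    },
    {"name": "search", "name_for_model": "web_search_tool", "description": "Search."},
]


def describe_tools(functions: list[dict]) -> tuple[list[str], str]:
    """Return the tool names and a newline-separated JSON description block."""
    tool_descs = [{"type": "function", "function": f} for f in functions]
    # Prefer the model-facing name when one is provided.
    tool_names = [f.get("name_for_model", f.get("name", "")) for f in functions]
    tool_text = "\n".join(json.dumps(d, ensure_ascii=False) for d in tool_descs)
    return tool_names, tool_text
```

Each tool description occupies exactly one line of `tool_text`, because `json.dumps` without `indent` never emits raw newlines.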

### Message Processing Flow

```mermaid
graph TD
    A[User Input] --> B[Orchestrator]
    B --> C{Agent Selection}
    C -->|Planning| D[Planner Agent]
    C -->|Execution| E[Coder Agent]
    C -->|Web Tasks| F[Web Surfer Agent]
    D --> G[Execution Plan]
    E --> H[Code Execution]
    F --> I[Web Actions]
    G --> B
    H --> B
    I --> B
    B --> J[User Response]
```

### Role-Based Message Handling

| Role | Processing | Content Format |
|------|------------|----------------|
| `ASSISTANT` | Appends tool calls to last message | `<tool_call>` XML blocks |
| `FUNCTION` | Processes tool responses | `<tool_response>` XML blocks |
| `USER` | Standard user messages | Plain text or structured |
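
To make the XML block formats concrete, here is a minimal sketch of wrapping and extracting tool calls. The helper names are hypothetical, and the JSON layout inside `<tool_call>` (a `name` plus `arguments` object) is an assumption modeled on Qwen-style function calling rather than taken from `fncall_prompt.py`.

```python
import json
import re


def format_tool_call(name: str, arguments: dict) -> str:
    """Wrap a tool invocation in a <tool_call> block."""
    payload = json.dumps({"name": name, "arguments": arguments}, ensure_ascii=False)
    return f"<tool_call>\n{payload}\n</tool_call>"


def format_tool_response(content: str) -> str:
    """Wrap a tool result in a <tool_response> block."""
    return f"<tool_response>\n{content}\n</tool_response>"


_TOOL_CALL = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def parse_tool_calls(text: str) -> list[dict]:
    """Extract every JSON payload found inside <tool_call> blocks."""
    return [json.loads(body) for body in _TOOL_CALL.findall(text)]
```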

## MCP Server Integration

Magentic-UI supports Model Context Protocol (MCP) servers for extended functionality:

```typescript
interface McpServerConfig {
  serverName: string;
  agentName: string;
  agentDescription: string;
  connectionType: 'sse' | 'stdio' | 'json';
}
```

Source: [frontend/src/components/features/McpServersConfig/McpConfigModal.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/McpServersConfig/McpConfigModal.tsx)

### MCP Configuration Modes

| Mode | Description | Use Case |
|------|-------------|----------|
| SSE | Server-Sent Events | Remote server connections |
| Stdio | Standard I/O | Local process communication |
| JSON Config | Raw JSON configuration | Advanced users |

### Agent Description Requirements

Each MCP server requires a description that helps the orchestrator decide when to invoke it:

> "Describe how and when this server should be used. This helps the orchestrator decide when to call this agent."

Source: [frontend/src/components/features/McpServersConfig/McpConfigModal.tsx:1-120](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/McpServersConfig/McpConfigModal.tsx)

## Code Execution Flow

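The coder agent executes generated code and captures its output. The signature below is taken from `_coder.py`; the function body is omitted in this excerpt:

```python
from typing import Sequence
from autogen_agentchat.messages import BaseAgentEvent, BaseChatMessage, TextMessage
from autogen_core import CancellationToken
from autogen_core.model_context import ChatCompletionContext
from autogen_core.models import ChatCompletionClient

async def _summarize_coding(
    agent_name: str,
    model_client: ChatCompletionClient,
    thread: Sequence[BaseChatMessage | BaseAgentEvent],
    cancellation_token: CancellationToken,
    model_context: ChatCompletionContext,
) -> TextMessage:
    ...  # body omitted in this excerpt
```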

Source: [src/magentic_ui/agents/_coder.py:1-100](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/agents/_coder.py)

### Code Execution States

| State | Description | Exit Code |
|-------|-------------|-----------|
| Success | Code executed without errors | `0` |
| Timeout | Execution exceeded time limit | N/A |
| Error | Runtime exception occurred | Non-zero |

```python
# Break if all code executions were successful
if all([code_output == 0 for code_output in exit_code_list]):
    break
```
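
The `break` above only makes sense inside a retry loop. A minimal sketch of such a loop follows; `run_until_success`, the `attempt` callback, and the `max_rounds` limit are hypothetical and stand in for whatever the coder agent actually does between rounds.

```python
from typing import Callable


def run_until_success(
    attempt: Callable[[int], list[int]], max_rounds: int = 3
) -> tuple[bool, int]:
    """Call attempt(round_index) until every returned exit code is 0.

    Returns (succeeded, rounds_used). Note that an empty exit-code list
    counts as success, because all() of an empty iterable is True.
    """
    for round_index in range(max_rounds):
        exit_code_list = attempt(round_index)
        # Stop as soon as all code executions in this round succeeded.
        if all(code == 0 for code in exit_code_list):
            return True, round_index + 1
    return False, max_rounds
```

Passing a generator expression to `all()` avoids building the intermediate list and stops at the first non-zero exit code.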

## CLI Configuration

The orchestration system is configured through the CLI entry point:

```python
def main() -> None:
    """
    Entry point for the magentic-cli command.
    Called from pyproject.toml's [project.scripts] section.
    """
    app()
```

Source: [src/magentic_ui/_cli.py:1-50](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/_cli.py)

### CLI Parameters

| Parameter | Type | Purpose |
|-----------|------|---------|
| `mcp_agents` | List | External MCP server agents |
| `run_without_docker` | bool | Run without container isolation |
| `browser_headless` | bool | Run browser in headless mode |
| `browser_local` | bool | Use local browser instead of remote |
| `sentinel_tasks` | List | Background monitoring tasks |
| `dynamic_sentinel_sleep` | int | Sleep interval for sentinel checks |

## Approval Policies

The orchestration system supports configurable approval policies for controlling agent actions:

| Policy | Behavior |
|--------|----------|
| `Auto Approve` | All actions execute automatically |
| `Manual Approval` | User must approve each action |
| `Policy Based` | Rules determine approval based on action type |
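
One way to picture the three policies is as a single decision function. Everything in the sketch below is hypothetical: the `ApprovalPolicy` enum, the `needs_user_approval` helper, and in particular the `SENSITIVE_ACTIONS` rule set, which is invented to show how a policy-based rule might look.

```python
from enum import Enum


class ApprovalPolicy(Enum):
    AUTO_APPROVE = "Auto Approve"
    MANUAL_APPROVAL = "Manual Approval"
    POLICY_BASED = "Policy Based"


# Hypothetical rule set: actions that change page state require approval.
SENSITIVE_ACTIONS = frozenset({"type", "left_click"})


def needs_user_approval(policy: ApprovalPolicy, action: str) -> bool:
    """Decide whether the user must approve an action before it runs."""
    if policy is ApprovalPolicy.AUTO_APPROVE:
        return False
    if policy is ApprovalPolicy.MANUAL_APPROVAL:
        return True
    # POLICY_BASED: consult the rule set for this action type.
    return action in SENSITIVE_ACTIONS
```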

## Error Handling

### Timeout Handling

```python
except asyncio.TimeoutError:
    executor_msg = TextMessage(
        source=agent_name + "-executor",
        metadata={"internal": "yes"},
        content="Code execution timed out.",
    )
    delta.append(executor_msg)
    yield executor_msg
```
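
The fragment above is the `except` branch of a larger `try` block. A runnable reduction of the same pattern, using `asyncio.wait_for` and plain strings instead of `TextMessage` objects, is sketched below; `run_with_timeout`, `slow_job`, and `fast_job` are hypothetical names.

```python
import asyncio
from typing import Awaitable


async def run_with_timeout(job: Awaitable[str], timeout_s: float) -> str:
    """Await a job, returning a fixed message if it exceeds the time limit."""
    try:
        return await asyncio.wait_for(job, timeout=timeout_s)
    except asyncio.TimeoutError:
        # wait_for cancels the job before raising, so nothing keeps running.
        return "Code execution timed out."


async def slow_job() -> str:
    await asyncio.sleep(1.0)
    return "done"


async def fast_job() -> str:
    return "done"
```

Calling `asyncio.run(run_with_timeout(slow_job(), 0.01))` exercises the timeout branch, while `fast_job()` with a generous limit returns its own result.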

## Summary

Team Orchestration in Magentic-UI provides a flexible framework for coordinating multiple AI agents. Key features include:

1. **Multi-Agent Coordination**: Specialized agents work together through the orchestrator
2. **Flexible Communication**: Role-based message passing with XML-formatted tool calls
3. **MCP Integration**: Extensible architecture through Model Context Protocol servers
4. **Safe Execution**: Code execution with timeout handling and error capture
5. **Approval Workflows**: Configurable policies for sensitive operations

The system is designed to be modular, allowing new agent types and capabilities to be added through well-defined interfaces.

---

<a id='backend-api'></a>

## Backend API

### Related Pages

Related topics: [High-Level Architecture](#high-level-architecture), [Frontend UI](#frontend-ui)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [src/magentic_ui/backend/web/app.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/backend/web/app.py)
- [frontend/README.md](https://github.com/microsoft/magentic-ui/blob/main/frontend/README.md)
- [src/magentic_ui/agents/web_surfer/fara/_prompts.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/agents/web_surfer/fara/_prompts.py)
</details>

# Backend API

## Overview

The Magentic-UI Backend API is a FastAPI-based REST/WebSocket service that orchestrates multi-agent workflows, manages conversation sessions, and provides real-time execution capabilities for the frontend interface. It serves as the central hub for all backend operations including session management, run execution, agent coordination, and MCP (Model Context Protocol) integration.

The API is accessible at `http://localhost:8081/api` for local development, and the frontend is expected to send all requests to this endpoint.

Source: [frontend/README.md](https://github.com/microsoft/magentic-ui/blob/main/frontend/README.md)

## Architecture

### High-Level Architecture

```mermaid
graph TB
    subgraph "Frontend Client"
        UI[UI Components]
    end
    
    subgraph "Backend API - FastAPI"
        App[Main Application]
        Routers[Route Handlers]
        Managers[Connection Managers]
    end
    
    subgraph "Data Layer"
        DB[(Database)]
        StaticFiles[Static Files]
    end
    
    UI --> |HTTP/WS| App
    App --> Routers
    Routers --> Managers
    Managers --> DB
    App --> StaticFiles
```

### API Router Structure

The backend organizes its functionality into modular routers, each handling a specific domain:

| Router | Prefix | Purpose |
|--------|--------|---------|
| Teams Router | `/teams` | Multi-agent team coordination |
| WebSocket Router | `/ws` | Real-time bidirectional communication |
| Validation Router | `/validate` | Input validation endpoints |
| Settings Router | `/settings` | User and system configuration |
| MCP Router | `/mcp` | Model Context Protocol integration |
| Sessions Router | `/sessions` | Conversation session management |
| Runs Router | `/runs` | Execution run tracking and control |

Source: [src/magentic_ui/backend/web/app.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/backend/web/app.py)
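
Because the routers are mounted under the `/api` prefix, a client can derive endpoint URLs from the table above. The sketch below assumes that layout; `endpoint_url` is a hypothetical helper, and the trailing path segment in the usage example is made up.

```python
API_PREFIX = "/api"

# Router prefixes copied from the table above.
ROUTER_PREFIXES = {
    "teams": "/teams",
    "ws": "/ws",
    "validate": "/validate",
    "settings": "/settings",
    "mcp": "/mcp",
    "sessions": "/sessions",
    "runs": "/runs",
}


def endpoint_url(base_url: str, router: str, path: str = "") -> str:
    """Join the server base URL, the /api prefix, and a router prefix."""
    prefix = ROUTER_PREFIXES[router]  # KeyError for unknown routers
    suffix = "/" + path.lstrip("/") if path else ""
    return base_url.rstrip("/") + API_PREFIX + prefix + suffix
```

For instance, `endpoint_url("http://localhost:8081", "runs", "42")` yields `http://localhost:8081/api/runs/42`.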

## Core Endpoints

### Health Check

```
GET /api/health
```

Returns the health status of the API service.

**Response:**
```json
{
  "status": true,
  "message": "Service is healthy"
}
```

### Version Information

```
GET /api/version
```

Retrieves the current API version.

**Response:**
```json
{
  "status": true,
  "message": "Version retrieved successfully",
  "data": {
    "version": "<VERSION_STRING>"
  }
}
```

Source: [src/magentic_ui/backend/web/app.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/backend/web/app.py)

## API Configuration

### Environment Variables

The frontend must be configured with the correct API URL. Create a `.env.development` file based on `.env.default`:

```bash
cp .env.default .env.development
```

The primary configuration variable is `GATSBY_API_URL`, which should be set to `http://localhost:8081/api` for local development.

Source: [frontend/README.md](https://github.com/microsoft/magentic-ui/blob/main/frontend/README.md)

### Static File Serving

The backend mounts two static file directories:

| Mount Path | Directory | Purpose |
|------------|-----------|---------|
| `/files` | `static_root` | File downloads with HTML fallback |
| `/` | `ui_root` | Frontend UI assets |

```python
app.mount(
    "/files",
    StaticFiles(directory=initializer.static_root, html=True),
    name="files",
)
app.mount("/", StaticFiles(directory=initializer.ui_root, html=True), name="ui")
```

Source: [src/magentic_ui/backend/web/app.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/backend/web/app.py)

## Error Handling

### Internal Server Error Handler

The API includes a global exception handler for 500 errors:

```python
@app.exception_handler(500)
async def internal_error_handler(request: Request, exc: Exception):
    logger.error(f"Internal error: {str(exc)}")
    return {
        "status": False,
        "message": "Internal server error",
        "detail": str(exc) if settings.debug else None
    }
```

This handler:
- Logs the full error details server-side
- Returns sanitized error messages to clients (hiding details in production)
- Uses `settings.debug` to control error visibility

Source: [src/magentic_ui/backend/web/app.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/backend/web/app.py)
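
The debug-gated payload can be isolated from the web framework and tested on its own. The sketch below is a framework-free reduction; `error_payload` and the logger name are hypothetical, and the `debug` argument stands in for `settings.debug`.

```python
import logging

logger = logging.getLogger("magentic_ui.sketch")  # hypothetical logger name


def error_payload(exc: Exception, debug: bool) -> dict:
    """Build the body of a 500 response, exposing details only in debug mode."""
    # Always log the full error server-side.
    logger.error("Internal error: %s", exc)
    return {
        "status": False,
        "message": "Internal server error",
        "detail": str(exc) if debug else None,
    }
```

FastAPI exception handlers are expected to return a `Response` object, so in practice a payload like this would typically be wrapped, for example as `JSONResponse(status_code=500, content=error_payload(exc, debug))`.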

## WebSocket Communication

The WebSocket router (`/ws`) enables real-time bidirectional communication between the frontend and backend. This is essential for:

- Live agent execution progress updates
- Streaming intermediate results
- Real-time user input responses during agent runs
- Session state synchronization

## Agent Actions and Tool Integration

The backend exposes agent capabilities through structured action parameters. Agents support the following action types:

| Action | Description | Required Parameters |
|--------|-------------|---------------------|
| `key` | Perform keyboard key presses | `keys` (array) |
| `type` | Type text into input fields | `text`, `press_enter`, `delete_existing_text` |
| `mouse_move` | Move cursor to coordinates | `coordinate` [x, y] |
| `left_click` | Click at coordinates | `coordinate` [x, y] |
| `scroll` | Scroll mouse wheel | `pixels` |
| `visit_url` | Navigate to URL | `url` |
| `web_search` | Execute web search | `query` |
| `history_back` | Go to previous page | - |
| `pause_and_memorize_fact` | Store information | `fact` |
| `wait` | Pause execution | `time` (seconds) |
| `terminate` | End task | `status` (success/failure) |

Source: [src/magentic_ui/agents/web_surfer/fara/_prompts.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/agents/web_surfer/fara/_prompts.py)

## Running the Backend

### Prerequisites

Ensure all prerequisites are installed before running the backend. The system requires:

- Python environment with dependencies installed
- Node.js and npm for frontend development (if building from source)
- nvm for Node version management

### Starting the Server

```bash
magentic-ui --port 8081
```

The server will:
1. Initialize the FastAPI application
2. Mount all routers under `/api` prefix
3. Establish database connections
4. Start listening on the specified port

### Development Mode

For frontend development with hot-reloading:

1. Start frontend in development mode:
```bash
cd frontend
npm run start
```

2. Run the backend:
```bash
magentic-ui --port 8081
```

The frontend development server runs at `http://localhost:8000`, while the compiled frontend is available at `http://localhost:8081`.

Source: [frontend/README.md](https://github.com/microsoft/magentic-ui/blob/main/frontend/README.md)

## API Response Format

All API responses follow a consistent format:

```json
{
  "status": true | false,
  "message": "Human-readable status message",
  "data": { ... } | null,
  "detail": "Error details (optional, debug mode only)"
}
```

This standardization allows the frontend to handle all responses uniformly regardless of which router handled the request.
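
One way to keep every router on this format is a single helper that builds the response. The sketch below is an assumption about how that could look, not code from the repository: `make_response` and its `debug` flag are hypothetical names.

```python
from typing import Any, Dict, Optional


def make_response(
    status: bool,
    message: str,
    data: Optional[Any] = None,
    detail: Optional[str] = None,
    debug: bool = False,
) -> Dict[str, Any]:
    """Build a response dict in the format described above.

    `detail` is only passed through when `debug` is true, mirroring the
    "debug mode only" note; otherwise it is set to None.
    """
    return {
        "status": status,
        "message": message,
        "data": data,
        "detail": detail if debug else None,
    }


ok = make_response(True, "Session created", data={"id": 1})
err = make_response(False, "Internal server error", detail="boom", debug=False)
print(ok)
print(err)
```

Because the helper always emits all four keys, client code can read `status` and `message` without checking which endpoint produced the response.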

## Related Documentation

For troubleshooting and setup issues, refer to the [TROUBLESHOOTING.md](TROUBLESHOOTING.md) file in the repository root.

---

<a id='frontend-ui'></a>

## Frontend UI

### Related Pages

Related topics: [Backend API](#backend-api), [High-Level Architecture](#high-level-architecture)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [frontend/src/components/views/chat/chat.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/chat.tsx)
- [frontend/src/components/views/manager.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/manager.tsx)
- [frontend/src/components/layout.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/layout.tsx)
- [frontend/src/components/features/McpServersConfig/McpServerCard.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/McpServersConfig/McpServerCard.tsx)
- [frontend/src/components/common/filerenderer.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/common/filerenderer.tsx)
- [frontend/src/components/views/chat/rendermessage.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/rendermessage.tsx)
- [frontend/src/components/views/chat/chatinput.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/chatinput.tsx)
- [frontend/src/components/features/Plans/PlanCard.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/Plans/PlanCard.tsx)
- [frontend/src/components/features/Plans/PlanList.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/Plans/PlanList.tsx)
- [frontend/src/components/views/chat/runview.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/runview.tsx)
- [frontend/src/components/common/markdownrender.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/common/markdownrender.tsx)
- [frontend/README.md](https://github.com/microsoft/magentic-ui/blob/main/frontend/README.md)
</details>

# Frontend UI

## Overview

The Magentic-UI frontend is a React-based web application through which users work with AI agents. It communicates with the backend API at `http://localhost:8081/api` and provides chat conversations, plan management, MCP server configuration, and real-time visualization of task execution.

The UI is built using:
- **React** with TypeScript for component architecture
- **Ant Design** as the primary UI component library
- **Tailwind CSS** for custom styling
- **React Markdown** for rendering markdown content

Source: [frontend/package.json](https://github.com/microsoft/magentic-ui/blob/main/frontend/package.json)

## Architecture Overview

The frontend application follows a component-based architecture with clear separation between layout, views, features, and common components.

```mermaid
graph TD
    A[App Entry] --> B[MagenticUILayout]
    B --> C[SessionManager]
    C --> D[Views]
    C --> E[SubMenus]
    D --> F[ChatView]
    D --> G[RunView]
    D --> H[PlanList]
    E --> I[SessionList]
    E --> J[PlanLibrary]
    F --> K[ChatInput]
    F --> L[RenderMessage]
    F --> M[ProgressBar]
    F --> N[DetailViewer]
    G --> K
    G --> L
    G --> M
    G --> N
```

Source: [frontend/src/components/layout.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/layout.tsx)

## Application Layout

### MagenticUILayout

The main layout wrapper component that provides theme configuration and session management context to all child components.

| Prop | Type | Description |
|------|------|-------------|
| restricted | boolean | Whether to restrict access to authenticated users only |
| children | ReactNode | Child components to render within the layout |

Key responsibilities:
- Applies theme algorithms (dark/light mode) via Ant Design's `ConfigProvider`
- Wraps content in `AppContext` for global state access
- Displays a disclaimer footer: "Magentic-UI can make mistakes. Please monitor its work and intervene if necessary."

Source: [frontend/src/components/layout.tsx:1-100](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/layout.tsx)

### SessionManager

The central orchestrator component that manages the overall application state including sessions, plans, and navigation between different views.

```mermaid
graph LR
    A[SessionManager] --> B[PlanList]
    A --> C[ChatView]
    A --> D[SessionEditor]
    B --> E[PlanCard]
    C --> F[ChatInput]
    C --> G[MessageList]
    C --> H[RunView]
```

State management includes:
- `activeSubMenuItem`: Current navigation state
- `sessions`: List of user sessions
- `currentRun`: Active task execution state
- `selectedMcpServers`: Selected MCP server configurations
- `editingSession`: Session being edited

Source: [frontend/src/components/views/manager.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/manager.tsx)

## Chat System

### ChatView

The primary chat interface where users interact with AI agents through messages and file uploads.

```mermaid
sequenceDiagram
    User->>ChatInput: Enter message
    ChatInput->>ChatView: handleSubmit(query, files, plan)
    ChatView->>Backend: runTask() via API
    Backend-->>ChatView: CurrentRun status
    ChatView->>RunView: Pass run data
    RunView->>MessageList: Display messages
    MessageList->>RenderMessage: Render each message
```

**Key Features:**
- Message submission with text, files, and attached plans
- Real-time run status display (running, paused, awaiting_input, completed)
- MCP server selection for task execution
- Plan execution control (approve, deny, pause, cancel)
- Sample tasks for quick start

Source: [frontend/src/components/views/chat/chat.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/chat.tsx)

### ChatInput

A rich text input component supporting multi-line input, file attachments, and plan attachments.

**Props:**

| Prop | Type | Description |
|------|------|-------------|
| onSubmit | Function | Callback when message is submitted |
| onCancel | Function | Callback to cancel current operation |
| runStatus | string | Current run status |
| inputRequest | object | Request for user input |
| isPlanMessage | boolean | Whether input is for plan response |
| onPause | Function | Callback to pause execution |
| onExecutePlan | Function | Callback to execute a plan |
| enable_upload | boolean | Enable file uploads |
| selectedMcpServers | array | Selected MCP servers |

**Features:**
- File drag-and-drop and paste support
- File list display with remove capability
- Plan attachment modal for viewing attached plans
- Auto-resizing textarea
- Submit button with loading state

Source: [frontend/src/components/views/chat/chatinput.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/chatinput.tsx)

### RenderMessage

Component responsible for rendering different types of messages including user messages, AI responses, plans, and file previews.

**Rendering Logic:**
1. Checks if message is from user or assistant
2. Parses content using `parseContent` utility
3. Handles multi-modal content (text arrays)
4. Renders plans via `PlanView` component
5. Applies appropriate styling based on message type

```typescript
// Message type detection based on metadata
if (message?.metadata?.type === "file" && message?.metadata?.files) {
  // File message handling
  const parsedFiles = JSON.parse(message.metadata.files);
}
```

Source: [frontend/src/components/views/chat/rendermessage.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/rendermessage.tsx)

### ProgressBar

Visual indicator for task execution progress with step-by-step status display.

**Display States:**
- **Task Completed**: Green progress bar at 100% with "Task Completed" text
- **In Progress**: Shows current step (e.g., "Step 2 of 5") with progress bar
- **Planning**: Shows "Planning..." text when plan is being generated

**Progress Calculation:**
```javascript
// Completed section width
width: hasFinalAnswer ? "100%" : (currentStep / totalSteps) * 100 + "%"

// Current step indicator position
left: (currentStep / totalSteps) * 100 + "%"
width: (1 / totalSteps) * 100 + "%"
```

Source: [frontend/src/components/views/chat/progressbar.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/progressbar.tsx)
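
To make the arithmetic concrete, the progress calculation is restated below in Python purely for illustration; the component itself is written in TypeScript. The function name `progress_layout`, the returned keys, and the guard for a plan with zero steps are assumptions added for this sketch.

```python
from typing import Dict


def progress_layout(current_step: int, total_steps: int, has_final_answer: bool) -> Dict[str, str]:
    """Return CSS-style percentage strings for the progress bar segments."""
    if total_steps <= 0:
        # Assumption: treat an empty plan as zero progress to avoid dividing by zero.
        return {
            "completed_width": "100%" if has_final_answer else "0%",
            "indicator_left": "0%",
            "indicator_width": "0%",
        }
    fraction = current_step / total_steps
    completed = 100.0 if has_final_answer else fraction * 100
    return {
        # Width of the completed section.
        "completed_width": f"{completed:g}%",
        # Left offset and width of the current-step indicator.
        "indicator_left": f"{fraction * 100:g}%",
        "indicator_width": f"{1 / total_steps * 100:g}%",
    }


print(progress_layout(2, 5, False))
```

With `current_step=2` and `total_steps=5`, the completed section spans 40% of the bar and the indicator is one fifth of the bar wide, starting at the 40% mark.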

### RunView

Container component that manages the detail viewer and message display during task execution.

**Layout:**
- Split view with message list on the left
- Detail viewer on the right (collapsible/expandable)
- Manages image gallery, VNC preview, and input responses

Source: [frontend/src/components/views/chat/runview.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/runview.tsx)

## Plan Management

### PlanList

Displays the user's saved plans library with search, create, import, and management capabilities.

**Features:**

| Feature | Description |
|---------|-------------|
| Create | Create a new empty plan |
| Import | Import plan from JSON file |
| Search | Filter plans by name |
| Export | Download plan as JSON |
| Delete | Remove plan from library |

**Plan Card Actions:**
- **Run Plan**: Create new session with plan loaded
- **Edit Plan**: Modify plan title and steps in modal

Source: [frontend/src/components/features/Plans/PlanList.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/Plans/PlanList.tsx)

### PlanCard

Individual plan display card showing plan metadata and quick actions.

**Displayed Information:**
- Plan title (truncated to 40 characters)
- Step count summary (showing first 3 steps)
- Creation timestamp with relative time formatting
- Hover actions for export and delete

**Modal Editing:**
- Editable plan title
- Full plan step editor via `PlanView` component
- Save and cancel functionality

Source: [frontend/src/components/features/Plans/PlanCard.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/Plans/PlanCard.tsx)

### LearnPlanButton

Button component that allows users to extract and save a reusable plan from the current conversation.

**States:**

| State | Appearance |
|-------|------------|
| Disabled | Opacity 50%, cursor not-allowed |
| Learning | Spinner with "Learning Plan..." text |
| Ready | Normal button with "Learn Plan" label |

**Behavior:**
- Disabled when no `sessionId` or `effectiveUserId`
- Triggers plan extraction from conversation history
- Saves extracted plan to user's plan library

Source: [frontend/src/components/features/Plans/LearnPlanButton.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/Plans/LearnPlanButton.tsx)

## File Handling

### RenderFile

Component for displaying and managing file attachments in messages.

**Features:**
- Detects file type from metadata
- Renders appropriate preview based on file type
- Provides download functionality
- Supports modal view for detailed file inspection

**File Type Detection:**
```typescript
if (message?.metadata?.type === "file" && message?.metadata?.files) {
  const parsedFiles = JSON.parse(message.metadata.files);
  // Process files to ensure correct type detection
}
```

### FileCard

Displays individual file with icon, name, and download button.

**Interactions:**
- Click to open file in modal
- Hover to show download button
- Drag-and-drop zone support

Source: [frontend/src/components/common/filerenderer.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/common/filerenderer.tsx)

## Markdown Rendering

### MarkdownRender

Component for rendering markdown content with syntax highlighting and GitHub Flavored Markdown support.

**Features:**
- Syntax highlighting via language detection from file extensions
- Configurable text truncation
- Indentation indicator support
- Dark/light mode compatible styling

**Configuration:**

| Option | Type | Description |
|--------|------|-------------|
| truncate | boolean | Enable content truncation |
| maxLength | number | Maximum character length |
| indented | boolean | Show indentation indicator |
| isFilePreview | boolean | Wrap in code block |

Source: [frontend/src/components/common/markdownrender.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/common/markdownrender.tsx)

## MCP Server Configuration

### McpServerCard

Card component for displaying MCP (Model Context Protocol) server configurations.

**Displayed Information:**
- Server name
- Agent description (truncated to 2 lines)
- Availability status

**Actions:**

| Action | Description |
|--------|-------------|
| Edit | Modify server configuration |
| Remove | Delete server from configuration |

Source: [frontend/src/components/features/McpServersConfig/McpServerCard.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/McpServersConfig/McpServerCard.tsx)

## Relevant Plans

### RelevantPlans

Component for displaying plans relevant to the current conversation context.

**Features:**
- Shows top 3 most relevant plans based on current query
- Plan attachment to query
- Play action to attach and run plan

Source: [frontend/src/components/views/chat/relevant_plans.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/relevant_plans.tsx)

## State Management

The frontend uses React Context for global state management:

```mermaid
graph TD
    A[AppContext] --> B[User State]
    A --> C[Session State]
    A --> D[Theme State]
    A --> E[Provider State]
    
    B --> F[userId]
    B --> G[username]
    
    C --> H[sessions]
    C --> I[currentRun]
    C --> J[plans]
    
    E --> K[mcpServers]
    E --> L[selectedMcpServers]
```

### Provider Hook

Custom hooks for accessing and manipulating application state:

| Hook | Purpose |
|------|---------|
| useAppContext | Access global app context |
| useSessions | Manage session list and operations |
| usePlans | Manage saved plans |
| useMcpServers | Manage MCP server configurations |

Source: [frontend/src/hooks/provider.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/hooks/provider.tsx)

## API Integration

The frontend communicates with the backend API at `http://localhost:8081/api`.

**Environment Configuration:**
```bash
# In .env.development
GATSBY_API_URL=http://localhost:8081/api
```

Source: [frontend/README.md](https://github.com/microsoft/magentic-ui/blob/main/frontend/README.md)

### Key API Operations

| Operation | Description |
|-----------|-------------|
| runTask | Start a new task execution |
| handleInputResponse | Respond to input requests |
| handlePause | Pause current execution |
| handleCancel | Cancel running task |
| handleApprove | Approve pending action |
| handleDeny | Deny pending action |
| handleAcceptPlan | Accept proposed plan |
| handleRegeneratePlan | Regenerate plan suggestions |

## Component Hierarchy Summary

```mermaid
graph TD
    Root[MagenticUILayout] --> SessionManager
    SessionManager --> Header
    SessionManager --> Sidebar
    SessionManager --> Content
    
    Sidebar --> PlanList
    Sidebar --> SessionList
    
    Content --> ChatView
    Content --> RunView
    
    ChatView --> ChatInput
    ChatView --> MessageList
    ChatView --> RelevantPlans
    
    MessageList --> RenderMessage
    RenderMessage --> PlanView
    RenderMessage --> RenderFile
    RenderMessage --> MarkdownRender
    
    ChatInput --> FileList
    ChatInput --> PlanModal
    
    RunView --> DetailViewer
    RunView --> ProgressBar
```

## Development Guidelines

### Adding New Routes

To add a new route (e.g., `/about`):
1. Create folder `src/pages/about`
2. Add `index.tsx` file
3. Follow content style from `src/pages/index.tsx`
4. Place core logic in `src/components` folder

### Key Dependencies

| Package | Version | Purpose |
|---------|---------|---------|
| react | ^18.x | UI framework |
| antd | ^5.x | Component library |
| @ant-design/icons | ^5.x | Icon library |
| react-markdown | ^9.x | Markdown rendering |
| remark-gfm | ^4.x | GitHub Flavored Markdown |

Source: [frontend/package.json](https://github.com/microsoft/magentic-ui/blob/main/frontend/package.json)

---

<a id='browser-automation'></a>

## Browser Automation

### Related Pages

Related topics: [Agent System](#agent-system), [Docker Containers](#docker-containers)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [src/magentic_ui/tools/playwright/browser/headless_docker_playwright_browser.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/tools/playwright/browser/headless_docker_playwright_browser.py)
- [src/magentic_ui/tools/playwright/browser/vnc_docker_playwright_browser.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/tools/playwright/browser/vnc_docker_playwright_browser.py)
- [src/magentic_ui/tools/playwright/browser/local_playwright_browser.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/tools/playwright/browser/local_playwright_browser.py)
- [src/magentic_ui/tools/playwright/playwright_controller.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/tools/playwright/playwright_controller.py)
- [src/magentic_ui/tools/playwright/playwright_state.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/tools/playwright/playwright_state.py)
- [src/magentic_ui/tools/playwright/_set_of_mark.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/tools/playwright/_set_of_mark.py)
- [src/magentic_ui/agents/web_surfer/_web_surfer.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/agents/web_surfer/_web_surfer.py)
</details>

# Browser Automation

## Overview

The Browser Automation system in Magentic-UI lets AI agents control web browsers programmatically. Built on top of Playwright, it enables agents to navigate websites, interact with UI elements, extract content, and carry out multi-step browsing tasks autonomously.

The system supports multiple browser deployment modes including local execution, headless Docker containers, and VNC-enabled Docker containers for visual debugging. This flexibility allows the system to operate in various environments from development machines to cloud deployments.

## Architecture Overview

```mermaid
graph TD
    A[WebSurfer Agent] --> B[PlaywrightController]
    B --> C[Browser Implementations]
    C --> D[LocalPlaywrightBrowser]
    C --> E[HeadlessDockerPlaywrightBrowser]
    C --> F[VncDockerPlaywrightBrowser]
    G[PlaywrightState] --> B
    H[SetOfMark] --> B
    I[Playwright API] --> C
```

## Browser Implementations

The system implements a base `PlaywrightBrowser` class with three specialized implementations:

### Base Architecture

All browser implementations inherit from `PlaywrightBrowser` which provides the core interface for browser operations. This design pattern allows for consistent API usage across different deployment scenarios.

### LocalPlaywrightBrowser

The `LocalPlaywrightBrowser` provides direct browser control on the local machine. This implementation offers:

- Full browser lifecycle management (launch, close, context management)
- Synchronous and asynchronous operation support
- Download folder configuration
- Viewport customization
- Screenshot capture capabilities

**Key Features:**
- Direct Playwright API access without containerization overhead
- Ideal for development and testing environments
- Supports all Playwright browser types (Chromium, Firefox, WebKit)

Source: [src/magentic_ui/tools/playwright/browser/local_playwright_browser.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/tools/playwright/browser/local_playwright_browser.py)

### HeadlessDockerPlaywrightBrowser

The `HeadlessDockerPlaywrightBrowser` runs browsers inside headless Docker containers. This approach provides:

- Isolated browser execution environment
- Consistent behavior across different host systems
- No visual rendering overhead
- Enhanced security through containerization

**Docker Integration:**
- Automatic container image pulling on first use
- Graceful container lifecycle management
- Resource-efficient headless operation

Source: [src/magentic_ui/tools/playwright/browser/headless_docker_playwright_browser.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/tools/playwright/browser/headless_docker_playwright_browser.py)

### VncDockerPlaywrightBrowser

The `VncDockerPlaywrightBrowser` extends headless Docker support with VNC connectivity, enabling:

- Real-time visual browser observation
- Interactive debugging capabilities
- NoVNC support for browser-based VNC access
- Remote control handover to human operators

**Port Configuration:**

| Parameter | Description | Default |
|-----------|-------------|---------|
| `port` | Main VNC port for container communication | 5900 |
| `novnc_port` | WebSocket port for noVNC browser access | 6080 |

Source: [src/magentic_ui/tools/playwright/browser/vnc_docker_playwright_browser.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/tools/playwright/browser/vnc_docker_playwright_browser.py)

## PlaywrightController

The `PlaywrightController` serves as the central orchestrator for browser interactions. It abstracts the complexities of browser automation into a clean, agent-friendly interface.

### Core Responsibilities

**Page Navigation and Content Extraction:**
- Visit URLs with configurable timeouts
- Extract visible text content from pages
- Capture full-page or viewport screenshots
- Analyze DOM structure for interactive elements

**User Interaction Simulation:**
- Mouse movements to specific coordinates
- Left-click and hover actions
- Text input via keyboard typing
- Keyboard shortcuts and key presses
- Scroll operations with configurable pixels

**Tab Management:**
- Create new browser tabs
- Switch between existing tabs
- Close tabs
- Refresh page content

### Action Schema

The controller defines a structured JSON schema for all available actions:

| Action | Parameters | Description |
|--------|------------|-------------|
| `visit_url` | `url` | Navigate to specified URL |
| `web_search` | `query` | Execute web search |
| `type` | `text`, `coordinate`, `press_enter`, `delete_existing_text` | Type text or interact with elements |
| `key` | `keys` | Press keyboard keys |
| `mouse_move` | `coordinate` | Move mouse cursor |
| `left_click` | `coordinate` | Click at coordinate |
| `hover` | `coordinate` | Hover over element |
| `scroll` | `pixels` | Scroll page (positive=up, negative=down) |
| `select_option` | `element`, `value` | Select dropdown option |
| `create_tab` | `url` | Open new tab |
| `switch_tab` | `tab_id` | Switch to specific tab |
| `refresh_page` | - | Reload current page |
| `history_back` | - | Navigate browser history back |
| `sleep` | `time` | Wait specified seconds |
| `stop_action` | - | Stop current action sequence |

Source: [src/magentic_ui/tools/playwright/playwright_controller.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/tools/playwright/playwright_controller.py)

## State Management

### PlaywrightState

The `PlaywrightState` module handles serialization and persistence of browser session state.

**State Components:**

| Component | Description |
|-----------|-------------|
| `BrowserState` | Complete snapshot of browser context |
| `save_browser_state()` | Serialize current state to storage |
| `load_browser_state()` | Restore state from storage |

**State Data Structure:**
- Current page URL and title
- Tab information and active tab ID
- Scroll position
- Cookie and local storage data
- Form input values
- Screenshot history

This enables:
- Session recovery after interruptions
- Parallel agent execution with shared state
- Checkpoint creation for long-running tasks

Source: [src/magentic_ui/tools/playwright/playwright_state.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/tools/playwright/playwright_state.py)
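
The save/restore idea can be sketched with plain dataclasses and JSON. This is an illustration only: `TabSnapshot`, `BrowserSnapshot`, `dump_snapshot`, and `load_snapshot` are hypothetical stand-ins, and their fields are assumptions rather than the actual layout of `BrowserState`.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class TabSnapshot:
    # Hypothetical per-tab record; field names are assumptions.
    url: str
    title: str
    scroll_y: int = 0


@dataclass
class BrowserSnapshot:
    # Hypothetical stand-in for the real BrowserState.
    tabs: List[TabSnapshot] = field(default_factory=list)
    active_tab: int = 0
    cookies: List[Dict[str, str]] = field(default_factory=list)


def dump_snapshot(state: BrowserSnapshot) -> str:
    """Serialize a snapshot to a JSON string."""
    return json.dumps(asdict(state))


def load_snapshot(raw: str) -> BrowserSnapshot:
    """Rebuild a snapshot from the JSON produced by dump_snapshot."""
    data = json.loads(raw)
    tabs = [TabSnapshot(**tab) for tab in data["tabs"]]
    return BrowserSnapshot(tabs=tabs, active_tab=data["active_tab"], cookies=data["cookies"])


original = BrowserSnapshot(tabs=[TabSnapshot("https://example.com", "Example", 120)])
restored = load_snapshot(dump_snapshot(original))
print(restored == original)
```

A round trip that reproduces an equal object is the property that session recovery and checkpointing depend on.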

## Interactive Element Marking

### Set of Mark (_set_of_mark)

The `_set_of_mark` module enhances web pages with visual markers that identify interactive elements. This is crucial for LLM-based agents to accurately identify and target UI elements.

**Marking Strategy:**
- Assigns unique numeric identifiers to interactive elements
- Overlays clickable numbers on buttons, links, inputs
- Uses sequential numbering for easy reference
- Provides coordinate mappings for action targeting

**Benefits:**
- Enables precise element targeting by AI agents
- Reduces ambiguity in element selection
- Supports visual debugging and verification
- Works across different page layouts and frameworks

Source: [src/magentic_ui/tools/playwright/_set_of_mark.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/tools/playwright/_set_of_mark.py)
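
The marking strategy can be sketched as follows. This is a simplified model, not the module's implementation: the `Element` record, the set of tags treated as interactive, and the `assign_marks` helper are all assumptions, and the real module works on screenshots and page data rather than a hand-built list.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Element:
    # Hypothetical description of a page element and its bounding box in pixels.
    tag: str
    x: float
    y: float
    width: float
    height: float


# Assumed set of tags that count as interactive.
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


def assign_marks(elements: List[Element]) -> Dict[int, Tuple[float, float]]:
    """Number interactive elements sequentially and map each ID to its center point."""
    marks: Dict[int, Tuple[float, float]] = {}
    next_id = 1
    for el in elements:
        if el.tag not in INTERACTIVE_TAGS:
            continue  # Non-interactive elements get no marker.
        marks[next_id] = (el.x + el.width / 2, el.y + el.height / 2)
        next_id += 1
    return marks


page = [
    Element("div", 0, 0, 800, 600),
    Element("button", 10, 20, 100, 40),
    Element("a", 200, 300, 50, 10),
]
print(assign_marks(page))
```

An agent that answers "click 2" can then be resolved to the coordinate stored under ID 2, which is the link's center in this example.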

## WebSurfer Agent

The `WebSurfer` agent is the primary consumer of the browser automation system. It combines the browser implementations with an LLM to make intelligent browsing decisions.

### Agent Capabilities

**Autonomous Navigation:**
- Follow links and navigate between pages
- Complete multi-step web forms
- Search the web and process results
- Extract structured information from pages

**Content Processing:**
- Optical Character Recognition (OCR) for image content
- Visual question answering on screenshots
- Markdown rendering of page content
- File download handling

**Interaction Modes:**
- Automatic execution with configurable action limits
- Step-by-step mode with human approval
- Control handover for human intervention
- Pause and resume capabilities

### Configuration Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `start_page` | str | Google | Initial page on browser launch |
| `animate_actions` | bool | False | Enable action animation |
| `save_screenshots` | bool | False | Persist screenshots to disk |
| `max_actions_per_step` | int | 5 | Maximum actions per reasoning step |
| `resize_viewport` | bool | True | Auto-resize viewport |
| `url_statuses` | dict | None | URL allow/reject rules |
| `single_tab_mode` | bool | False | Restrict to single tab |

Source: [src/magentic_ui/agents/web_surfer/_web_surfer.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/agents/web_surfer/_web_surfer.py)
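
To show what a limit such as `max_actions_per_step` implies, the sketch below caps the number of actions taken in one reasoning step. It is an assumption about the general pattern, not the agent's actual loop: `run_step` is a hypothetical helper, it only collects actions instead of executing them, and treating `stop_action` as an early exit is also assumed.

```python
from typing import Any, Dict, Iterable, List


def run_step(
    proposed: Iterable[Dict[str, Any]],
    max_actions_per_step: int = 5,
) -> List[Dict[str, Any]]:
    """Collect the actions that would run in one reasoning step.

    Stops after `max_actions_per_step` actions or when a `stop_action`
    is proposed, whichever comes first. Nothing is executed here.
    """
    accepted: List[Dict[str, Any]] = []
    for action in proposed:
        if len(accepted) >= max_actions_per_step:
            break
        if action.get("action") == "stop_action":
            break
        accepted.append(action)
    return accepted


scrolls = [{"action": "scroll", "pixels": -300}] * 8
print(len(run_step(scrolls)))
```

Eight proposed scrolls are cut down to five, the default shown in the table above.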

## Usage Patterns

### Local Browser Usage

```python
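import asyncio

from magentic_ui.tools.playwright.browser import LocalPlaywrightBrowser

async def main() -> None:
    # Launch a visible (non-headless) browser on the local machine.
    browser = LocalPlaywrightBrowser(
        headless=False,
        downloads_folder="./downloads"
    )
    await browser.start()

asyncio.run(main())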
```

### Docker-based Browser with VNC

```python
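import asyncio

from magentic_ui.tools.playwright.browser import VncDockerPlaywrightBrowser

async def main() -> None:
    # Override the default noVNC port (6080) with 8080.
    browser = VncDockerPlaywrightBrowser(
        port=5900,
        novnc_port=8080
    )
    await browser.start()
    # Access via browser at http://localhost:8080/vnc.html

asyncio.run(main())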
```

### Controller Integration

```python
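from magentic_ui.tools.playwright.playwright_controller import PlaywrightController

async def visit_example(browser):
    # `browser` is an already-started browser instance, such as one
    # created in the Local Browser Usage example.
    controller = PlaywrightController(browser)
    await controller.async_setup()

    # Execute actions
    result = await controller(
        {"action": "visit_url", "url": "https://example.com"}
    )
    return result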
```

## Workflow Diagram

```mermaid
sequenceDiagram
    participant Agent
    participant Controller
    participant Browser
    participant Page
    
    Agent->>Controller: Execute action
    Controller->>Controller: Validate parameters
    Controller->>Browser: Get page instance
    Browser->>Page: Perform action
    Page-->>Browser: Action result
    Browser-->>Controller: Browser response
    Controller->>Controller: Process result
    Controller->>Agent: Action result
    
    Note over Agent,Page: Repeat until task complete
```

## Frontend Integration

The browser automation system integrates with the Magentic-UI frontend through:

**DetailViewer Component:**
- Real-time browser view in the UI
- Screenshot gallery display
- Live action feed
- Control mode overlay for human takeover

**Modal Components:**
- BrowserModal for full-screen viewing
- VNC connection handling
- Pause and resume controls

Source: [frontend/src/components/views/chat/detail_viewer.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/views/chat/detail_viewer.tsx)

## Security Considerations

**URL Filtering:**
- `UrlStatusManager` validates navigation targets
- Configurable allow/reject lists
- Prevents navigation to untrusted domains
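
A check of this kind can be sketched with the standard library. The function below is not `UrlStatusManager`. It is a hypothetical `is_url_allowed` helper that assumes rules are keyed by bare domain with the values `"allowed"` or `"rejected"`, that a rule also covers subdomains, that a rejection wins over an allowance, and that `None` means no restrictions.

```python
from typing import Dict, Optional
from urllib.parse import urlparse


def is_url_allowed(url: str, url_statuses: Optional[Dict[str, str]]) -> bool:
    """Check a URL's host against an allow/reject map (assumed semantics)."""
    if url_statuses is None:
        return True  # No rules configured: everything is permitted.
    host = (urlparse(url).hostname or "").lower()
    # Collect the status of every rule whose domain matches the host.
    matches = [
        status
        for domain, status in url_statuses.items()
        if host == domain.lower() or host.endswith("." + domain.lower())
    ]
    if "rejected" in matches:
        return False
    return "allowed" in matches


rules = {"example.com": "allowed", "ads.example.com": "rejected"}
print(is_url_allowed("https://docs.example.com/page", rules))
print(is_url_allowed("https://ads.example.com/banner", rules))
print(is_url_allowed("https://other.org", rules))
```

Under these assumptions the documentation subdomain is allowed, the ad subdomain is rejected by its more specific rule, and an unlisted domain is denied by default.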

**Sandbox Isolation:**
- Docker containers provide process isolation
- Restricted network access when needed
- Resource limits prevent runaway processes

**Approval Workflows:**
- Human approval for sensitive actions
- Configurable approval thresholds
- Audit logging of all actions

## Conclusion

The Browser Automation system is the foundation for web interaction in Magentic-UI. It combines Playwright's browser control with a controller abstraction and three deployment options (local, headless Docker, and VNC-enabled Docker), and it relies on URL filtering, container isolation, and approval workflows so that AI agents can carry out multi-step web tasks under human oversight.

---

<a id='docker-containers'></a>

## Docker Containers

### Related Pages

Related topics: [Getting Started with Magentic-UI](#getting-started), [Browser Automation](#browser-automation)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [docker/magentic-ui-browser-docker/Dockerfile](https://github.com/microsoft/magentic-ui/blob/main/docker/magentic-ui-browser-docker/Dockerfile)
- [docker/magentic-ui-browser-docker/supervisord.conf](https://github.com/microsoft/magentic-ui/blob/main/docker/magentic-ui-browser-docker/supervisord.conf)
- [docker/magentic-ui-browser-docker/playwright-server.js](https://github.com/microsoft/magentic-ui/blob/main/docker/magentic-ui-browser-docker/playwright-server.js)
- [docker/magentic-ui-python-env/Dockerfile](https://github.com/microsoft/magentic-ui/blob/main/docker/magentic-ui-python-env/Dockerfile)
- [docker/build-all.sh](https://github.com/microsoft/magentic-ui/blob/main/docker/build-all.sh)
- [src/magentic_ui/_docker.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/_docker.py)
</details>

# Docker Containers

Magentic-UI leverages Docker containers to provide isolated, reproducible environments for running browser automation and code execution tasks. This architecture enables the application to execute complex multi-agent workflows while maintaining system-level isolation and consistent runtime dependencies.

## Architecture Overview

Magentic-UI uses two Docker images, one for the browser and one for Python code execution. The backend talks to both:

```mermaid
graph TB
    subgraph "Magentic-UI Architecture"
        A["Frontend UI<br/>(localhost:8081)"] --> B["Backend API<br/>(Python/FastAPI)"]
        B --> C["Browser Container<br/>(VNC + Playwright)"]
        B --> D["Python Environment Container<br/>(Code Execution)"]
    end
    
    subgraph "Container Communication"
        C <-->|"WebSocket/REST"| B
        D <-->|"STDIO/REST"| B
    end
```

## Docker Image Types

| Image Type | Purpose | Key Components |
|------------|---------|----------------|
| `magentic-ui-browser` | Browser automation and web interaction | VNC Server, noVNC, Playwright, Chromium |
| `magentic-ui-python-env` | Safe Python code execution | Python runtime, isolated environment |

Source: [docker/build-all.sh:1-20](https://github.com/microsoft/magentic-ui/blob/main/docker/build-all.sh)

### Browser Docker Container

The browser container provides a full graphical environment for web surfing agents. It includes:

- **VNC Server**: Provides virtual display access
- **noVNC**: Web-based VNC client for browser access
- **Playwright**: Browser automation framework for programmatic control
- **Chromium**: Headless-capable web browser

Source: [docker/magentic-ui-browser-docker/Dockerfile](https://github.com/microsoft/magentic-ui/blob/main/docker/magentic-ui-browser-docker/Dockerfile)

### Python Environment Docker Container

The Python environment container provides a sandboxed environment for executing user-generated Python code safely:

- Isolated Python runtime
- Restricted file system access
- Controlled network access
- Independent package management

Source: [docker/magentic-ui-python-env/Dockerfile](https://github.com/microsoft/magentic-ui/blob/main/docker/magentic-ui-python-env/Dockerfile)

## Docker Initialization Workflow

```mermaid
sequenceDiagram
    participant User
    participant CLI
    participant Docker Daemon
    participant Registry
    
    User->>CLI: magentic-ui --port 8081
    CLI->>Docker Daemon: Check if Docker is running
    Docker Daemon-->>CLI: Docker Status
    
    alt Docker not running
        CLI->>User: Error: Please start Docker
    else Docker running
        CLI->>Docker Daemon: Check browser image exists
        Docker Daemon-->>CLI: Image Status
        
        alt Image missing
            CLI->>Registry: Pull browser image
            Registry-->>Docker Daemon: Image layers
            Docker Daemon->>Docker Daemon: Store pulled image
        end
        
        CLI->>Docker Daemon: Check Python image exists
        Docker Daemon-->>CLI: Image Status
        
        alt Image missing
            CLI->>Registry: Pull Python image
            Registry-->>Docker Daemon: Image layers
            Docker Daemon->>Docker Daemon: Store pulled image
        end
        
        CLI->>User: Magentic-UI ready
    end
```

Source: [src/magentic_ui/_docker.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/_docker.py)

## Container Management Functions

The `src/magentic_ui/_docker.py` module provides core Docker management functionality:

| Function | Purpose |
|----------|---------|
| `check_docker_running()` | Verifies Docker daemon is accessible |
| `check_browser_image()` | Checks if browser Docker image exists locally |
| `check_python_image()` | Checks if Python environment Docker image exists locally |
| `pull_browser_image()` | Pulls/updates the browser Docker image |
| `pull_python_image()` | Pulls/updates the Python environment Docker image |

Source: [src/magentic_ui/_docker.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/_docker.py)
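The startup checks described above can be sketched with the Docker CLI. This is a minimal illustration, not the code in `_docker.py`: the helper names (`run_quiet`, `docker_is_running`, `image_exists`, `missing_images`) are hypothetical, and the real module may use a Docker SDK rather than shelling out. It relies only on `docker info` and `docker image inspect`, both of which exit non-zero on failure.

```python
import subprocess
from typing import Callable, Iterable, List, Sequence

# A runner takes a command and returns its exit code; injectable for testing.
Runner = Callable[[Sequence[str]], int]


def run_quiet(cmd: Sequence[str]) -> int:
    """Run a command, discard its output, and return the exit code."""
    try:
        return subprocess.run(
            list(cmd),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        ).returncode
    except FileNotFoundError:
        return 127  # the docker CLI is not installed or not on PATH


def docker_is_running(run: Runner = run_quiet) -> bool:
    # `docker info` exits non-zero when the daemon cannot be reached.
    return run(["docker", "info"]) == 0


def image_exists(image: str, run: Runner = run_quiet) -> bool:
    # `docker image inspect` exits non-zero when the image is absent locally.
    return run(["docker", "image", "inspect", image]) == 0


def missing_images(images: Iterable[str], run: Runner = run_quiet) -> List[str]:
    """Return the images that still need to be pulled or built."""
    if not docker_is_running(run):
        raise RuntimeError("Docker is not running; please start it first.")
    return [name for name in images if not image_exists(name, run)]
```

Calling `missing_images(["magentic-ui-browser", "magentic-ui-python-env"])` on a machine with Docker running returns the subset of images that are not yet available locally. Passing the runner as a parameter keeps the logic testable without a Docker daemon.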

## Build Process

The build script `docker/build-all.sh` constructs both Docker images:

```bash
# Build browser Docker image
docker build -t magentic-ui-browser ./magentic-ui-browser-docker

# Build Python environment Docker image
docker build -t magentic-ui-python-env ./magentic-ui-python-env
```

Source: [docker/build-all.sh](https://github.com/microsoft/magentic-ui/blob/main/docker/build-all.sh)

## Browser Container Services

The browser container runs multiple services managed by supervisord:

```mermaid
graph LR
    subgraph "Browser Container Services"
        A["supervisord<br/>(Process Manager)"]
        A --> B["Xvfb<br/>(Virtual Display)"]
        A --> C["x11vnc<br/>(VNC Server)"]
        A --> D["noVNC<br/>(Web VNC)"]
        A --> E["Playwright<br/>(Browser Control)"]
    end
```

### Service Configuration

The browser container uses `supervisord.conf` for service orchestration:

- **Process Management**: Supervisord manages all background services
- **Auto-restart**: Services automatically restart on failure
- **Log Management**: Centralized logging configuration

Source: [docker/magentic-ui-browser-docker/supervisord.conf](https://github.com/microsoft/magentic-ui/blob/main/docker/magentic-ui-browser-docker/supervisord.conf)

### Playwright Server

The Playwright server (`playwright-server.js`) provides HTTP API access to browser automation:

```javascript
// Server initialization with browser configuration
// Handles browser launching, page creation, and element interaction
```

Source: [docker/magentic-ui-browser-docker/playwright-server.js](https://github.com/microsoft/magentic-ui/blob/main/docker/magentic-ui-browser-docker/playwright-server.js)

## Running Without Docker

For environments where Docker is unavailable, Magentic-UI supports a limited mode:

```bash
magentic-ui --run-without-docker --port 8081
```

**Limitations in No-Docker Mode**:

| Feature | With Docker | Without Docker |
|---------|-------------|----------------|
| Web Surfing | Full browser automation | Not available |
| Code Execution | Isolated sandbox | Not available |
| File Handling | Enhanced isolation | Basic support |
| Agent Capabilities | Complete | Reduced |

Source: [README.md](https://github.com/microsoft/magentic-ui/blob/main/README.md)

## Prerequisites

### System Requirements

| Requirement | Minimum | Recommended |
|-------------|---------|-------------|
| Docker Version | Latest stable | Latest stable |
| Python | 3.10+ | 3.11+ |
| RAM | 4GB | 8GB+ |
| Disk Space | 2GB | 5GB+ |

### Platform Support

- **Linux**: Full support with native Docker
- **macOS**: Full support with Docker Desktop
- **Windows**: WSL2 required for Docker support

Source: [TROUBLESHOOTING.md](https://github.com/microsoft/magentic-ui/blob/main/TROUBLESHOOTING.md)

## Troubleshooting

### Common Docker Issues

| Issue | Symptom | Solution |
|-------|---------|----------|
| Docker not running | "Docker is not running" error | Start Docker Desktop/daemon |
| Image pull failure | Timeout during first run | Run `docker/build-all.sh` manually |
| Port conflict | Container fails to start | Change port with `--port` flag |

### Verification Commands

```bash
# Check Docker is running
docker info

# Verify images exist
docker images | grep magentic-ui

# Manually build images
cd docker && sh build-all.sh
```

Source: [TROUBLESHOOTING.md](https://github.com/microsoft/magentic-ui/blob/main/TROUBLESHOOTING.md)

## Configuration

### Environment Variables

| Variable | Purpose | Default |
|----------|---------|---------|
| `NOVNC_PORT` | noVNC web interface port | 6080 |
| `PLAYWRIGHT_PORT` | Playwright API port | 8080 |
| `PYTHON_ENV_PORT` | Python execution port | 8082 |
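As a rough illustration of how these variables and their defaults fit together, the following sketch resolves the three ports from an environment mapping. The function name `resolve_ports` and the collision check are assumptions made for this example; how Magentic-UI itself reads these variables is not shown in the sources above.

```python
import os
from typing import Dict, Mapping

# Defaults copied from the environment variable table.
DEFAULT_PORTS: Dict[str, int] = {
    "NOVNC_PORT": 6080,
    "PLAYWRIGHT_PORT": 8080,
    "PYTHON_ENV_PORT": 8082,
}


def resolve_ports(env: Mapping[str, str] = os.environ) -> Dict[str, int]:
    """Return the effective port for each variable, falling back to defaults."""
    ports: Dict[str, int] = {}
    for name, default in DEFAULT_PORTS.items():
        raw = env.get(name, "").strip()
        port = int(raw) if raw else default
        if not 1 <= port <= 65535:
            raise ValueError(f"{name}={port} is outside the valid port range")
        ports[name] = port
    # Two services cannot listen on the same port.
    if len(set(ports.values())) != len(ports):
        raise ValueError(f"port collision in {ports}")
    return ports
```

Validating the range and rejecting collisions up front turns a confusing "container fails to start" symptom into an explicit error message.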

### Workspace Configuration

The CLI manages workspace paths passed to containers:

```python
workspace_config = {
    "internal_workspace_root": "/path/to/internal",
    "external_workspace_root": "/path/to/external",
    "inside_docker": True,
    "config": {...},
    "run_without_docker": False
}
```

Source: [src/magentic_ui/backend/cli.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/backend/cli.py)

## Security Considerations

### Container Isolation

- **Network Isolation**: Containers communicate via internal bridge network
- **File System Isolation**: Read-only base images with volume mounts for data
- **Process Isolation**: Separate PID namespaces

### Best Practices

1. Run Docker as a non-root user whenever possible
2. Keep Docker images updated with latest security patches
3. Use the provided workspace paths for file operations
4. Monitor container resource usage

---

<a id='configuration'></a>

## Configuration

### Related Pages

Related topics: [Getting Started with Magentic-UI](#getting-started)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this documentation:

- [src/magentic_ui/_cli.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/_cli.py)
- [src/magentic_ui/backend/cli.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/backend/cli.py)
- [src/magentic_ui/backend/web/config.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/backend/web/config.py)
- [src/magentic_ui/backend/teammanager/teammanager.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/backend/teammanager/teammanager.py)
- [fara_config.yaml](https://github.com/microsoft/magentic-ui/blob/main/fara_config.yaml)
- [frontend/README.md](https://github.com/microsoft/magentic-ui/blob/main/frontend/README.md)
- [src/magentic_ui/agents/web_surfer/fara/_prompts.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/agents/web_surfer/fara/_prompts.py)
- [frontend/src/components/features/McpServersConfig/McpConfigModal.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/McpServersConfig/McpConfigModal.tsx)
</details>

# Configuration

Magentic-UI provides a multi-layered configuration system that spans both the backend (Python) and frontend (React/TypeScript) layers. The system handles environment-based settings, agent parameters, UI theming, and server configurations through YAML files, environment variables, and component-level props.

## Overview

The configuration architecture in Magentic-UI can be visualized as follows:

```mermaid
graph TD
    A[Configuration Sources] --> B[Backend CLI]
    A --> C[Environment Variables]
    A --> D[YAML Config Files]
    A --> E[Frontend React Components]
    
    B --> F[Server Initialization]
    C --> G[API URL Configuration]
    D --> H[Agent Parameters]
    E --> I[UI Theme & Modal Settings]
    
    F --> J[Backend Server Running on Port 8081]
    G --> K[Frontend Dev Server]
    H --> L[Web Surfer Agent]
    I --> M[User Interface]
```

## Backend Configuration

### CLI Entry Point

The main CLI entry point in `src/magentic_ui/_cli.py` serves as the primary configuration bootstrap for the backend server. It handles argument parsing and delegates to the backend CLI module.

**Key configuration parameters supported:**

| Parameter | Type | Description |
|-----------|------|-------------|
| `--host` | string | Server host address |
| `--port` | integer | Server port number (default: 8081) |
| `--config` | string | Path to YAML configuration file |
| `--debug` | boolean | Enable debug mode |

Source: [src/magentic_ui/_cli.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/_cli.py)
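The parameter table can be mirrored with a small `argparse` parser. This is only a sketch of the documented flags: the real CLI in `_cli.py` may be built with a different library, and the `--host` default used here is an assumption because the table does not state one.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="magentic-ui")
    # Assumed default; the parameter table does not list one for --host.
    parser.add_argument("--host", default="127.0.0.1", help="Server host address")
    parser.add_argument("--port", type=int, default=8081, help="Server port number")
    parser.add_argument("--config", default=None, help="Path to YAML configuration file")
    parser.add_argument("--debug", action="store_true", help="Enable debug mode")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the given argument list, or sys.argv[1:] when argv is None."""
    return build_parser().parse_args(argv)
```

For example, `parse_args(["--port", "9000", "--debug"])` yields a namespace with `port == 9000` and `debug == True`, while `parse_args([])` keeps the documented default port of 8081.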

### Web Server Configuration

The web server configuration module (`src/magentic_ui/backend/web/config.py`) defines the core server settings used by the FastAPI-based backend.

```python
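from dataclasses import dataclass, field


@dataclass
class ServerConfig:
    host: str = "0.0.0.0"  # bind on all interfaces
    port: int = 8081
    # default_factory avoids sharing one mutable list across instances
    cors_origins: list[str] = field(default_factory=lambda: ["http://localhost:8000"])
    debug: bool = False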
```

**Configuration Options:**

| Option | Default | Description |
|--------|---------|-------------|
| `host` | `"0.0.0.0"` | Bind address for the server |
| `port` | `8081` | HTTP port for the backend API |
| `cors_origins` | `["http://localhost:8000"]` | Allowed CORS origins |
| `debug` | `false` | Enable verbose logging and hot reload |

Source: [src/magentic_ui/backend/web/config.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/backend/web/config.py)

### Team Manager Configuration

The `teammanager.py` module handles multi-agent orchestration configuration. It manages agent teams and their coordination settings.

**Key configuration aspects:**

- Agent pool sizing
- Maximum concurrent agents
- Communication protocols between agents
- Timeout settings for agent operations

Source: [src/magentic_ui/backend/teammanager/teammanager.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/backend/teammanager/teammanager.py)

## YAML Configuration Files

### fara_config.yaml

The `fara_config.yaml` file contains configuration for the web surfer agent, including display settings and browser automation parameters.

```yaml
display_width_px: 1280
display_height_px: 720
include_input_text_key_args: false
```

**Web Surfer Agent Parameters:**

| Parameter | Type | Description |
|-----------|------|-------------|
| `display_width_px` | integer | Browser viewport width in pixels |
| `display_height_px` | integer | Browser viewport height in pixels |
| `include_input_text_key_args` | boolean | Keep the `press_enter` and `delete_existing_text` properties in the agent's `parameters` (they are dropped when `false`) |

Source: [fara_config.yaml](https://github.com/microsoft/magentic-ui/blob/main/fara_config.yaml)

### Agent Parameter Handling

The `_prompts.py` module in the web surfer agent demonstrates how configuration is consumed:

```python
def __init__(self, cfg=None):
    self.display_width_px = cfg["display_width_px"]
    self.display_height_px = cfg["display_height_px"]
    include_input_text_key_args = cfg.pop("include_input_text_key_args", False)
    if not include_input_text_key_args:
        self.parameters["properties"].pop("press_enter", None)
        self.parameters["properties"].pop("delete_existing_text", None)
    super().__init__(cfg)
```

Source: [src/magentic_ui/agents/web_surfer/fara/_prompts.py](https://github.com/microsoft/magentic-ui/blob/main/src/magentic_ui/agents/web_surfer/fara/_prompts.py)
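To make the effect of `include_input_text_key_args` concrete, here is a standalone sketch of the same pruning step. `BASE_PARAMETERS` is a hypothetical stand-in for the real parameter schema, and `build_parameters` is a name invented for this example; only the two `pop` calls mirror the excerpt above.

```python
from copy import deepcopy
from typing import Any, Dict, Mapping

# Hypothetical stand-in for the agent's parameter schema.
BASE_PARAMETERS: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "text": {"type": "string"},
        "press_enter": {"type": "boolean"},
        "delete_existing_text": {"type": "boolean"},
    },
}


def build_parameters(cfg: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a schema copy, pruned according to include_input_text_key_args."""
    cfg = dict(cfg)  # copy so the caller's mapping is not mutated by pop()
    parameters = deepcopy(BASE_PARAMETERS)
    include = cfg.pop("include_input_text_key_args", False)
    if not include:
        # Same two removals as in the excerpt above.
        parameters["properties"].pop("press_enter", None)
        parameters["properties"].pop("delete_existing_text", None)
    return parameters
```

With the default configuration (`include_input_text_key_args: false`), only the `text` property remains in this illustrative schema; setting the flag to `true` keeps all three.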

## Frontend Configuration

### Environment Variables

The frontend reads its environment variables from `.env` files. In development, these must point the frontend at the backend API.

**Setup Instructions:**

1. Copy `.env.default` to `.env.development`
2. Set the required variables in the new file

| Variable | Required Value | Description |
|----------|---------------|-------------|
| `GATSBY_API_URL` | `http://localhost:8081/api` | Backend API endpoint |
| `GATSBY_WS_URL` | `ws://localhost:8081/ws` | WebSocket endpoint (if applicable) |

Source: [frontend/README.md](https://github.com/microsoft/magentic-ui/blob/main/frontend/README.md)
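A quick way to catch a mismatched API URL is to check the `.env.development` contents before starting the dev server. The sketch below is an illustrative helper, not part of the project: `parse_env` and `check_frontend_env` are hypothetical names, and treating only `GATSBY_API_URL` as required is an assumption based on the table above.

```python
from typing import Dict, List
from urllib.parse import urlparse

REQUIRED = ("GATSBY_API_URL",)  # assumed to be the only mandatory variable


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_frontend_env(text: str, backend_port: int = 8081) -> List[str]:
    """Return a list of human-readable problems; empty means the file looks fine."""
    env = parse_env(text)
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]
    api_url = env.get("GATSBY_API_URL")
    if api_url and urlparse(api_url).port != backend_port:
        problems.append(
            f"GATSBY_API_URL points at {api_url}, expected port {backend_port}"
        )
    return problems
```

Reading the file with `open(".env.development").read()` and passing the text to `check_frontend_env` returns an empty list when the URL targets the backend port.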

### API Configuration Flow

```mermaid
sequenceDiagram
    participant FE as Frontend (React)
    participant API as Backend API
    participant WS as WebSocket
    
    FE->>FE: Load .env.development
    FE->>API: HTTP requests to GATSBY_API_URL
    FE->>WS: WebSocket connections
    API-->>FE: JSON responses
    WS-->>FE: Real-time updates
```

### MCP Server Configuration

The MCP (Model Context Protocol) server configuration modal provides a UI for managing external server integrations.

**Supported Connection Types:**

| Type | Description | Configuration Method |
|------|-------------|----------------------|
| `SSE` | Server-Sent Events | Form-based input |
| `Stdio` | Standard I/O | Form-based input |
| `JSON` | Raw JSON Config | Direct JSON editing |

**Server Configuration Validation:**

| Field | Validation Rule |
|-------|----------------|
| `serverName` | Required, alphanumeric characters only, max 50 characters |
| `serverName` | Must be unique across all servers |

```typescript
// Example validation logic from McpConfigModal.tsx
const serverNameError = !serverName || !/^[a-zA-Z0-9]+$/.test(serverName);
const serverNameDuplicateError = existingServers.some(
  (s) => s.name === serverName && s.id !== server?.id
);
```

Source: [frontend/src/components/features/McpServersConfig/McpConfigModal.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/features/McpServersConfig/McpConfigModal.tsx)
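The same rules can be checked outside the UI, for example when preparing server definitions in a script. The following is a Python port of the validation table and the TypeScript excerpt, written as a sketch: the function name, the `(id, name)` tuple representation, and the error messages are assumptions, and the 50-character limit is taken from the table rather than from the excerpt.

```python
import re
from typing import Iterable, List, Optional, Tuple

_ALPHANUMERIC = re.compile(r"[a-zA-Z0-9]+")
MAX_NAME_LENGTH = 50  # limit stated in the validation table


def validate_server_name(
    name: str,
    existing: Iterable[Tuple[str, str]],  # (server id, server name) pairs
    current_id: Optional[str] = None,     # id of the server being edited, if any
) -> List[str]:
    """Return validation errors for a server name; an empty list means valid."""
    errors: List[str] = []
    if not name or not _ALPHANUMERIC.fullmatch(name):
        errors.append("Server name is required and must be alphanumeric")
    if len(name) > MAX_NAME_LENGTH:
        errors.append(f"Server name must be at most {MAX_NAME_LENGTH} characters")
    # A server may keep its own name while being edited.
    if any(
        other_name == name and other_id != current_id
        for other_id, other_name in existing
    ):
        errors.append("Server name must be unique")
    return errors
```

Excluding `current_id` from the duplicate check mirrors the `s.id !== server?.id` condition, so editing an existing server does not flag its own name as a duplicate.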

## UI Theme Configuration

### Theme Application

The main layout component applies theme settings based on user preferences and system defaults:

```typescript
<ConfigProvider
  theme={{
    algorithm: darkMode === "dark" 
      ? theme.darkAlgorithm 
      : theme.defaultAlgorithm,
  }}
>
  {/* assumed: the page content is rendered here */}
</ConfigProvider>
```

**Theme Options:**

| Mode | Algorithm | CSS Classes |
|------|-----------|-------------|
| Light | `defaultAlgorithm` | `bg-white`, `text-gray-900` |
| Dark | `darkAlgorithm` | `bg-gray-900`, `text-gray-100` |

Source: [frontend/src/components/layout.tsx](https://github.com/microsoft/magentic-ui/blob/main/frontend/src/components/layout.tsx)

### Component-Level Styling

Components are styled with Tailwind CSS utility classes, which cover:

- Color schemes (`bg-magenta-800`, `text-blue-400`)
- Spacing (`p-3`, `mt-4`, `mb-2`)
- Typography (`text-sm`, `font-medium`)
- Transitions (`transition-colors`, `transition-all duration-300`)

## Configuration Workflow

```mermaid
graph LR
    A[Start Application] --> B{Backend or Frontend?}
    
    B -->|Backend| C[Load CLI Args]
    B -->|Backend| D[Parse YAML Config]
    B -->|Backend| E[Initialize Server]
    
    B -->|Frontend| F[Load .env.development]
    B -->|Frontend| G[Build API URL]
    B -->|Frontend| H[Render UI Components]
    
    C --> E
    D --> E
    F --> G
    G --> H
    E --> I[Server Ready]
    H --> J[User Interface Ready]
```

## Configuration Files Summary

| File Path | Purpose | Format |
|-----------|---------|--------|
| `src/magentic_ui/_cli.py` | Main CLI entry point | Python |
| `src/magentic_ui/backend/cli.py` | Backend CLI logic | Python |
| `src/magentic_ui/backend/web/config.py` | Web server settings | Python (dataclass) |
| `src/magentic_ui/backend/teammanager/teammanager.py` | Agent orchestration | Python |
| `fara_config.yaml` | Web surfer agent settings | YAML |
| `.env.development` | Frontend environment | Environment Variables |

## Best Practices

1. **Environment Isolation**: Keep development and production environment files separate
2. **Validation**: Always validate MCP server names against alphanumeric patterns
3. **CORS Settings**: Ensure backend CORS configuration matches frontend origin
4. **Port Consistency**: The frontend expects the backend at `http://localhost:8081/api`
5. **Theme Persistence**: Store user theme preferences in local storage or in the user profile

---

## Doramagic Pitfall Log

Project: microsoft/magentic-ui

Summary: 13 potential pitfalls were found, 0 of them high/blocking. Highest priority: installation pitfall, source evidence: Create tutorials and documentation for the codebase.

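## 1. Installation pitfall · Source evidence: Create tutorials and documentation for the codebase

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation-related issue in this project: Create tutorials and documentation for the codebase
- User impact: May raise the cost of trials for new users and of production adoption.
- Suggested check: The source indicates that a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_c0979f7ebb064422a6a8095561f6a9bd | https://github.com/microsoft/magentic-ui/issues/154 | Unverified usage condition surfaced by source type github_issue.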

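## 2. Installation pitfall · Source evidence: Support Podman in place of Docker

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation-related issue in this project: Support Podman in place of Docker
- User impact: May raise the cost of trials for new users and of production adoption.
- Suggested check: The source indicates that a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_f88231cc4cad442ca53d92ea3a40a655 | https://github.com/microsoft/magentic-ui/issues/312 | The source discussion mentions docker-related conditions that should be re-checked before installation or trial.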

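## 3. Installation pitfall · Source evidence: magentic-ui can't display all the html element

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation-related issue in this project: magentic-ui can't display all the html element
- User impact: May raise the cost of trials for new users and of production adoption.
- Suggested check: The source indicates that a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_21f19953edd74379ab2d25cedc37ca1b | https://github.com/microsoft/magentic-ui/issues/362 | The source discussion mentions docker-related conditions that should be re-checked before installation or trial.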

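## 4. Configuration pitfall · Source evidence: Refreshing or restart the web app will make the current Session unavailable

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified configuration-related issue in this project: Refreshing or restart the web app will make the current Session unavailable
- User impact: May raise the cost of trials for new users and of production adoption.
- Suggested check: The source indicates that a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_83a9cafd59254028853ef84cd1ccc756 | https://github.com/microsoft/magentic-ui/issues/336 | The source discussion mentions node-related conditions that should be re-checked before installation or trial.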

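## 5. Capability pitfall · Capability assessment relies on an assumption

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: If the assumption does not hold, users will not get the promised capability.
- Suggested check: Turn the assumption into a downstream verification checklist.
- Safeguard: Assumptions must become verification items; they must not be written as facts before verification results exist.
- Evidence: capability.assumptions | github_repo:978331188 | https://github.com/microsoft/magentic-ui | README/documentation is current enough for a first validation pass.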

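## 6. Runtime pitfall · Source evidence: Why not conduct a requirement analysis before the plan?

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified runtime-related issue in this project: Why not conduct a requirement analysis before the plan?
- User impact: May raise the cost of trials for new users and of production adoption.
- Suggested check: The source indicates that a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_6003a9c2194f40c0865145385cf98c32 | https://github.com/microsoft/magentic-ui/issues/321 | Unverified usage condition surfaced by source type github_issue.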

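## 7. Maintenance pitfall · Source evidence: Sticked at click the “Shopping Cart” icon and cannot goto check out page

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified maintenance/version-related issue in this project: Sticked at click the “Shopping Cart” icon and cannot goto check out page
- User impact: May raise the cost of trials for new users and of production adoption.
- Suggested check: The source indicates that a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_7e754869326e42e1a7c57f3a1962ef9e | https://github.com/microsoft/magentic-ui/issues/360 | Unverified usage condition surfaced by source type github_issue.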

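## 8. Maintenance pitfall · Maintenance activity unknown

- Severity: medium
- Evidence strength: source_linked
- Finding: last_activity_observed is not recorded.
- User impact: New, abandoned, and active projects become indistinguishable, which lowers confidence in the recommendation.
- Suggested check: Add signals for recent GitHub commits, releases, and issue/PR responsiveness.
- Safeguard: While maintenance activity is unknown, the recommendation must not be marked high-trust.
- Evidence: evidence.maintainer_signals | github_repo:978331188 | https://github.com/microsoft/magentic-ui | last_activity_observed missing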

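## 9. Security/permissions pitfall · Downstream validation found a risk item

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: Downstream has already requested a review; this must not be downplayed on the page.
- Suggested check: Place in the security/permissions governance review queue.
- Safeguard: While the downstream risk exists, keep the review/recommendation downgraded.
- Evidence: downstream_validation.risk_items | github_repo:978331188 | https://github.com/microsoft/magentic-ui | no_demo; severity=medium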

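## 10. Security/permissions pitfall · A scoring risk exists

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: The risk affects whether the project is suitable for ordinary users to install.
- Suggested check: Record the risk in the boundary card and confirm whether manual review is needed.
- Safeguard: Scoring risks must be recorded in the boundary card, not kept only as an internal score.
- Evidence: risks.scoring_risks | github_repo:978331188 | https://github.com/microsoft/magentic-ui | no_demo; severity=medium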

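## 11. Security/permissions pitfall · Source evidence: Settings redesign

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions-related issue in this project: Settings redesign
- User impact: May affect authorization, key configuration, or security boundaries.
- Suggested check: The source indicates that a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Safeguard: Do not escalate this into a definitive conclusion detached from the source link; state the applicable version and review status.
- Evidence: community_evidence:github | cevd_6a2eeae98b6d4fdab476464d57e64e1d | https://github.com/microsoft/magentic-ui/issues/227 | Unverified usage condition surfaced by source type github_issue.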

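## 12. Maintenance pitfall · Issue/PR response quality unknown

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: Users cannot tell whether anyone will respond when they run into a problem.
- Suggested check: Sample recent issues/PRs to see whether they go unhandled for long periods.
- Safeguard: While issue/PR responsiveness is unknown, the maintenance risk must be flagged.
- Evidence: evidence.maintainer_signals | github_repo:978331188 | https://github.com/microsoft/magentic-ui | issue_or_pr_quality=unknown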

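## 13. Maintenance pitfall · Release cadence unclear

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: Install commands and documentation may lag behind the code, making pitfalls more likely.
- Suggested check: Confirm that the latest release/tag matches the install commands in the README.
- Safeguard: When the release cadence is unknown or stale, the install instructions must note possible drift.
- Evidence: evidence.maintainer_signals | github_repo:978331188 | https://github.com/microsoft/magentic-ui | release_recency=unknown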

<!-- canonical_name: microsoft/magentic-ui; human_manual_source: deepwiki_human_wiki -->