Magentic-UI Capability Pack Manual

Doramagic Project Pack · Human Manual

Magentic-UI Capability Pack

Magentic-UI is a web-based interface that enables users to create, manage, and execute AI-driven task automation workflows. The system combines a React-based frontend with a Python backend...

Getting Started with Magentic-UI

Related topics: Configuration, Docker Containers

Magentic-UI is a Microsoft open-source project that provides a multi-agent framework for building AI-powered user interfaces. It enables developers to create intelligent agents that can browse the web, execute plans, handle file uploads, and interact with users through a web-based chat interface.

Prerequisites

Before getting started with Magentic-UI, ensure your system meets the following requirements:

Requirement	Version/Details
Python	3.10 or higher
Docker	Latest stable version
Node.js	For frontend development
pip	Latest version

Sources: README.md:1-20

System Dependencies

Docker: Required for running agent containers that provide browser automation capabilities
Node.js: Needed only if you plan to modify the frontend code

Installation

Standard Installation

The simplest way to install Magentic-UI is via pip:

pip install magentic-ui

Sources: README.md:25-30

Installation with Fara-7B Support

To use the Fara-7B model locally, install with the fara extras:

python3 -m venv .venv
source .venv/bin/activate
pip install "magentic-ui[fara]"

Sources: README.md:55-60

Quick Start with Docker

After installation, make sure Docker is running and start Magentic-UI:

magentic-ui --port 8081

Note: Running this command for the first time will pull two Docker images required for the Magentic-UI agents. If you encounter problems, you can build them directly with:

cd docker
sh build-all.sh

Once the server is running, access the UI at http://localhost:8081.

Sources: README.md:30-45

Local Development Setup

Backend Development

For local backend development, clone the repository and set up the environment:

git clone https://github.com/microsoft/magentic-ui.git
cd magentic-ui
python3 -m venv .venv
source .venv/bin/activate
pip install -e .

Run the development server:

magentic ui --port 8081

Sources: TROUBLESHOOTING.md:1-20

Frontend Development

The frontend is located in the frontend/ directory of the repository. To set up for development:

Navigate to the frontend directory:

cd frontend

Create environment configuration:

cp .env.default .env.development

Configure the API URL:

Edit .env.development and set:

GATSBY_API_URL=http://localhost:8081/api

Sources: frontend/README.md:1-15

Connecting Frontend to Backend

The frontend sends its API requests to http://localhost:8081/api by default. Ensure GATSBY_API_URL in your environment configuration matches the address and port of the running backend.

Sources: frontend/README.md:10-12
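A quick way to confirm the setting before starting the frontend is to parse the environment file and inspect the value. The sketch below uses only the Python standard library; the helper name read_env_file and the sample file contents are illustrative assumptions, not part of the Magentic-UI codebase.

```python
def read_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


# Sample contents standing in for frontend/.env.development (assumed layout).
sample = "# local development\nGATSBY_API_URL=http://localhost:8081/api\n"
env = read_env_file(sample)

if "GATSBY_API_URL" not in env:
    print("GATSBY_API_URL is missing")
else:
    print("frontend will call:", env["GATSBY_API_URL"])
```

To check a real checkout, the sample string can be replaced with the text of frontend/.env.development, for example via pathlib.Path(...).read_text().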

Using Fara-7B Locally

To run Magentic-UI with a local Fara-7B model:

Step 1: Serve the Model

In a separate process, serve the Fara-7B model using vLLM:

vllm serve "microsoft/Fara-7B" --port 5000 --dtype auto

Step 2: Create Configuration

Create a fara_config.yaml file with the following content:

model_config_local_surfer: &client_surfer
  provider: OpenAIChatCompletionClient
  config:
    model: "microsoft/Fara-7B"
    base_url: http://localhost:5000/v1
    api_key: not-needed
    model_info:
      vision: true
      function_calling: true
      json_output: false
      family: "unknown" 
      structured_output: false
      multiple_system_messages: false

orchestrator_client: *client_surfer

Sources: README.md:60-80
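The configuration and the vllm serve command have to agree on the port and API root, and a mismatch is easy to miss. The following sketch mirrors the client entry as a plain Python dict and checks it against the port used in Step 1. The function name config_problems and the specific checks are assumptions made for illustration; Magentic-UI does not necessarily perform them.

```python
from urllib.parse import urlparse

# Plain-dict mirror of the client entry in fara_config.yaml (subset of fields).
client_surfer = {
    "provider": "OpenAIChatCompletionClient",
    "config": {
        "model": "microsoft/Fara-7B",
        "base_url": "http://localhost:5000/v1",
        "api_key": "not-needed",
        "model_info": {"vision": True, "function_calling": True},
    },
}


def config_problems(client: dict, served_port: int = 5000) -> list[str]:
    """Return human-readable mismatches between the config and the vLLM command."""
    problems = []
    url = urlparse(client["config"]["base_url"])
    if url.port != served_port:
        problems.append(f"base_url port {url.port} != vLLM port {served_port}")
    if url.path.rstrip("/") != "/v1":
        problems.append("base_url should point at the /v1 API root")
    if not client["config"]["model_info"].get("vision"):
        problems.append("model_info.vision is expected to be true in this setup")
    return problems


print(config_problems(client_surfer) or "config matches the vllm serve command")
```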

Core Features

Web Surfer Agent

The web surfer agent enables browsing and interacting with web pages. Available actions include:

Action	Description
`key`	Performs key presses (Enter, Alt, Shift, Tab, etc.)
`type`	Types text into input fields
`mouse_move`	Moves cursor to specified pixel coordinates
`left_click`	Clicks the left mouse button
`scroll`	Scrolls the mouse wheel
`visit_url`	Navigates to a specified URL
`web_search`	Performs a web search
`history_back`	Goes back in browser history
`pause_and_memorize_fact`	Stores information for future reference
`wait`	Waits for specified seconds
`terminate`	Ends the current task

Sources: src/magentic_ui/agents/web_surfer/fara/_prompts.py:1-60

Plans System

Magentic-UI supports creating and managing reusable plans:

Create Plans: Build step-by-step task plans through the UI
Attach Plans: Attach saved plans to queries for execution
Learn Plans: Save successful conversation patterns as reusable plans
Import/Export: Import existing plans or export your library

Sources: frontend/src/components/features/Plans/PlanCard.tsx:1-50

File Handling

The system supports:

Arbitrary file uploads
File preview and download
Multiple file type support (images, documents, etc.)

Sources: frontend/src/components/common/filerenderer.tsx:1-80

MCP Server Integration

Magentic-UI supports Model Context Protocol (MCP) servers with multiple connection types:

Connection Type	Description
SSE	Server-Sent Events connection
Stdio	Standard input/output connection
JSON Config	JSON-based configuration file

Sources: frontend/src/components/features/McpServersConfig/McpConfigModal.tsx:1-60

Project Structure

magentic-ui/
├── frontend/                    # React frontend application
│   ├── src/
│   │   ├── components/         # Reusable UI components
│   │   │   ├── features/       # Feature-specific components
│   │   │   ├── views/          # View components
│   │   │   └── common/         # Common utilities
│   │   └── pages/              # Page components
│   └── README.md               # Frontend development guide
├── src/
│   └── magentic_ui/            # Main Python package
│       └── agents/             # Agent implementations
├── docker/                      # Docker configuration
├── README.md                    # Main documentation
├── CONTRIBUTING.md              # Contribution guidelines
└── TROUBLESHOOTING.md           # Issue resolution guide

Troubleshooting

Common Issues

#### Port Already in Use

If port 8081 is occupied, either stop the existing service or use a different port:

magentic ui --port 8082
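To find out whether a port is already taken before starting the server, a small helper along these lines can be used. It is a sketch built on the Python standard library only; the function name port_in_use is illustrative and not part of Magentic-UI.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something already accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


for candidate in (8081, 8082):
    state = "in use" if port_in_use(candidate) else "free"
    print(f"port {candidate}: {state}")
```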

#### Virtual Environment Activation

If you installed Magentic-UI in a virtual environment but the environment is not active in your current shell, reactivate it:

deactivate
source .venv/bin/activate
magentic ui --port 8081

#### Wrong Package Installed

Ensure you installed magentic-ui (not the unrelated magentic package):

pip install magentic-ui

Getting Help

If issues persist:

Double-check all prerequisites in the README
Search GitHub Issues for similar problems
Open a new issue with:

Detailed problem description
System information (OS, Docker version)
Steps to replicate

Sources: TROUBLESHOOTING.md:1-50

Contributing

We welcome community contributions:

Find an Issue: Browse the open issues and look for items labeled help-wanted
Fork and Clone: Fork the repository and clone locally
Create a Branch: Use descriptive names (e.g., fix/session-bug or feature/file-upload)
Write Code and Tests: Include tests for new features
Run Checks: Before submitting a PR, run:

poe check

Submit PR: Open it against the main branch and reference the issue number

Top "Help Wanted" Issues

Issue	Description
#132	Allow MAGUI to understand video and audio
#128	Enable arbitrary file upload in UI
#126	Add streaming of final answer and coder messages
#123	Add unit tests
#124	Allow websurfer to scroll inside containers

Sources: CONTRIBUTING.md:1-40

Architecture Overview

graph TD
    A[User Interface] --> B[Frontend React App]
    B --> C[Backend API]
    C --> D[Magentic-UI Agents]
    D --> E[Web Surfer Agent]
    D --> F[Plans Executor]
    D --> G[MCP Servers]
    E --> H[Docker Containers]
    H --> I[Browser Automation]
    F --> J[Task Execution]
    G --> K[External Tools]

Configuration Reference

Environment Variables (Frontend)

Variable	Description	Default
`GATSBY_API_URL`	Backend API endpoint	`http://localhost:8081/api`

Docker Ports

Port	Service
8081	Main application (configurable)
5000	vLLM model server (Fara-7B)

Sources: README.md:1-20

Key Concepts

Related topics: High-Level Architecture, Agent System

Magentic-UI is an interactive UI framework that enables AI agents to execute tasks with human oversight. This document explains the fundamental concepts that power the system's architecture, including approval workflows, guarded actions, plan management, and the learning system.

Source: https://github.com/microsoft/magentic-ui / Human Manual

High-Level Architecture

Related topics: Agent System, Team Orchestration, Backend API

Overview

Magentic-UI is a web-based interface that enables users to create, manage, and execute AI-driven task automation workflows. The system combines a React-based frontend with a Python backend to provide an interactive chat interface where users can execute multi-step plans, browse the web autonomously, and manage reusable workflow templates.

The architecture follows a client-server model where:

The frontend handles UI rendering, user interaction, and real-time display of task execution
The backend processes AI requests, manages agent execution, and coordinates with external tools
Communication occurs via RESTful API endpoints

System Components

Frontend Architecture

The frontend is a React application using TypeScript, organized into a modular component structure. It communicates with the backend API at http://localhost:8081/api as specified in the environment configuration.

Sources: frontend/README.md:1-7

graph TD
    A[User Browser] --> B[React Frontend]
    B --> C[Components]
    C --> D[Views/Chat]
    C --> E[Features]
    E --> F[Plans]
    E --> G[McpServersConfig]
    C --> H[Layout]
    B --> I[Backend API<br/>localhost:8081/api]
    I --> J[Python Backend]
    J --> K[AI Agents]
    J --> L[Team Manager]

Core Frontend Components

Component	File Path	Purpose
`Chat`	`frontend/src/components/views/chat/chat.tsx`	Main chat interface container
`RunView`	`frontend/src/components/views/chat/runview.tsx`	Manages run status and detail viewer
`RenderMessage`	`frontend/src/components/views/chat/rendermessage.tsx`	Renders different message types
`ChatInput`	`frontend/src/components/views/chat/chatinput.tsx`	User input with file/plan attachment
`DetailViewer`	`frontend/src/components/views/chat/detail_viewer.tsx`	Browser view, screenshots, live tabs
`ProgressBar`	`frontend/src/components/views/chat/progressbar.tsx`	Task progress visualization

Sources: frontend/src/components/views/chat/chat.tsx:1-50

Frontend View Architecture

Chat System

The chat system is the primary user interaction point. It manages:

Display of conversation messages
Real-time run status updates
Progress tracking for multi-step tasks
Detail viewer integration for visual feedback

graph LR
    A[ChatInput] -->|User Input| B[Chat Container]
    B --> C[RenderMessage]
    C -->|Multi-modal| D[PlanView]
    C -->|File| E[FileRenderer]
    B -->|Active Run| F[RunView]
    F --> G[DetailViewer]
    G --> H[Screenshots]
    G --> I[Live View]
    G --> J[BrowserModal]

The chat component handles multiple run states:

active - Task is currently executing
awaiting_input - Waiting for user response
paused - Task temporarily paused
pausing - Pause operation in progress

Sources: frontend/src/components/views/chat/chat.tsx:30-40

Plan Management System

Plans are reusable workflow templates that define multi-step task sequences. The plan system consists of:

Component	Function
`PlanList`	Displays all saved plans with search/filter
`PlanCard`	Individual plan summary with quick actions
`PlanView`	Detailed plan editing and viewing
`LearnPlanButton`	Creates new plans from conversation history

graph TD
    A[User Conversation] --> B[LearnPlanButton]
    B --> C[Plan Created]
    C --> D[PlanList]
    D --> E[PlanCard]
    E --> F[PlanView<br/>Edit/View]
    F --> G[Save Plan]
    G --> D
    A --> H[Attach Plan to Chat]
    H --> I[Execute Plan]

Plans can be attached to chat queries using the dropdown menu in ChatInput, allowing users to:

Create new empty plans
Import plans from JSON files
Search through existing plans
Execute attached plans with the current query

Sources: frontend/src/components/features/Plans/PlanCard.tsx:1-80, frontend/src/components/features/Plans/PlanList.tsx:1-50

Detail Viewer System

The detail viewer provides visual feedback during task execution through multiple tabs:

Tab	Purpose
Screenshots	Static screenshots captured during execution
Live	Real-time browser view via noVNC
Browser Modal	Full browser view in modal overlay

The system supports control handover, allowing users to take control during autonomous browsing:

graph TD
    A[Agent Execution] --> B[DetailViewer]
    B --> C[ScreenshotsTab]
    B --> D[LiveTab]
    D --> E[noVNC Connection]
    E --> F[User Control<br/>Handover]
    F --> G[FullscreenOverlay]
    G --> H[User Input Response]
    H --> A

Sources: frontend/src/components/views/chat/detail_viewer.tsx:1-100, frontend/src/components/views/chat/runview.tsx:1-50

MCP Server Configuration

Magentic-UI supports the Model Context Protocol (MCP) for extending functionality through external servers. The configuration modal supports three connection types:

Connection Type	Description
SSE	Server-Sent Events for streaming responses
Stdio	Standard input/output process communication
JSON Config	Direct JSON configuration import

interface MCPConfig {
  serverName: string;      // Unique identifier
  connectionType: 'sse' | 'stdio' | 'json';
  // Additional config based on type...
}

Sources: frontend/src/components/features/McpServersConfig/McpConfigModal.tsx:1-100

Progress Tracking System

The progress bar component provides real-time feedback on task execution:

graph TD
    A[Progress Update] --> B{Has Final Answer?}
    B -->|Yes| C[100% Complete<br/>Green Bar]
    B -->|No| D[Calculate Percentage]
    D --> E[Current Step Highlight<br/>Magenta]
    E --> F[Remaining Steps<br/>Gray]
    C --> G[Status: Task Completed]
    F --> H[Status: Step X of Y]

The progress system tracks:

currentStep - Current execution step index
totalSteps - Total number of steps in the plan
plan.steps - Array of step definitions with titles

Sources: frontend/src/components/views/chat/progressbar.tsx:1-80
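The decision logic in the flowchart can be sketched as a small function. The actual component is written in TypeScript; this Python version only illustrates the branching. The percentage formula, the zero-based step index, and the "Waiting for plan" label are assumptions, since the source does not spell them out.

```python
def progress_status(current_step: int, total_steps: int, has_final_answer: bool) -> tuple[int, str]:
    """Return (percent complete, status label) for the progress bar."""
    if has_final_answer:
        return 100, "Task Completed"
    if total_steps <= 0:
        return 0, "Waiting for plan"
    # Clamp the index so a stale update cannot push the bar past the last step.
    step = min(max(current_step, 0), total_steps - 1)
    percent = round(100 * step / total_steps)
    return percent, f"Step {step + 1} of {total_steps}"


print(progress_status(1, 4, has_final_answer=False))
print(progress_status(3, 4, has_final_answer=True))
```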

Message Rendering System

Messages are rendered based on their content type:

graph TD
    A[Raw Message] --> B{Parse Content}
    B --> C{Multi-Modal?}
    C -->|Yes| D[Map Each Item]
    C -->|No| E[Render as Text]
    D --> F{Is String?}
    F -->|Yes| G[Parse & Display]
    F -->|No| H[Skip/Empty]
    G --> I{Has Plan?}
    I -->|Yes| J[PlanView Component]
    J --> K[Final Output]
    H --> K
    E --> K

Supported content types:

Plain text with markdown rendering
Multi-step plans
File attachments with download capability
Code blocks

Sources: frontend/src/components/views/chat/rendermessage.tsx:1-100
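The branching in the flowchart can be approximated in a few lines. This is a Python sketch of the dispatch only, not the TypeScript component itself. Treating a JSON object with a "steps" key as a plan is an assumption made for the example, as are the function names.

```python
import json


def classify(text: str) -> tuple[str, object]:
    """Label a string part as a plan (JSON object with steps) or plain text."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return ("text", text)
    if isinstance(data, dict) and "steps" in data:
        return ("plan", data)
    return ("text", text)


def render_parts(content: object) -> list[tuple[str, object]]:
    """Return (kind, payload) pairs in the order they would be displayed."""
    if not isinstance(content, list):
        # Not multi-modal: render the whole message as text.
        return [("text", str(content))]
    # Multi-modal: keep string items, skip everything else.
    return [classify(item) for item in content if isinstance(item, str)]


message = ['{"task": "book a table", "steps": [{"title": "Search"}]}', {"image": "..."}, "Done."]
for kind, payload in render_parts(message):
    print(kind, payload)
```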

Web Surfer Agent

The web surfer agent enables autonomous web browsing. It uses a structured action system:

Action	Purpose	Required Parameters
`visit_url`	Navigate to URL	`url`
`web_search`	Search the web	`query`
`scroll`	Scroll page	`pixels`
`left_click`	Click element	`coordinate`
`type`	Input text	`text`, `coordinate`
`pause_and_memorize_fact`	Store information	`fact`
`wait`	Wait for page	`time`
`terminate`	End task	`status`

parameters = {
    "action": {
        "type": "string",
        "enum": ["visit_url", "web_search", "scroll", "left_click", ...]
    },
    "coordinate": {
        "description": "(x, y): The x and y coordinates for mouse actions",
        "type": "array"
    }
}

Sources: src/magentic_ui/agents/web_surfer/fara/_prompts.py:1-80

Data Flow

User Query Flow

sequenceDiagram
    participant User
    participant ChatInput
    participant Backend
    participant Agent
    participant DetailViewer
    
    User->>ChatInput: Submit query
    ChatInput->>Backend: POST /api/run
    Backend->>Agent: Execute task
    Agent->>DetailViewer: Stream screenshots
    Agent->>Backend: Progress updates
    Backend->>ChatInput: Status updates
    Agent->>Backend: Final response
    Backend->>User: Display result

Plan Attachment Flow

graph TD
    A[User clicks attach] --> B[Dropdown shows plans]
    B --> C[Select plan]
    C --> D[PlanView Modal opens]
    D --> E[Confirm attachment]
    E --> F[Plan attached to input]
    F --> G[Submit with plan context]

State Management

The frontend manages several key state objects:

interface ChatState {
  currentRun: Run | null;
  runStatus: 'idle' | 'active' | 'paused' | 'awaiting_input';
  progress: {
    currentStep: number;
    totalSteps: number;
    plan?: Plan;
  };
  hasFinalAnswer: boolean;
}

interface PlanState {
  task: string;
  steps: Step[];
  created_at?: string;
}
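Because plans can be imported from and exported to JSON, the PlanState shape maps naturally onto a serializable record. The sketch below shows a round trip in Python. The Step fields are hypothetical (only a title per step is mentioned in this manual), and the real export format may differ.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Step:
    title: str
    details: str = ""  # hypothetical field, added for illustration


@dataclass
class Plan:
    task: str
    steps: list[Step] = field(default_factory=list)
    created_at: Optional[str] = None


def export_plan(plan: Plan) -> str:
    """Serialize a plan to a JSON string."""
    return json.dumps(asdict(plan), indent=2)


def import_plan(text: str) -> Plan:
    """Rebuild a plan from JSON produced by export_plan."""
    raw = json.loads(text)
    steps = [Step(**step) for step in raw.get("steps", [])]
    return Plan(task=raw["task"], steps=steps, created_at=raw.get("created_at"))


plan = Plan(task="Compare laptop prices", steps=[Step("Search"), Step("Summarize")])
print(export_plan(plan))
```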

Security Considerations

The layout includes a disclaimer for user awareness:

Magentic-UI can make mistakes. Please monitor its work and intervene if necessary.

Control handover features allow users to take control from autonomous agents during execution, ensuring human oversight of automated tasks.

Sources: frontend/src/components/layout.tsx:1-50

Configuration

Environment Variables

Variable	Default	Purpose
`GATSBY_API_URL`	`http://localhost:8081/api`	Backend API endpoint
API requests target	`http://localhost:8081/api`	Frontend-backend communication

Theme Support

The application supports light and dark themes using Ant Design's theme algorithm system, dynamically switching between darkAlgorithm and defaultAlgorithm based on user preference.

Sources: frontend/README.md:1-10, frontend/src/components/layout.tsx:1-30

Agent System

Related topics: High-Level Architecture, Team Orchestration, Browser Automation

Overview

The Agent System in Magentic-UI is a multi-agent orchestration framework that enables autonomous task execution through specialized agents. The system coordinates various agent types including web surfers, file surfers, coders, and user proxies to accomplish complex tasks requested by users.

Architecture Overview

graph TD
    User[User Request] --> Orchestrator[Orchestrator Agent]
    Orchestrator --> WebSurfer[Web Surfer Agent]
    Orchestrator --> FileSurfer[File Surfer Agent]
    Orchestrator --> Coder[Coder Agent]
    Orchestrator --> UserProxy[User Proxy Agent]
    
    WebSurfer --> Browser[Browser Control]
    FileSurfer --> FileSystem[File System]
    Coder --> CodeExecution[Code Executor]
    UserProxy --> UserApproval[User Approval]
    
    Browser --> StateUpdate[State Update]
    FileSystem --> StateUpdate
    CodeExecution --> StateUpdate
    UserApproval --> StateUpdate
    
    StateUpdate --> Orchestrator

Agent Types

Web Surfer Agent

The Web Surfer Agent enables autonomous web browsing by controlling a browser instance. It handles various browsing actions including navigation, interaction, and information retrieval.

#### Supported Actions

Action	Description	Required Parameters
`visit_url`	Navigate to a specific URL	`url`
`web_search`	Perform a web search	`query`
`left_click`	Click at coordinates	`coordinate`
`right_click`	Right-click at coordinates	`coordinate`
`mouse_move`	Move mouse to coordinates	`coordinate`
`type`	Type text into an element	`text`, `coordinate`
`scroll`	Scroll the page	`pixels`
`key`	Press keyboard keys	`keys`
`pause_and_memorize_fact`	Store information for later use	`fact`
`wait`	Wait for specified duration	`time`
`history_back`	Navigate back in browser history	-
`terminate`	End the browsing session	`status`
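The table above can be turned into a simple argument check that runs before an action is dispatched. This is an illustrative sketch: the mapping is copied from the table, while the names REQUIRED_PARAMS and missing_params are assumptions and do not refer to code in the repository.

```python
# Required parameters per action, copied from the Supported Actions table.
REQUIRED_PARAMS: dict[str, tuple[str, ...]] = {
    "visit_url": ("url",),
    "web_search": ("query",),
    "left_click": ("coordinate",),
    "right_click": ("coordinate",),
    "mouse_move": ("coordinate",),
    "type": ("text", "coordinate"),
    "scroll": ("pixels",),
    "key": ("keys",),
    "pause_and_memorize_fact": ("fact",),
    "wait": ("time",),
    "history_back": (),
    "terminate": ("status",),
}


def missing_params(call: dict) -> list[str]:
    """Return the required parameters that are absent from an action call."""
    action = call.get("action")
    if action not in REQUIRED_PARAMS:
        raise ValueError(f"unknown action: {action!r}")
    return [name for name in REQUIRED_PARAMS[action] if name not in call]


print(missing_params({"action": "type", "text": "hello"}))
print(missing_params({"action": "history_back"}))
```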

#### Configuration Parameters

The Web Surfer Agent supports the following configuration options:

{
    "display_width_px": int,      # Browser viewport width
    "display_height_px": int,    # Browser viewport height
    "include_input_text_key_args": bool  # Include type-specific arguments
}

Sources: src/magentic_ui/agents/web_surfer/fara/_prompts.py:1-80

File Surfer Agent

The File Surfer Agent provides file system navigation and file content interaction capabilities. It allows agents to read, write, and manage files within the project workspace.

Coder Agent

The Coder Agent handles code generation, analysis, and execution tasks. It works in conjunction with the orchestrator to implement requested functionality.

User Proxy Agent

The User Proxy Agent acts as an intermediary between the autonomous agent system and human users. It handles:

Requesting user confirmation for sensitive operations
Presenting information that requires human judgment
Managing user input during interactive sessions

Agent Communication Flow

sequenceDiagram
    participant User
    participant Frontend
    participant Orchestrator
    participant Agent
    
    User->>Frontend: Submit Task
    Frontend->>Orchestrator: Send Request
    Orchestrator->>Agent: Delegate Subtask
    Agent->>Agent: Execute Action
    Agent-->>Orchestrator: Return Result
    
    alt Requires Approval
        Orchestrator->>User: Request Approval
        User-->>Orchestrator: Approval/Denial
    end
    
    Orchestrator-->>Frontend: Final Response
    Frontend-->>User: Display Result

Task Execution Workflow

When a user submits a task through the chat interface, the system follows this execution model:

Task Submission: User enters a query via ChatInput component
Agent Selection: The orchestrator determines which agent(s) to invoke
Execution: Selected agents perform their designated actions
Progress Tracking: The system displays execution progress via ProgressBar
State Updates: Real-time updates are rendered via RenderMessage
Completion: Final results are presented with option to save plans

Sources: frontend/src/components/views/chat/chat.tsx:50-120

Plan System Integration

The Agent System integrates with a Plan System that breaks down complex tasks into executable steps:

graph LR
    Task[User Task] --> Plan[Generated Plan]
    Plan --> Step1[Step 1]
    Plan --> Step2[Step 2]
    Plan --> Step3[Step 3]
    
    Step1 --> Execute1[Execute]
    Step2 --> Execute2[Execute]
    Step3 --> Execute3[Execute]
    
    Execute1 --> Result1[Result]
    Execute2 --> Result2[Result]
    Execute3 --> Result3[Result]
    
    Result1 --> Aggregate[Aggregate Results]
    Result2 --> Aggregate
    Result3 --> Aggregate

Plan Components

Component	File	Purpose
`PlanCard`	`PlanCard.tsx`	Displays individual plan summary
`PlanList`	`PlanList.tsx`	Lists all available plans
`PlanView`	`PlanView.tsx`	Interactive plan editor/viewer

Sources: frontend/src/components/features/Plans/PlanCard.tsx:1-100

Message Rendering System

The Agent System communicates results through a structured message rendering system:

graph TD
    Message[Agent Message] --> Parse[Parse Content]
    Parse --> Type{Message Type}
    
    Type -->|Text| TextRender[Text Renderer]
    Type -->|Plan| PlanRender[Plan View]
    Type -->|File| FileRender[File Renderer]
    Type -->|Image| ImageRender[Image Renderer]
    
    TextRender --> Display[UI Display]
    PlanRender --> Display
    FileRender --> Display
    ImageRender --> Display

The RenderMessage component handles the display of agent outputs, supporting:

Multi-modal content rendering
Plan visualization
File previews and downloads
Image galleries

Sources: frontend/src/components/views/chat/rendermessage.tsx:1-100

Browser Control Details

The Web Surfer Agent uses coordinate-based browser control:

Coordinate System

X-axis: Pixels from the left edge of the viewport
Y-axis: Pixels from the top edge of the viewport
Scroll: Positive values scroll up, negative values scroll down

Type Action Parameters

Parameter	Type	Description
`text`	string	Text to type
`coordinate`	[x, y]	Target element position
`press_enter`	boolean	Submit after typing
`delete_existing_text`	boolean	Clear before typing
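As an illustration of how these parameters fit together with the viewport-based coordinate system, the sketch below builds the arguments for a type action and rejects targets outside the viewport. The class name, the default values for the two boolean flags, and the 1280x720 viewport are assumptions for the example.

```python
from dataclasses import dataclass


@dataclass
class TypeAction:
    text: str
    coordinate: tuple[int, int]
    press_enter: bool = False           # default is an assumption
    delete_existing_text: bool = False  # default is an assumption

    def to_call(self, width_px: int, height_px: int) -> dict:
        """Build the action arguments after checking the target is on screen."""
        x, y = self.coordinate
        if not (0 <= x < width_px and 0 <= y < height_px):
            raise ValueError(f"({x}, {y}) is outside the {width_px}x{height_px} viewport")
        return {
            "action": "type",
            "text": self.text,
            "coordinate": [x, y],
            "press_enter": self.press_enter,
            "delete_existing_text": self.delete_existing_text,
        }


call = TypeAction("magentic-ui", (640, 360), press_enter=True).to_call(1280, 720)
print(call)
```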

Run Status States

The agent execution maintains the following status states:

Status	Description
`active`	Agent is currently executing
`paused`	Execution paused, awaiting resume
`pausing`	Pause is in progress
`awaiting_input`	Waiting for user input or approval
`completed`	Task finished successfully
`failed`	Task execution failed

Sources: frontend/src/components/views/chat/chat.tsx:30-60
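The status values in the table can be modeled as an enumeration so that callers do not compare raw strings. The grouping into "finished", "waiting on the user", and "running" below is an interpretation of the descriptions in the table, not behavior confirmed by the source.

```python
from enum import Enum


class RunStatus(str, Enum):
    ACTIVE = "active"
    PAUSED = "paused"
    PAUSING = "pausing"
    AWAITING_INPUT = "awaiting_input"
    COMPLETED = "completed"
    FAILED = "failed"


TERMINAL = {RunStatus.COMPLETED, RunStatus.FAILED}
NEEDS_USER = {RunStatus.PAUSED, RunStatus.AWAITING_INPUT}


def describe(status_text: str) -> str:
    """Map a raw status string to a coarse category; unknown values raise ValueError."""
    status = RunStatus(status_text)
    if status in TERMINAL:
        return "finished"
    if status in NEEDS_USER:
        return "waiting on the user"
    return "running"


for raw in ("active", "pausing", "awaiting_input", "failed"):
    print(raw, "->", describe(raw))
```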

Configuration Management

The Agent System configuration is managed through the frontend store:

interface IAgentFlowSettings {
  direction: "TB" | "LR";  // Flow chart orientation
  showLabels: boolean;      // Display edge labels
  showGrid: boolean;        // Show background grid
  showTokens: boolean;     // Display token counts
  showMessages: boolean;   // Show message nodes
  showMiniMap: boolean;    // Show navigation minimap
}

Sources: frontend/src/hooks/store.tsx:1-80

MCP Server Integration

The system supports Model Context Protocol (MCP) servers for extended agent capabilities:

SSE-based server connections
Stdio-based server connections
JSON configuration import

MCP servers are configured via the McpConfigModal component and integrated into the agent selection process during task execution.

Sources: frontend/src/components/features/McpServersConfig/McpConfigModal.tsx:1-100

Detail Viewer

The DetailViewer component provides real-time visualization of agent activities:

Screenshots Tab: Periodic screenshots of browser state
Live Tab: Real-time browser view via noVNC
Control Mode: Fullscreen overlay for manual intervention

Sources: frontend/src/components/views/chat/detail_viewer.tsx:1-100, frontend/src/components/views/chat/runview.tsx:1-80

Summary

The Agent System in Magentic-UI provides a comprehensive framework for autonomous task execution through:

Specialized Agents: Web Surfer, File Surfer, Coder, User Proxy
Orchestration Layer: Coordinates multi-agent collaboration
Plan System: Breaks tasks into executable steps
Real-time Visualization: Browser screenshots and live view
Human-in-the-Loop: User proxy for approval and input
MCP Integration: Extensible server architecture

This architecture enables complex, multi-step task automation while maintaining user oversight and control throughout the execution process.

Sources: src/magentic_ui/agents/web_surfer/fara/_prompts.py:1-80

Team Orchestration

Related topics: Agent System, High-Level Architecture

Overview

Team Orchestration is a core system in Magentic-UI that coordinates multiple AI agents to work together on complex tasks. It provides the infrastructure for orchestrating agent teams, managing communication between agents, and handling task distribution and execution flow.

The orchestration system enables Magentic-UI to:

Coordinate multiple specialized agents (coders, web surfers, planners)
Manage agent collaboration through structured message passing
Handle approval policies for sensitive actions
Support dynamic task execution with planning and reflection capabilities

Architecture

Core Components

Component	File Path	Purpose
Orchestrator	`src/magentic_ui/teams/orchestrator/_orchestrator.py`	Central coordinator for agent teams
Group Chat	`src/magentic_ui/teams/orchestrator/_group_chat.py`	Manages multi-agent message passing
Prompts	`src/magentic_ui/teams/orchestrator/_prompts.py`	Prompt templates for orchestration
Sentinel Prompts	`src/magentic_ui/teams/orchestrator/_sentinel_prompts.py`	Safety and monitoring prompts

Agent Types

Magentic-UI supports several specialized agent types that can be orchestrated:

Agent Type	Purpose	Key Actions
Coder Agent	Execute Python/code tasks	Write, debug, execute code
Web Surfer Agent	Browse and interact with web content	scroll, visit_url, web_search, wait
Planner Agent	Create and manage execution plans	Task decomposition, step planning
MCP Server Agents	External tool integrations	Configurable via SSE/stdio protocols

Web Surfer Agent Actions

The web surfer's tool schema defines the parameters that accompany each action. The excerpt below shows the action-specific parameters:

parameters = {
    "properties": {
        "pixels": {
            "description": "The number of pixels to scroll. Positive values scroll up, negative values scroll down. Required only by `action=scroll`.",
            "type": "number",
        },
        "url": {
            "description": "The URL to visit. Required only by `action=visit_url`.",
            "type": "string",
        },
        "query": {
            "description": "The query to search for. Required only by `action=web_search`.",
            "type": "string",
        },
        "fact": {
            "description": "The fact to remember for the future. Required only by `action=pause_and_memorize_fact`.",
            "type": "string",
        },
        "time": {
            "description": "The seconds to wait. Required only by `action=wait`.",
            "type": "number",
        },
        "status": {
            "description": "The status of the task. Required only by `action=terminate`.",
            "type": "string",
            "enum": ["success", "failure"],
        },
    },
    "required": ["action"],
    "type": "object",
}

Sources: src/magentic_ui/agents/web_surfer/fara/_prompts.py:1-50

Action Types

Action	Parameters	Description
`scroll`	`pixels`	Scrolls the viewport
`visit_url`	`url`	Navigate to a URL
`web_search`	`query`	Search the web
`pause_and_memorize_fact`	`fact`	Store information for context
`wait`	`time` (seconds)	Wait before continuing
`terminate`	`status` (success/failure)	End the task

Function Call Handling

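Function schemas are serialized into the prompt so that the model knows which tools it can call. Each function is wrapped as a tool description and emitted as one JSON document per line: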

tool_descs = [{"type": "function", "function": f} for f in functions]
tool_names = [
    function.get("name_for_model", function.get("name", ""))
    for function in functions
]
tool_descs = "\n".join([json.dumps(f, ensure_ascii=False) for f in tool_descs])

Sources: src/magentic_ui/agents/web_surfer/fara/qwen_helpers/fncall_prompt.py:1-60
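The excerpt above depends on a functions list defined elsewhere. A self-contained version, with a made-up function schema standing in for the real one, might look like the sketch below. The schema contents are hypothetical; only the wrapping and joining steps follow the excerpt.

```python
import json

# Hypothetical function schema; the real schema lives in the agent's prompt module.
functions = [
    {
        "name": "computer_use",
        "description": "Control a browser with mouse and keyboard actions.",
        "parameters": {
            "type": "object",
            "properties": {"action": {"type": "string"}},
            "required": ["action"],
        },
    }
]

# Wrap each function as a tool description, as in the excerpt.
tool_descs = [{"type": "function", "function": f} for f in functions]

# Prefer a model-facing name when one is provided, else fall back to "name".
tool_names = [f.get("name_for_model", f.get("name", "")) for f in functions]

# One JSON document per line; ensure_ascii=False keeps non-ASCII text readable.
tool_text = "\n".join(json.dumps(t, ensure_ascii=False) for t in tool_descs)

print(tool_names)
print(tool_text)
```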

Message Processing Flow

graph TD
    A[User Input] --> B[Orchestrator]
    B --> C{Agent Selection}
    C -->|Planning| D[Planner Agent]
    C -->|Execution| E[Coder Agent]
    C -->|Web Tasks| F[Web Surfer Agent]
    D --> G[Execution Plan]
    E --> H[Code Execution]
    F --> I[Web Actions]
    G --> B
    H --> B
    I --> B
    B --> J[User Response]

Role-Based Message Handling

Role	Processing	Content Format
`ASSISTANT`	Appends tool calls to last message	`<tool_call>` XML blocks
`FUNCTION`	Processes tool responses	`<tool_response>` XML blocks
`USER`	Standard user messages	Plain text or structured
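
To make the XML block formats concrete, here is a minimal sketch of how a tool call and a tool response could be wrapped. The helper names, the `computer_use` tool name, and the exact whitespace inside the tags are assumptions for illustration; the repository's `fncall_prompt.py` remains the authority on the real layout.

```python
import json


def format_tool_call(name: str, arguments: dict) -> str:
    """Wrap a function call as a <tool_call> block holding one JSON object."""
    payload = json.dumps({"name": name, "arguments": arguments}, ensure_ascii=False)
    return f"<tool_call>\n{payload}\n</tool_call>"


def format_tool_response(content: str) -> str:
    """Wrap a tool result as a <tool_response> block."""
    return f"<tool_response>\n{content}\n</tool_response>"


# "computer_use" is a placeholder tool name, not one confirmed by the source.
print(format_tool_call("computer_use", {"action": "wait", "time": 2}))
```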

MCP Server Integration

Magentic-UI supports Model Context Protocol (MCP) servers for extended functionality:

interface McpServerConfig {
  serverName: string;
  agentName: string;
  agentDescription: string;
  connectionType: 'sse' | 'stdio' | 'json';
}

Sources: frontend/src/components/features/McpServersConfig/McpConfigModal.tsx

MCP Configuration Modes

Mode	Description	Use Case
SSE	Server-Sent Events	Remote server connections
Stdio	Standard I/O	Local process communication
JSON Config	Raw JSON configuration	Advanced users

Agent Description Requirements

Each MCP server requires a description that helps the orchestrator decide when to invoke it:

"Describe how and when this server should be used. This helps the orchestrator decide when to call this agent."

Sources: frontend/src/components/features/McpServersConfig/McpConfigModal.tsx:1-120

Code Execution Flow

The coder agent provides secure code execution with output capture:

async def _summarize_coding(
    agent_name: str,
    model_client: ChatCompletionClient,
    thread: Sequence[BaseChatMessage | BaseAgentEvent],
    cancellation_token: CancellationToken,
    model_context: ChatCompletionContext,
) -> TextMessage:
    ...  # function body omitted in this excerpt

Sources: src/magentic_ui/agents/_coder.py:1-100

Code Execution States

State	Description	Exit Code
Success	Code executed without errors	`0`
Timeout	Execution exceeded time limit	N/A
Error	Runtime exception occurred	Non-zero

# Break if all code executions were successful
if all([code_output == 0 for code_output in exit_code_list]):
    break
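
The state table and the loop's exit condition can be sketched together. This is an illustrative helper rather than the coder agent's implementation: `classify_execution` and `all_succeeded` are hypothetical names, and representing a timeout as `None` is an assumption made for the example.

```python
from typing import Optional


def classify_execution(exit_code: Optional[int]) -> str:
    """Map an exit code to a state name; None stands in for a timeout."""
    if exit_code is None:
        return "Timeout"
    return "Success" if exit_code == 0 else "Error"


def all_succeeded(exit_code_list: list) -> bool:
    """Mirror the loop's break condition: every execution returned 0."""
    return all(code == 0 for code in exit_code_list)


print(classify_execution(0), classify_execution(1), classify_execution(None))
```

Note that `all` returns `True` for an empty list, so a round that executed no code blocks also satisfies the condition.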

CLI Configuration

The orchestration system is configured through the CLI entry point:

def main() -> None:
    """
    Entry point for the magentic-cli command.
    Called from pyproject.toml's [project.scripts] section.
    """
    app()

Sources: src/magentic_ui/_cli.py:1-50

CLI Parameters

Parameter	Type	Purpose
`mcp_agents`	List	External MCP server agents
`run_without_docker`	bool	Run without container isolation
`browser_headless`	bool	Run browser in headless mode
`browser_local`	bool	Use local browser instead of remote
`sentinel_tasks`	List	Background monitoring tasks
`dynamic_sentinel_sleep`	int	Sleep interval for sentinel checks

Approval Policies

The orchestration system supports configurable approval policies for controlling agent actions:

Policy	Behavior
`Auto Approve`	All actions execute automatically
`Manual Approval`	User must approve each action
`Policy Based`	Rules determine approval based on action type
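
The three policies amount to a small decision function. The sketch below is hypothetical: the enum, the function name, and especially the `SENSITIVE_ACTIONS` rule set are invented for illustration, since this manual does not describe the actual rules used by the policy-based mode.

```python
from enum import Enum


class ApprovalPolicy(Enum):
    AUTO_APPROVE = "Auto Approve"
    MANUAL_APPROVAL = "Manual Approval"
    POLICY_BASED = "Policy Based"


# Hypothetical rule set: action types that need a human decision under
# the policy-based mode. The real rules are not documented here.
SENSITIVE_ACTIONS = {"visit_url", "type"}


def needs_user_approval(policy: ApprovalPolicy, action_type: str) -> bool:
    """Decide whether the user must approve an action before it runs."""
    if policy is ApprovalPolicy.AUTO_APPROVE:
        return False
    if policy is ApprovalPolicy.MANUAL_APPROVAL:
        return True
    # Policy-based: consult the rule set for this action type.
    return action_type in SENSITIVE_ACTIONS


print(needs_user_approval(ApprovalPolicy.POLICY_BASED, "scroll"))
```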

Error Handling

Timeout Handling

except asyncio.TimeoutError:
    executor_msg = TextMessage(
        source=agent_name + "-executor",
        metadata={"internal": "yes"},
        content="Code execution timed out.",
    )
    delta.append(executor_msg)
    yield executor_msg
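
The `except` clause above implies a surrounding `try` that bounds the execution time. A minimal, self-contained sketch of that pattern with `asyncio.wait_for` follows; `run_with_timeout` and the two job coroutines are hypothetical stand-ins, not functions from the coder agent.

```python
import asyncio


async def run_with_timeout(coro, timeout_s: float) -> str:
    """Await a coroutine, returning a fixed message if it exceeds the limit."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout_s)
    except asyncio.TimeoutError:
        # Same message text as the executor uses in the excerpt above.
        return "Code execution timed out."


async def slow_job() -> str:
    await asyncio.sleep(1.0)  # deliberately longer than the limit used below
    return "done"


async def fast_job() -> str:
    return "done"


print(asyncio.run(run_with_timeout(slow_job(), timeout_s=0.05)))
```

`asyncio.wait_for` cancels the awaited task when the limit is reached, so the slow job does not keep running in the background.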

Summary

Team Orchestration in Magentic-UI provides a flexible framework for coordinating multiple AI agents. Key features include:

Multi-Agent Coordination: Specialized agents work together through the orchestrator
Flexible Communication: Role-based message passing with XML-formatted tool calls
MCP Integration: Extensible architecture through Model Context Protocol servers
Safe Execution: Code execution with timeout handling and error capture
Approval Workflows: Configurable policies for sensitive operations

The system is designed to be modular, allowing new agent types and capabilities to be added through well-defined interfaces.

Sources: src/magentic_ui/agents/web_surfer/fara/_prompts.py:1-50

Backend API

Related topics: High-Level Architecture, Frontend UI


Overview

The Magentic-UI Backend API is a FastAPI-based REST/WebSocket service that orchestrates multi-agent workflows, manages conversation sessions, and provides real-time execution capabilities for the frontend interface. It serves as the central hub for all backend operations including session management, run execution, agent coordination, and MCP (Model Context Protocol) integration.

The API is accessible at http://localhost:8081/api for local development; all frontend requests should be directed to this endpoint.

Sources: frontend/README.md

Architecture

High-Level Architecture

graph TB
    subgraph "Frontend Client"
        UI[UI Components]
    end
    
    subgraph "Backend API - FastAPI"
        App[Main Application]
        Routers[Route Handlers]
        Managers[Connection Managers]
    end
    
    subgraph "Data Layer"
        DB[(Database)]
        StaticFiles[Static Files]
    end
    
    UI --> |HTTP/WS| App
    App --> Routers
    Routers --> Managers
    Managers --> DB
    App --> StaticFiles

API Router Structure

The backend organizes its functionality into modular routers, each handling a specific domain:

Router	Prefix	Purpose
Teams Router	`/teams`	Multi-agent team coordination
WebSocket Router	`/ws`	Real-time bidirectional communication
Validation Router	`/validate`	Input validation endpoints
Settings Router	`/settings`	User and system configuration
MCP Router	`/mcp`	Model Context Protocol integration
Sessions Router	`/sessions`	Conversation session management
Runs Router	`/runs`	Execution run tracking and control

Sources: src/magentic_ui/backend/web/app.py

Core Endpoints

Health Check

GET /api/health

Returns the health status of the API service.

Response:

{
  "status": true,
  "message": "Service is healthy"
}

Version Information

GET /api/version

Retrieves the current API version.

Response:

{
  "status": true,
  "message": "Version retrieved successfully",
  "data": {
    "version": "<VERSION_STRING>"
  }
}

Sources: src/magentic_ui/backend/web/app.py

API Configuration

Environment Variables

The frontend must be configured with the correct API URL. Create a .env.development file based on .env.default:

cp .env.default .env.development

The primary configuration variable is GATSBY_API_URL, which should be set to http://localhost:8081/api for local development.

Sources: frontend/README.md

Static File Serving

The backend mounts two static file directories:

Mount Path	Directory	Purpose
`/files`	`static_root`	File downloads with HTML fallback
`/`	`ui_root`	Frontend UI assets

app.mount(
    "/files",
    StaticFiles(directory=initializer.static_root, html=True),
    name="files",
)
app.mount("/", StaticFiles(directory=initializer.ui_root, html=True), name="ui")

Sources: src/magentic_ui/backend/web/app.py

Error Handling

Internal Server Error Handler

The API includes a global exception handler for 500 errors:

@app.exception_handler(500)
async def internal_error_handler(request: Request, exc: Exception):
    logger.error(f"Internal error: {str(exc)}")
    return {
        "status": False,
        "message": "Internal server error",
        "detail": str(exc) if settings.debug else None
    }

This handler:

Logs the full error details server-side
Returns sanitized error messages to clients (hiding details in production)
Uses settings.debug to control error visibility

Sources: src/magentic_ui/backend/web/app.py
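
The detail-hiding behavior can be isolated from the web framework. The sketch below is hypothetical (`build_error_payload` is not a function in the repository) and only shows how a `debug` flag gates the `detail` field. In FastAPI, an exception handler would normally wrap such a dictionary in a response object such as `JSONResponse` before returning it.

```python
from typing import Optional


def build_error_payload(exc: Exception, debug: bool) -> dict:
    """Build the error body, exposing exception text only in debug mode."""
    detail: Optional[str] = str(exc) if debug else None
    return {
        "status": False,
        "message": "Internal server error",
        "detail": detail,
    }


print(build_error_payload(RuntimeError("db unreachable"), debug=False))
```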

WebSocket Communication

The WebSocket router (/ws) enables real-time bidirectional communication between the frontend and backend. This is essential for:

Live agent execution progress updates
Streaming intermediate results
Real-time user input responses during agent runs
Session state synchronization

Agent Actions and Tool Integration

The backend exposes agent capabilities through structured action parameters. Agents support the following action types:

Action	Description	Required Parameters
`key`	Perform keyboard key presses	`keys` (array)
`type`	Type text into input fields	`text`, `press_enter`, `delete_existing_text`
`mouse_move`	Move cursor to coordinates	`coordinate` [x, y]
`left_click`	Click at coordinates	`coordinate` [x, y]
`scroll`	Scroll mouse wheel	`pixels`
`visit_url`	Navigate to URL	`url`
`web_search`	Execute web search	`query`
`history_back`	Go to previous page	-
`pause_and_memorize_fact`	Store information	`fact`
`wait`	Pause execution	`time` (seconds)
`terminate`	End task	`status` (success/failure)

Sources: src/magentic_ui/agents/web_surfer/fara/_prompts.py

Running the Backend

Prerequisites

Ensure all prerequisites are installed before running the backend. The system requires:

Python environment with dependencies installed
Node.js and npm for frontend development (if building from source)
nvm for Node version management

Starting the Server

magentic-ui --port 8081

The server will:

Initialize the FastAPI application
Mount all routers under /api prefix
Establish database connections
Start listening on the specified port

Development Mode

For frontend development with hot-reloading:

Start frontend in development mode:

cd frontend
npm run start

Run the backend:

magentic-ui --port 8081

The frontend development server runs at http://localhost:8000, while the compiled frontend is available at http://localhost:8081.

Sources: frontend/README.md

API Response Format

All API responses follow a consistent format:

{
  "status": true | false,
  "message": "Human-readable status message",
  "data": { ... } | null,
  "detail": "Error details (optional, debug mode only)"
}

This standardization allows the frontend to handle all responses uniformly regardless of which router handled the request.
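
As a rough illustration of handling every response uniformly, the envelope can be parsed into a single type. This is a sketch under assumptions: `ApiResponse` and `parse_response` are hypothetical names, and treating `data` and `detail` as optional keys is inferred from the format shown above.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ApiResponse:
    status: bool
    message: str
    data: Optional[Any] = None
    detail: Optional[str] = None


def parse_response(raw: str) -> ApiResponse:
    """Parse a JSON body into the envelope, tolerating absent optional keys."""
    body = json.loads(raw)
    return ApiResponse(
        status=bool(body["status"]),
        message=body.get("message", ""),
        data=body.get("data"),
        detail=body.get("detail"),
    )


resp = parse_response('{"status": true, "message": "Service is healthy"}')
print(resp.status, resp.message)
```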

For troubleshooting and setup issues, refer to the TROUBLESHOOTING.md file in the repository root.

Sources: src/magentic_ui/backend/web/app.py

Frontend UI

Related topics: Backend API, High-Level Architecture


Overview

The Frontend UI of magentic-ui is a React-based web application through which users interact with AI agents. It communicates with the backend API at http://localhost:8081/api and supports chat conversations, plan management, MCP server configuration, and real-time visualization of task execution.

The UI is built using:

React with TypeScript for component architecture
Ant Design as the primary UI component library
Tailwind CSS for custom styling
React Markdown for rendering markdown content

Sources: frontend/package.json

Architecture Overview

The frontend application follows a component-based architecture with clear separation between layout, views, features, and common components.

graph TD
    A[App Entry] --> B[MagenticUILayout]
    B --> C[SessionManager]
    C --> D[Views]
    C --> E[SubMenus]
    D --> F[ChatView]
    D --> G[RunView]
    D --> H[PlanList]
    E --> I[SessionList]
    E --> J[PlanLibrary]
    F --> K[ChatInput]
    F --> L[RenderMessage]
    F --> M[ProgressBar]
    F --> N[DetailViewer]
    G --> K
    G --> L
    G --> M
    G --> N

Sources: frontend/src/components/layout.tsx

Application Layout

MagenticUILayout

The main layout wrapper component that provides theme configuration and session management context to all child components.

Prop	Type	Description
restricted	boolean	Whether to restrict access to authenticated users only
children	ReactNode	Child components to render within the layout

Key responsibilities:

Applies theme algorithms (dark/light mode) via Ant Design's ConfigProvider
Wraps content in AppContext for global state access
Displays a disclaimer footer: "Magentic-UI can make mistakes. Please monitor its work and intervene if necessary."

Sources: frontend/src/components/layout.tsx:1-100

SessionManager

The central orchestrator component that manages the overall application state including sessions, plans, and navigation between different views.

graph LR
    A[SessionManager] --> B[PlanList]
    A --> C[ChatView]
    A --> D[SessionEditor]
    B --> E[PlanCard]
    C --> F[ChatInput]
    C --> G[MessageList]
    C --> H[RunView]

State management includes:

activeSubMenuItem: Current navigation state
sessions: List of user sessions
currentRun: Active task execution state
selectedMcpServers: Selected MCP server configurations
editingSession: Session being edited

Sources: frontend/src/components/views/manager.tsx

Chat System

ChatView

The primary chat interface where users interact with AI agents through messages and file uploads.

sequenceDiagram
    User->>ChatInput: Enter message
    ChatInput->>ChatView: handleSubmit(query, files, plan)
    ChatView->>Backend: runTask() via API
    Backend-->>ChatView: CurrentRun status
    ChatView->>RunView: Pass run data
    RunView->>MessageList: Display messages
    MessageList->>RenderMessage: Render each message

Key Features:

Message submission with text, files, and attached plans
Real-time run status display (running, paused, awaiting_input, completed)
MCP server selection for task execution
Plan execution control (approve, deny, pause, cancel)
Sample tasks for quick start

Sources: frontend/src/components/views/chat/chat.tsx

ChatInput

A rich text input component supporting multi-line input, file attachments, and plan attachments.

Props:

Prop	Type	Description
onSubmit	Function	Callback when message is submitted
onCancel	Function	Callback to cancel current operation
runStatus	string	Current run status
inputRequest	object	Request for user input
isPlanMessage	boolean	Whether input is for plan response
onPause	Function	Callback to pause execution
onExecutePlan	Function	Callback to execute a plan
enable_upload	boolean	Enable file uploads
selectedMcpServers	array	Selected MCP servers

Features:

File drag-and-drop and paste support
File list display with remove capability
Plan attachment modal for viewing attached plans
Auto-resizing textarea
Submit button with loading state

Sources: frontend/src/components/views/chat/chatinput.tsx

RenderMessage

Component responsible for rendering different types of messages including user messages, AI responses, plans, and file previews.

Rendering Logic:

Checks if message is from user or assistant
Parses content using parseContent utility
Handles multi-modal content (text arrays)
Renders plans via PlanView component
Applies appropriate styling based on message type

// Message type detection based on metadata
if (message?.metadata?.type === "file" && message?.metadata?.files) {
  // File message handling
  const parsedFiles = JSON.parse(message.metadata.files);
}

Sources: frontend/src/components/views/chat/rendermessage.tsx

ProgressBar

Visual indicator for task execution progress with step-by-step status display.

Display States:

Task Completed: Green progress bar at 100% with "Task Completed" text
In Progress: Shows current step (e.g., "Step 2 of 5") with progress bar
Planning: Shows "Planning..." text when plan is being generated

Progress Calculation:

// Completed section width
width: hasFinalAnswer ? "100%" : (currentStep / totalSteps) * 100 + "%"

// Current step indicator position
left: (currentStep / totalSteps) * 100 + "%"
width: (1 / totalSteps) * 100 + "%"
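
The same arithmetic, written out as a standalone function for clarity. This is a Python restatement of the expressions above, not the component's code; the function name, the dictionary keys, and the guard against a zero step count are additions for the example.

```python
def progress_styles(current_step: int, total_steps: int, has_final_answer: bool) -> dict:
    """Return CSS percentage strings for the completed bar and step marker."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    fraction = current_step / total_steps
    completed = 100.0 if has_final_answer else fraction * 100
    return {
        "completed_width": f"{completed:g}%",
        "marker_left": f"{fraction * 100:g}%",
        "marker_width": f"{100 / total_steps:g}%",
    }


print(progress_styles(2, 5, has_final_answer=False))
```

The `:g` format trims trailing zeros and rounds to six significant digits, which the original string concatenation does not do.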

Sources: frontend/src/components/views/chat/progressbar.tsx

RunView

Container component that manages the detail viewer and message display during task execution.

Layout:

Split view with message list on the left
Detail viewer on the right (collapsible/expandable)
Manages image gallery, VNC preview, and input responses

Sources: frontend/src/components/views/chat/runview.tsx

Plan Management

PlanList

Displays the user's saved plans library with search, create, import, and management capabilities.

Features:

Feature	Description
Create	Create a new empty plan
Import	Import plan from JSON file
Search	Filter plans by name
Export	Download plan as JSON
Delete	Remove plan from library

Plan Card Actions:

Run Plan: Create new session with plan loaded
Edit Plan: Modify plan title and steps in modal

Sources: frontend/src/components/features/Plans/PlanList.tsx

PlanCard

Individual plan display card showing plan metadata and quick actions.

Displayed Information:

Plan title (truncated to 40 characters)
Step count summary (showing first 3 steps)
Creation timestamp with relative time formatting
Hover actions for export and delete

Modal Editing:

Editable plan title
Full plan step editor via PlanView component
Save and cancel functionality

Sources: frontend/src/components/features/Plans/PlanCard.tsx

LearnPlanButton

Button component that allows users to extract and save a reusable plan from the current conversation.

States:

State	Appearance
Disabled	Opacity 50%, cursor not-allowed
Learning	Spinner with "Learning Plan..." text
Ready	Normal button with "Learn Plan" label

Behavior:

Disabled when no sessionId or effectiveUserId
Triggers plan extraction from conversation history
Saves extracted plan to user's plan library

Sources: frontend/src/components/features/Plans/LearnPlanButton.tsx

File Handling

RenderFile

Component for displaying and managing file attachments in messages.

Features:

Detects file type from metadata
Renders appropriate preview based on file type
Provides download functionality
Supports modal view for detailed file inspection

File Type Detection:

if (message?.metadata?.type === "file" && message?.metadata?.files) {
  const parsedFiles = JSON.parse(message.metadata.files);
  // Process files to ensure correct type detection
}

FileCard

Displays individual file with icon, name, and download button.

Interactions:

Click to open file in modal
Hover to show download button
Drag-and-drop zone support

Sources: frontend/src/components/common/filerenderer.tsx

Markdown Rendering

MarkdownRender

Component for rendering markdown content with syntax highlighting and GitHub Flavored Markdown support.

Features:

Syntax highlighting via language detection from file extensions
Configurable text truncation
Indentation indicator support
Dark/light mode compatible styling

Configuration:

Option	Type	Description
truncate	boolean	Enable content truncation
maxLength	number	Maximum character length
indented	boolean	Show indentation indicator
isFilePreview	boolean	Wrap in code block

Sources: frontend/src/components/common/markdownrender.tsx

MCP Server Configuration

McpServerCard

Card component for displaying MCP (Model Context Protocol) server configurations.

Displayed Information:

Server name
Agent description (truncated to 2 lines)
Availability status

Actions:

Action	Description
Edit	Modify server configuration
Remove	Delete server from configuration

Sources: frontend/src/components/features/McpServersConfig/McpServerCard.tsx

Relevant Plans

RelevantPlans

Component for displaying plans relevant to the current conversation context.

Features:

Shows top 3 most relevant plans based on current query
Plan attachment to query
Play action to attach and run plan

Sources: frontend/src/components/views/chat/relevant_plans.tsx

State Management

The frontend uses React Context for global state management:

graph TD
    A[AppContext] --> B[User State]
    A --> C[Session State]
    A --> D[Theme State]
    A --> E[Provider State]
    
    B --> F[userId]
    B --> G[username]
    
    C --> H[sessions]
    C --> I[currentRun]
    C --> J[plans]
    
    E --> K[mcpServers]
    E --> L[selectedMcpServers]

Provider Hook

Custom hooks for accessing and manipulating application state:

Hook	Purpose
useAppContext	Access global app context
useSessions	Manage session list and operations
usePlans	Manage saved plans
useMcpServers	Manage MCP server configurations

Sources: frontend/src/hooks/provider.tsx

API Integration

The frontend communicates with the backend API at http://localhost:8081/api.

Environment Configuration:

# In .env.development
GATSBY_API_URL=http://localhost:8081/api

Sources: frontend/README.md

Key API Operations

Operation	Description
runTask	Start a new task execution
handleInputResponse	Respond to input requests
handlePause	Pause current execution
handleCancel	Cancel running task
handleApprove	Approve pending action
handleDeny	Deny pending action
handleAcceptPlan	Accept proposed plan
handleRegeneratePlan	Regenerate plan suggestions

Component Hierarchy Summary

graph TD
    Root[MagenticUILayout] --> SessionManager
    SessionManager --> Header
    SessionManager --> Sidebar
    SessionManager --> Content
    
    Sidebar --> PlanList
    Sidebar --> SessionList
    
    Content --> ChatView
    Content --> RunView
    
    ChatView --> ChatInput
    ChatView --> MessageList
    ChatView --> RelevantPlans
    
    MessageList --> RenderMessage
    RenderMessage --> PlanView
    RenderMessage --> RenderFile
    RenderMessage --> MarkdownRender
    
    ChatInput --> FileList
    ChatInput --> PlanModal
    
    RunView --> DetailViewer
    RunView --> ProgressBar

Development Guidelines

Adding New Routes

To add a new route (e.g., /about):

Create folder src/pages/about
Add index.tsx file
Follow content style from src/pages/index.tsx
Place core logic in src/components folder

Key Dependencies

Package	Version	Purpose
react	^18.x	UI framework
antd	^5.x	Component library
@ant-design/icons	^5.x	Icon library
react-markdown	^9.x	Markdown rendering
remark-gfm	^4.x	GitHub Flavored Markdown

Sources: frontend/package.json

Browser Automation

Related topics: Agent System, Docker Containers


Overview

The Browser Automation system in Magentic-UI lets agents control web browsers programmatically. Built on Playwright, it enables AI agents to navigate websites, interact with UI elements, extract content, and carry out multi-step browsing tasks autonomously.

The system supports multiple browser deployment modes including local execution, headless Docker containers, and VNC-enabled Docker containers for visual debugging. This flexibility allows the system to operate in various environments from development machines to cloud deployments.

Architecture Overview

graph TD
    A[WebSurfer Agent] --> B[PlaywrightController]
    B --> C[Browser Implementations]
    C --> D[LocalPlaywrightBrowser]
    C --> E[HeadlessDockerPlaywrightBrowser]
    C --> F[VncDockerPlaywrightBrowser]
    G[PlaywrightState] --> B
    H[SetOfMark] --> B
    I[Playwright API] --> C

Browser Implementations

The system defines a base PlaywrightBrowser class with three specialized implementations: LocalPlaywrightBrowser, HeadlessDockerPlaywrightBrowser, and VncDockerPlaywrightBrowser.

Base Architecture

All browser implementations inherit from PlaywrightBrowser which provides the core interface for browser operations. This design pattern allows for consistent API usage across different deployment scenarios.

LocalPlaywrightBrowser

The LocalPlaywrightBrowser provides direct browser control on the local machine. This implementation offers:

Full browser lifecycle management (launch, close, context management)
Synchronous and asynchronous operation support
Download folder configuration
Viewport customization
Screenshot capture capabilities

Key Features:

Direct Playwright API access without containerization overhead
Ideal for development and testing environments
Supports all Playwright browser types (Chromium, Firefox, WebKit)

Sources: src/magentic_ui/tools/playwright/browser/local_playwright_browser.py

HeadlessDockerPlaywrightBrowser

The HeadlessDockerPlaywrightBrowser runs browsers inside headless Docker containers. This approach provides:

Isolated browser execution environment
Consistent behavior across different host systems
No visual rendering overhead
Enhanced security through containerization

Docker Integration:

Automatic container image pulling on first use
Graceful container lifecycle management
Resource-efficient headless operation

Sources: src/magentic_ui/tools/playwright/browser/headless_docker_playwright_browser.py

VncDockerPlaywrightBrowser

The VncDockerPlaywrightBrowser extends headless Docker support with VNC connectivity, enabling:

Real-time visual browser observation
Interactive debugging capabilities
NoVNC support for browser-based VNC access
Remote control handover to human operators

Port Configuration:

Parameter	Description	Default
`port`	Main VNC port for container communication	5900
`novnc_port`	WebSocket port for noVNC browser access	6080

Sources: src/magentic_ui/tools/playwright/browser/vnc_docker_playwright_browser.py

PlaywrightController

The PlaywrightController serves as the central orchestrator for browser interactions. It abstracts the complexities of browser automation into a clean, agent-friendly interface.

Core Responsibilities

Page Navigation and Content Extraction:

Visit URLs with configurable timeouts
Extract visible text content from pages
Capture full-page or viewport screenshots
Analyze DOM structure for interactive elements

User Interaction Simulation:

Mouse movements to specific coordinates
Left-click and hover actions
Text input via keyboard typing
Keyboard shortcuts and key presses
Scroll operations with configurable pixels

Tab Management:

Create new browser tabs
Switch between existing tabs
Close tabs
Refresh page content

Action Schema

The controller defines a structured JSON schema for all available actions:

Action	Parameters	Description
`visit_url`	`url`	Navigate to specified URL
`web_search`	`query`	Execute web search
`type`	`text`, `coordinate`, `press_enter`, `delete_existing_text`	Type text or interact with elements
`key`	`keys`	Press keyboard keys
`mouse_move`	`coordinate`	Move mouse cursor
`left_click`	`coordinate`	Click at coordinate
`hover`	`coordinate`	Hover over element
`scroll`	`pixels`	Scroll page (positive=up, negative=down)
`select_option`	`element`, `value`	Select dropdown option
`create_tab`	`url`	Open new tab
`switch_tab`	`tab_id`	Switch to specific tab
`refresh_page`	-	Reload current page
`history_back`	-	Navigate browser history back
`sleep`	`time`	Wait specified seconds
`stop_action`	-	Stop current action sequence

Sources: src/magentic_ui/tools/playwright/playwright_controller.py

State Management

PlaywrightState

The PlaywrightState module handles serialization and persistence of browser session state.

State Components:

Component	Description
`BrowserState`	Complete snapshot of browser context
`save_browser_state()`	Serialize current state to storage
`load_browser_state()`	Restore state from storage

State Data Structure:

Current page URL and title
Tab information and active tab ID
Scroll position
Cookie and local storage data
Form input values
Screenshot history

This enables:

Session recovery after interruptions
Parallel agent execution with shared state
Checkpoint creation for long-running tasks
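
A save-and-restore round trip of this kind can be sketched with plain dataclasses and JSON. The types below are hypothetical and deliberately named differently from the module's `BrowserState`; they carry only a few of the fields listed above and do not reflect the real storage format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class TabSnapshot:
    url: str
    title: str
    scroll_y: int = 0


@dataclass
class BrowserSnapshot:
    tabs: list = field(default_factory=list)
    active_tab: int = 0


def dump_snapshot(snapshot: BrowserSnapshot) -> str:
    """Serialize the snapshot, including nested tabs, to a JSON string."""
    return json.dumps(asdict(snapshot))


def load_snapshot(raw: str) -> BrowserSnapshot:
    """Rebuild the snapshot and its tab objects from a JSON string."""
    body = json.loads(raw)
    tabs = [TabSnapshot(**tab) for tab in body["tabs"]]
    return BrowserSnapshot(tabs=tabs, active_tab=body["active_tab"])


original = BrowserSnapshot(tabs=[TabSnapshot("https://example.com", "Example", 120)])
restored = load_snapshot(dump_snapshot(original))
print(restored == original)
```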

Sources: src/magentic_ui/tools/playwright/playwright_state.py

Interactive Element Marking

Set of Mark (_set_of_mark)

The _set_of_mark module enhances web pages with visual markers that identify interactive elements. This is crucial for LLM-based agents to accurately identify and target UI elements.

Marking Strategy:

Assigns unique numeric identifiers to interactive elements
Overlays clickable numbers on buttons, links, inputs
Uses sequential numbering for easy reference
Provides coordinate mappings for action targeting

Benefits:

Enables precise element targeting by AI agents
Reduces ambiguity in element selection
Supports visual debugging and verification
Works across different page layouts and frameworks
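
The numbering and coordinate mapping can be illustrated without a browser. In this sketch, `Box` and `assign_marks` are hypothetical names, and using the center of each bounding box as the click target is an assumption; the real module works on live page elements and draws overlays on screenshots.

```python
from dataclasses import dataclass


@dataclass
class Box:
    x: float
    y: float
    width: float
    height: float


def assign_marks(boxes: list) -> dict:
    """Give each element a sequential ID and record its center point."""
    marks = {}
    for mark_id, box in enumerate(boxes, start=1):
        center = (box.x + box.width / 2, box.y + box.height / 2)
        marks[mark_id] = center
    return marks


marks = assign_marks([Box(0, 0, 100, 40), Box(10, 60, 20, 20)])
print(marks)
```

An agent can then refer to "element 2" and have that ID resolved to a coordinate for a click or hover action.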

Sources: src/magentic_ui/tools/playwright/_set_of_mark.py

WebSurfer Agent

The WebSurfer agent is the primary consumer of the browser automation system. It combines the browser implementations with an LLM to make intelligent browsing decisions.

Agent Capabilities

Autonomous Navigation:

Follow links and navigate between pages
Complete multi-step web forms
Search the web and process results
Extract structured information from pages

Content Processing:

Optical Character Recognition (OCR) for image content
Visual question answering on screenshots
Markdown rendering of page content
File download handling

Interaction Modes:

Automatic execution with configurable action limits
Step-by-step mode with human approval
Control handover for human intervention
Pause and resume capabilities

Configuration Parameters

Parameter	Type	Default	Description
`start_page`	str	Google	Initial page on browser launch
`animate_actions`	bool	False	Enable action animation
`save_screenshots`	bool	False	Persist screenshots to disk
`max_actions_per_step`	int	5	Maximum actions per reasoning step
`resize_viewport`	bool	True	Auto-resize viewport
`url_statuses`	dict	None	URL allow/reject rules
`single_tab_mode`	bool	False	Restrict to single tab

Sources: src/magentic_ui/agents/web_surfer/_web_surfer.py

Usage Patterns

Local Browser Usage

from magentic_ui.tools.playwright.browser import LocalPlaywrightBrowser

browser = LocalPlaywrightBrowser(
    headless=False,
    downloads_folder="./downloads"
)
await browser.start()

Docker-based Browser with VNC

from magentic_ui.tools.playwright.browser import VncDockerPlaywrightBrowser

browser = VncDockerPlaywrightBrowser(
    port=5900,
    novnc_port=8080
)
await browser.start()
# noVNC web client: http://localhost:8080/vnc.html (port matches novnc_port above)

Controller Integration

from magentic_ui.tools.playwright.playwright_controller import PlaywrightController

controller = PlaywrightController(browser)
await controller.async_setup()

# Execute actions
result = await controller(
    {"action": "visit_url", "url": "https://example.com"}
)

Workflow Diagram

sequenceDiagram
    participant Agent
    participant Controller
    participant Browser
    participant Page
    
    Agent->>Controller: Execute action
    Controller->>Controller: Validate parameters
    Controller->>Browser: Get page instance
    Browser->>Page: Perform action
    Page-->>Browser: Action result
    Browser-->>Controller: Browser response
    Controller->>Controller: Process result
    Controller->>Agent: Action result
    
    Note over Agent,Page: Repeat until task complete

Frontend Integration

The browser automation system integrates with the Magentic-UI frontend through:

DetailViewer Component:

Real-time browser view in the UI
Screenshot gallery display
Live action feed
Control mode overlay for human takeover

Modal Components:

BrowserModal for full-screen viewing
VNC connection handling
Pause and resume controls

Sources: frontend/src/components/views/chat/detail_viewer.tsx

Security Considerations

URL Filtering:

UrlStatusManager validates navigation targets
Configurable allow/reject lists
Prevents navigation to untrusted domains
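
A minimal sketch of allow/reject filtering, assuming exact hostname matching. The function below is hypothetical and is not the `UrlStatusManager` API, whose matching rules (for example subdomains or path prefixes) may differ.

```python
from urllib.parse import urlparse


def is_url_allowed(url: str, allowed: set, rejected: set) -> bool:
    """Reject listed hosts first; if an allow list is given, require membership."""
    host = (urlparse(url).hostname or "").lower()
    if host in rejected:
        return False
    if allowed:
        return host in allowed
    # No allow list configured: anything not rejected passes.
    return True


print(is_url_allowed("https://example.com/page", {"example.com"}, set()))
```

Checking the reject list first means an explicit rejection wins even if the same host also appears in the allow list.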

Sandbox Isolation:

Docker containers provide process isolation
Restricted network access when needed
Resource limits prevent runaway processes

Approval Workflows:

Human approval for sensitive actions
Configurable approval thresholds
Audit logging of all actions

Conclusion

The Browser Automation system is the foundation for web interaction in Magentic-UI. It combines Playwright's browser control with a controller abstraction and several deployment options (a local browser or a Docker-based browser with VNC), and it relies on URL filtering, sandbox isolation, and approval workflows so that AI agents can carry out web-based tasks under human oversight.

Sources: src/magentic_ui/tools/playwright/browser/local_playwright_browser.py

Docker Containers

Related topics: Getting Started with Magentic-UI, Browser Automation

Magentic-UI leverages Docker containers to provide isolated, reproducible environments for running browser automation and code execution tasks. This architecture enables the application to execute complex multi-agent workflows while maintaining system-level isolation and consistent runtime dependencies.

Architecture Overview

Magentic-UI uses two primary Docker images working in tandem to deliver its functionality:

graph TB
    subgraph "Magentic-UI Architecture"
        A["Frontend UI<br/>(localhost:8081)"] --> B["Backend API<br/>(Python/FastAPI)"]
        B --> C["Browser Container<br/>(VNC + Playwright)"]
        B --> D["Python Environment Container<br/>(Code Execution)"]
    end
    
    subgraph "Container Communication"
        C <-->|"WebSocket/REST"| B
        D <-->|"STDIO/REST"| B
    end

Docker Image Types

Image Type	Purpose	Key Components
`magentic-ui-browser`	Browser automation and web interaction	VNC Server, noVNC, Playwright, Chromium
`magentic-ui-python-env`	Safe Python code execution	Python runtime, isolated environment

Sources: docker/build-all.sh:1-20

Browser Docker Container

The browser container provides a full graphical environment for web surfing agents. It includes:

VNC Server: Provides virtual display access
noVNC: Web-based VNC client for browser access
Playwright: Browser automation framework for programmatic control
Chromium: Headless-capable web browser

Sources: docker/magentic-ui-browser-docker/Dockerfile

Python Environment Docker Container

The Python environment container provides a sandboxed environment for executing user-generated Python code safely:

Isolated Python runtime
Restricted file system access
Controlled network access
Independent package management

Sources: docker/magentic-ui-python-env/Dockerfile

Docker Initialization Workflow

sequenceDiagram
    participant User
    participant CLI
    participant Docker Daemon
    participant Registry
    
    User->>CLI: magentic-ui --port 8081
    CLI->>Docker Daemon: Check if Docker is running
    Docker Daemon-->>CLI: Docker Status
    
    alt Docker not running
        CLI->>User: Error: Please start Docker
    else Docker running
        CLI->>Docker Daemon: Check browser image exists
        Docker Daemon-->>CLI: Image Status
        
        alt Image missing
            CLI->>Registry: Pull browser image
            Registry-->>Docker Daemon: Image layers
            Docker Daemon->>Docker Daemon: Build image
        end
        
        CLI->>Docker Daemon: Check Python image exists
        Docker Daemon-->>CLI: Image Status
        
        alt Image missing
            CLI->>Registry: Pull Python image
            Registry-->>Docker Daemon: Image layers
            Docker Daemon->>Docker Daemon: Build image
        end
        
        CLI->>User: Magentic-UI ready
    end

Sources: src/magentic_ui/_docker.py
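
The branching in the diagram reduces to a small decision function. The sketch below is a hypothetical summary of that flow, not code from `_docker.py`; `plan_startup` and the step strings are illustrative names.

```python
def plan_startup(docker_running: bool, browser_image: bool, python_image: bool) -> list[str]:
    """Return the ordered steps that the startup flow in the diagram would take."""
    if not docker_running:
        # The flow stops early: nothing else can be checked without Docker.
        return ["error: please start Docker"]
    steps: list[str] = []
    if not browser_image:
        steps.append("pull browser image")
    if not python_image:
        steps.append("pull python image")
    steps.append("ready")
    return steps


print(plan_startup(docker_running=True, browser_image=False, python_image=True))
```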

Container Management Functions

The src/magentic_ui/_docker.py module provides core Docker management functionality:

Function	Purpose
`check_docker_running()`	Verifies Docker daemon is accessible
`check_browser_image()`	Checks if browser Docker image exists locally
`check_python_image()`	Checks if Python environment Docker image exists locally
`pull_browser_image()`	Pulls/updates the browser Docker image
`pull_python_image()`	Pulls/updates the Python environment Docker image

Sources: src/magentic_ui/_docker.py

Build Process

The build script docker/build-all.sh builds both Docker images. The build paths are relative to the docker/ directory:

# Build browser Docker image
docker build -t magentic-ui-browser ./magentic-ui-browser-docker

# Build Python environment Docker image
docker build -t magentic-ui-python-env ./magentic-ui-python-env

Sources: docker/build-all.sh

Browser Container Services

The browser container runs multiple services managed by supervisord:

graph LR
    subgraph "Browser Container Services"
        A["supervisord<br/>(Process Manager)"]
        A --> B["Xvfb<br/>(Virtual Display)"]
        A --> C["x11vnc<br/>(VNC Server)"]
        A --> D["noVNC<br/>(Web VNC)"]
        A --> E["Playwright<br/>(Browser Control)"]
    end

Service Configuration

The browser container uses supervisord.conf for service orchestration:

Process Management: Supervisord manages all background services
Auto-restart: Services automatically restart on failure
Log Management: Centralized logging configuration

Sources: docker/magentic-ui-browser-docker/supervisord.conf

Playwright Server

The Playwright server (playwright-server.js) provides HTTP API access to browser automation. It initializes the server with the browser configuration and handles browser launching, page creation, and element interaction.

Sources: docker/magentic-ui-browser-docker/playwright-server.js

Running Without Docker

For environments where Docker is unavailable, Magentic-UI supports a limited mode:

magentic-ui --run-without-docker --port 8081

Limitations in No-Docker Mode:

Feature	With Docker	Without Docker
Web Surfing	Full browser automation	Not available
Code Execution	Isolated sandbox	Not available
File Handling	Enhanced isolation	Basic support
Agent Capabilities	Complete	Reduced

Sources: README.md

Prerequisites

System Requirements

Requirement	Minimum	Recommended
Docker Version	Latest stable	Latest stable
Python	3.10+	3.11+
RAM	4GB	8GB+
Disk Space	2GB	5GB+

Platform Support

Linux: Full support with native Docker
macOS: Full support with Docker Desktop
Windows: WSL2 required for Docker support

Sources: TROUBLESHOOTING.md

Troubleshooting

Common Docker Issues

Issue	Symptom	Solution
Docker not running	"Docker is not running" error	Start Docker Desktop/daemon
Image pull failure	Timeout during first run	Run `docker/build-all.sh` manually
Port conflict	Container fails to start	Change port with `--port` flag

Verification Commands

# Check Docker is running
docker info

# Verify images exist
docker images | grep magentic-ui

# Manually build images
cd docker && sh build-all.sh

Sources: TROUBLESHOOTING.md
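
The same checks can be scripted. The following is a minimal sketch that shells out to the Docker CLI with `docker info` and `docker images --format`; the helper names `docker_is_running` and `local_images` are hypothetical and are not part of Magentic-UI.

```python
import shutil
import subprocess


def docker_is_running(timeout: float = 10.0) -> bool:
    """Return True if `docker info` exits successfully."""
    if shutil.which("docker") is None:
        return False  # Docker CLI is not installed or not on PATH
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def local_images(prefix: str = "magentic-ui") -> list[str]:
    """List local image repositories whose name contains `prefix`."""
    if not docker_is_running():
        return []
    try:
        result = subprocess.run(
            ["docker", "images", "--format", "{{.Repository}}"],
            capture_output=True,
            text=True,
            timeout=30,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    return sorted({line for line in result.stdout.splitlines() if prefix in line})


print("Docker running:", docker_is_running())
print("Images found:", local_images())
```

Both helpers return a falsy value instead of raising when Docker is missing, so the script can run on a machine without Docker installed.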

Configuration

Environment Variables

Variable	Purpose	Default
`NOVNC_PORT`	noVNC web interface port	6080
`PLAYWRIGHT_PORT`	Playwright API port	8080
`PYTHON_ENV_PORT`	Python execution port	8082
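
A minimal sketch of resolving these ports from the environment, assuming the variable names and defaults in the table. The `resolve_ports` helper and its conflict check are illustrative and are not taken from the project.

```python
import os
from typing import Mapping

# Defaults copied from the table above.
DEFAULT_PORTS = {"NOVNC_PORT": 6080, "PLAYWRIGHT_PORT": 8080, "PYTHON_ENV_PORT": 8082}


def resolve_ports(env: Mapping[str, str] = os.environ) -> dict[str, int]:
    """Read each port from `env`, fall back to the default, and reject conflicts."""
    ports: dict[str, int] = {}
    for name, default in DEFAULT_PORTS.items():
        raw = env.get(name)
        port = int(raw) if raw else default
        if not 1 <= port <= 65535:
            raise ValueError(f"{name} must be between 1 and 65535, got {port}")
        ports[name] = port
    if len(set(ports.values())) != len(ports):
        raise ValueError(f"port conflict: {ports}")
    return ports


# Override one port and keep the other defaults.
print(resolve_ports({"NOVNC_PORT": "7000"}))
```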

Workspace Configuration

The CLI manages workspace paths passed to containers:

workspace_config = {
    "internal_workspace_root": "/path/to/internal",
    "external_workspace_root": "/path/to/external",
    "inside_docker": True,
    "config": {...},
    "run_without_docker": False
}

Sources: src/magentic_ui/backend/cli.py

Security Considerations

Container Isolation

Network Isolation: Containers communicate via internal bridge network
File System Isolation: Read-only base images with volume mounts for data
Process Isolation: Separate PID namespaces

Best Practices

Run Docker as a non-root user when possible
Keep Docker images updated with the latest security patches
Use the provided workspace paths for file operations
Monitor container resource usage

Sources: docker/build-all.sh:1-20

Configuration

Related topics: Getting Started with Magentic-UI

Magentic-UI provides a multi-layered configuration system that spans both the backend (Python) and frontend (React/TypeScript) layers. The system handles environment-based settings, agent parameters, UI theming, and server configurations through YAML files, environment variables, and component-level props.

Overview

The configuration architecture in Magentic-UI can be visualized as follows:

graph TD
    A[Configuration Sources] --> B[Backend CLI]
    A --> C[Environment Variables]
    A --> D[YAML Config Files]
    A --> E[Frontend React Components]
    
    B --> F[Server Initialization]
    C --> G[API URL Configuration]
    D --> H[Agent Parameters]
    E --> I[UI Theme & Modal Settings]
    
    F --> J[Backend Server Running on Port 8081]
    G --> K[Frontend Dev Server]
    H --> L[Web Surfer Agent]
    I --> M[User Interface]

Backend Configuration

CLI Entry Point

The main CLI entry point in src/magentic_ui/_cli.py serves as the primary configuration bootstrap for the backend server. It handles argument parsing and delegates to the backend CLI module.

Key configuration parameters supported:

Parameter	Type	Description
`--host`	string	Server host address
`--port`	integer	Server port number (default: 8081)
`--config`	string	Path to YAML configuration file
`--debug`	boolean	Enable debug mode

Sources: src/magentic_ui/_cli.py
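
To make the parameter table concrete, here is a sketch of an equivalent parser written with the standard-library `argparse` module. It is an assumption-laden illustration: the real CLI may be built with a different library, and the `--host` default used here is not stated in the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the parameters in the table above."""
    parser = argparse.ArgumentParser(prog="magentic-ui")
    # The host default is an assumption; the table does not give one.
    parser.add_argument("--host", default="127.0.0.1", help="Server host address")
    parser.add_argument("--port", type=int, default=8081, help="Server port number")
    parser.add_argument("--config", default=None, help="Path to YAML configuration file")
    parser.add_argument("--debug", action="store_true", help="Enable debug mode")
    return parser


args = build_parser().parse_args(["--port", "9000", "--debug"])
print(args.host, args.port, args.config, args.debug)
```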

Web Server Configuration

The web server configuration module (src/magentic_ui/backend/web/config.py) defines the core server settings used by the FastAPI-based backend.

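from dataclasses import dataclass, field

@dataclass
class ServerConfig:
    host: str = "0.0.0.0"
    port: int = 8081
    # default_factory gives each instance its own list
    cors_origins: list[str] = field(default_factory=lambda: ["http://localhost:8000"])
    debug: bool = False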

Configuration Options:

Option	Default	Description
`host`	`"0.0.0.0"`	Bind address for the server
`port`	`8081`	HTTP port for the backend API
`cors_origins`	`["http://localhost:8000"]`	Allowed CORS origins
`debug`	`False`	Enable verbose logging and hot reload

Sources: src/magentic_ui/backend/web/config.py
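
The `cors_origins` setting only helps if it matches the origin the frontend is served from. The sketch below shows that comparison in isolation; `origin_of` and `is_origin_allowed` are hypothetical helpers, and a real FastAPI application would normally delegate this check to its CORS middleware.

```python
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    """Reduce a URL to its origin (scheme, host, and port)."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def is_origin_allowed(page_url: str, cors_origins: list[str]) -> bool:
    """Return True if the page's origin appears in the allowed list."""
    return origin_of(page_url) in cors_origins


cors_origins = ["http://localhost:8000"]
print(is_origin_allowed("http://localhost:8000/app", cors_origins))
print(is_origin_allowed("http://localhost:3000/", cors_origins))
```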

Team Manager Configuration

The teammanager.py module handles multi-agent orchestration configuration. It manages agent teams and their coordination settings.

Key configuration aspects:

Agent pool sizing
Maximum concurrent agents
Communication protocols between agents
Timeout settings for agent operations

Sources: src/magentic_ui/backend/teammanager/teammanager.py

YAML Configuration Files

fara_config.yaml

The fara_config.yaml file contains configuration for the web surfer agent, including display settings and browser automation parameters.

display_width_px: 1280
display_height_px: 720
include_input_text_key_args: false

Web Surfer Agent Parameters:

Parameter	Type	Description
`display_width_px`	integer	Browser viewport width in pixels
`display_height_px`	integer	Browser viewport height in pixels
`include_input_text_key_args`	boolean	Whether the text input action exposes the `press_enter` and `delete_existing_text` arguments

Sources: fara_config.yaml

Agent Parameter Handling

The _prompts.py module in the web surfer agent demonstrates how configuration is consumed:

def __init__(self, cfg=None):
    self.display_width_px = cfg["display_width_px"]
    self.display_height_px = cfg["display_height_px"]
    include_input_text_key_args = cfg.pop("include_input_text_key_args", False)
    if not include_input_text_key_args:
        self.parameters["properties"].pop("press_enter", None)
        self.parameters["properties"].pop("delete_existing_text", None)
    super().__init__(cfg)

Sources: src/magentic_ui/agents/web_surfer/fara/_prompts.py
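
The effect of `include_input_text_key_args` is easier to see in a self-contained form. The class below is a hypothetical stand-in, not the class from `_prompts.py`: `ToolSketch` and the contents of `BASE_PARAMETERS` are assumptions, and only the two popped property names come from the snippet above.

```python
import copy

# Hypothetical parameter schema; only the last two keys appear in the source snippet.
BASE_PARAMETERS = {
    "type": "object",
    "properties": {
        "action": {"type": "string"},
        "text": {"type": "string"},
        "press_enter": {"type": "boolean"},
        "delete_existing_text": {"type": "boolean"},
    },
}


class ToolSketch:
    """Stand-in that mirrors the configuration handling shown above."""

    def __init__(self, cfg: dict):
        cfg = dict(cfg)  # copy so pop() does not mutate the caller's dict
        self.parameters = copy.deepcopy(BASE_PARAMETERS)
        self.display_width_px = cfg["display_width_px"]
        self.display_height_px = cfg["display_height_px"]
        if not cfg.pop("include_input_text_key_args", False):
            # Hide the optional keyboard arguments from the tool schema.
            self.parameters["properties"].pop("press_enter", None)
            self.parameters["properties"].pop("delete_existing_text", None)


tool = ToolSketch(
    {"display_width_px": 1280, "display_height_px": 720, "include_input_text_key_args": False}
)
print(sorted(tool.parameters["properties"]))
```

Copying the schema per instance means that disabling the arguments for one tool does not change the schema seen by another.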

Frontend Configuration

Environment Variables

The frontend uses environment variables configured through a .env file structure. The development environment requires specific settings to connect to the backend API.

Setup Instructions:

Copy .env.default to .env.development
Set the required variables in the new file

Variable	Required Value	Description
`GATSBY_API_URL`	`http://localhost:8081/api`	Backend API endpoint
`GATSBY_WS_URL`	`ws://localhost:8081/ws`	WebSocket endpoint (if applicable)

Sources: frontend/README.md
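
A quick way to catch a mismatched API URL is to parse the environment file and compare its port with the backend port. The sketch below assumes a simple `KEY=value` file format; `parse_env` and `check_frontend_env` are hypothetical helpers, not part of the frontend tooling.

```python
from urllib.parse import urlsplit


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_frontend_env(env: dict[str, str], backend_port: int = 8081) -> list[str]:
    """Return a list of problems found in the frontend environment settings."""
    problems: list[str] = []
    api_url = env.get("GATSBY_API_URL")
    if not api_url:
        problems.append("GATSBY_API_URL is not set")
    elif urlsplit(api_url).port != backend_port:
        problems.append(f"GATSBY_API_URL does not point at port {backend_port}")
    return problems


SAMPLE = """
# .env.development
GATSBY_API_URL=http://localhost:8081/api
"""
env = parse_env(SAMPLE)
print(check_frontend_env(env))
```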

API Configuration Flow

sequenceDiagram
    participant FE as Frontend (React)
    participant API as Backend API
    participant WS as WebSocket
    
    FE->>FE: Load .env.development
    FE->>API: HTTP requests to GATSBY_API_URL
    FE->>WS: WebSocket connections
    API-->>FE: JSON responses
    WS-->>FE: Real-time updates

MCP Server Configuration

The MCP (Model Context Protocol) server configuration modal provides a UI for managing external server integrations.

Supported Connection Types:

Type	Description	Configuration Method
`SSE`	Server-Sent Events	Form-based input
`Stdio`	Standard I/O	Form-based input
`JSON`	Raw JSON Config	Direct JSON editing

Server Configuration Validation:

Field	Validation Rule
`serverName`	Required, alphanumeric characters only, max 50 characters
`serverName`	Must be unique across all servers

// Example validation logic from McpConfigModal.tsx
const serverNameError = !serverName || !/^[a-zA-Z0-9]+$/.test(serverName);
const serverNameDuplicateError = existingServers.some(
  (s) => s.name === serverName && s.id !== server?.id
);

Sources: frontend/src/components/features/McpServersConfig/McpConfigModal.tsx
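
For backend-side checks, the same rules can be ported to Python. This is a sketch rather than project code: `validate_server_name` is a hypothetical helper, the 50-character limit comes from the validation table and not from the TypeScript snippet, and servers are assumed to be dictionaries with `name` and `id` keys.

```python
import re
from typing import Optional

SERVER_NAME_PATTERN = re.compile(r"[a-zA-Z0-9]+")
MAX_NAME_LENGTH = 50  # limit stated in the validation table


def validate_server_name(
    name: str, existing: list[dict], current_id: Optional[str] = None
) -> list[str]:
    """Return validation errors for an MCP server name (empty list if valid)."""
    errors: list[str] = []
    if not name or not SERVER_NAME_PATTERN.fullmatch(name):
        errors.append("name is required and must be alphanumeric")
    elif len(name) > MAX_NAME_LENGTH:
        errors.append(f"name must be at most {MAX_NAME_LENGTH} characters")
    # A server may keep its own name when it is being edited.
    if any(s["name"] == name and s["id"] != current_id for s in existing):
        errors.append("name must be unique")
    return errors


servers = [{"id": "1", "name": "github"}]
print(validate_server_name("github", servers))
print(validate_server_name("github", servers, current_id="1"))
print(validate_server_name("my server", servers))
```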

UI Theme Configuration

Theme Application

The main layout component applies theme settings based on user preferences and system defaults:

<ConfigProvider
  theme={{
    algorithm: darkMode === "dark" 
      ? theme.darkAlgorithm 
      : theme.defaultAlgorithm,
  }}
>
  {/* application content */}
</ConfigProvider>

Theme Options:

Mode	Algorithm	CSS Classes
Light	`defaultAlgorithm`	`bg-white`, `text-gray-900`
Dark	`darkAlgorithm`	`bg-gray-900`, `text-gray-100`

Sources: frontend/src/components/layout.tsx

Component-Level Styling

Components use Tailwind CSS utility classes for configuration of:

Color schemes (bg-magenta-800, text-blue-400)
Spacing (p-3, mt-4, mb-2)
Typography (text-sm, font-medium)
Transitions (transition-colors, transition-all duration-300)

Configuration Workflow

graph LR
    A[Start Application] --> B{Backend or Frontend?}
    
    B -->|Backend| C[Load CLI Args]
    B -->|Backend| D[Parse YAML Config]
    B -->|Backend| E[Initialize Server]
    
    B -->|Frontend| F[Load .env.development]
    B -->|Frontend| G[Build API URL]
    B -->|Frontend| H[Render UI Components]
    
    C --> E
    D --> E
    F --> G
    G --> H
    E --> I[Server Ready]
    H --> J[User Interface Ready]

Configuration Files Summary

File Path	Purpose	Format
`src/magentic_ui/_cli.py`	Main CLI entry point	Python
`src/magentic_ui/backend/cli.py`	Backend CLI logic	Python
`src/magentic_ui/backend/web/config.py`	Web server settings	Python (dataclass)
`src/magentic_ui/backend/teammanager/teammanager.py`	Agent orchestration	Python
`fara_config.yaml`	Web surfer agent settings	YAML
`.env.development`	Frontend environment	Environment Variables

Best Practices

Environment Isolation: Keep development and production environment files separate
Validation: Always validate MCP server names against alphanumeric patterns
CORS Settings: Ensure backend CORS configuration matches frontend origin
Port Consistency: The frontend expects the backend at http://localhost:8081/api
Theme Persistence: User theme preferences should be stored in local storage or user profile

Sources: src/magentic_ui/_cli.py

Doramagic Pitfall Log

Doramagic extracted 13 source-linked risk signals. Review them before installing or handing real data to the project.

1. Installation risk: Create tutorials and documentation for the codebase

Severity: medium
Finding: Installation risk is backed by a source signal: Create tutorials and documentation for the codebase. Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/microsoft/magentic-ui/issues/154

2. Installation risk: Support Podman in place of Docker

Severity: medium
Finding: Installation risk is backed by a source signal: Support Podman in place of Docker. Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/microsoft/magentic-ui/issues/312

3. Installation risk: magentic-ui can't display all the html element

Severity: medium
Finding: Installation risk is backed by a source signal: magentic-ui can't display all the html element. Treat it as a review item until the current version is checked.
User impact: First-time setup may fail or require extra isolation and rollback planning.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/microsoft/magentic-ui/issues/362

4. Configuration risk: Refreshing or restart the web app will make the current Session unavailable

Severity: medium
Finding: Configuration risk is backed by a source signal: Refreshing or restart the web app will make the current Session unavailable. Treat it as a review item until the current version is checked.
User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/microsoft/magentic-ui/issues/336

5. Capability assumption: README/documentation is current enough for a first validation pass.

Severity: medium
Finding: README/documentation is current enough for a first validation pass.
User impact: The project should not be treated as fully validated until this signal is reviewed.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: capability.assumptions | github_repo:978331188 | https://github.com/microsoft/magentic-ui | README/documentation is current enough for a first validation pass.

6. Project risk: Why not conduct a requirement analysis before the plan?

Severity: medium
Finding: Project risk is backed by a source signal: Why not conduct a requirement analysis before the plan?. Treat it as a review item until the current version is checked.
User impact: The project should not be treated as fully validated until this signal is reviewed.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/microsoft/magentic-ui/issues/321

7. Maintenance risk: Sticked at click the “Shopping Cart” icon and cannot goto check out page

Severity: medium
Finding: Maintenance risk is backed by a source signal: Sticked at click the “Shopping Cart” icon and cannot goto check out page. Treat it as a review item until the current version is checked.
User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/microsoft/magentic-ui/issues/360

8. Maintenance risk: Maintainer activity is unknown

Severity: medium
Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: evidence.maintainer_signals | github_repo:978331188 | https://github.com/microsoft/magentic-ui | last_activity_observed missing

9. Security or permission risk: no_demo

Severity: medium
Finding: no_demo
User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: downstream_validation.risk_items | github_repo:978331188 | https://github.com/microsoft/magentic-ui | no_demo; severity=medium

10. Security or permission risk: no_demo

Severity: medium
Finding: no_demo
User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: risks.scoring_risks | github_repo:978331188 | https://github.com/microsoft/magentic-ui | no_demo; severity=medium

11. Security or permission risk: Settings redesign

Severity: medium
Finding: Security or permission risk is backed by a source signal: Settings redesign. Treat it as a review item until the current version is checked.
User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: Source-linked evidence: https://github.com/microsoft/magentic-ui/issues/227

12. Maintenance risk: issue_or_pr_quality=unknown

Severity: low
Finding: issue_or_pr_quality=unknown.
User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
Evidence: evidence.maintainer_signals | github_repo:978331188 | https://github.com/microsoft/magentic-ui | issue_or_pr_quality=unknown

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using Magentic-UI Capability Pack with real data or production workflows.

Create tutorials and documentation for the codebase - github / github_issue
Settings redesign - github / github_issue
Support Podman in place of Docker - github / github_issue
Why not conduct a requirement analysis before the plan? - github / github_issue
Refreshing or restart the web app will make the current Session unavailable - github / github_issue
Sticked at click the “Shopping Cart” icon and cannot goto check out page - github / github_issue
magentic-ui can't display all the html element - github / github_issue
Magentic-UI 0.1.5: "Tell Me When" tasks enabled by SentinelSteps - github / github_release
Magentic-UI 0.1.2 - github / github_release
Magentic-UI 0.1.1 - github / github_release
Magentic-UI 0.1.0 - github / github_release
Magentic-UI v0.0.6 - github / github_release

Source: Project Pack community evidence and pitfall evidence