gpt-researcher Manual Preview

Doramagic Project Pack · Human Manual

gpt-researcher

An autonomous agent that conducts deep research on any data using any LLM providers

Overview and Core Architecture

Purpose and Scope

GPT Researcher is an autonomous agent that performs deep, multi-source research and produces long-form reports (typically 5–6 pages) in Markdown, PDF, and DOCX formats. As described in README.md, it works by "creating a task-specific agent based on a research query," generating objective questions, crawling trusted sources, summarizing them with citations, and aggregating results into a final report.

The project ships several execution surfaces sharing the same underlying engine:

Surface	Entry Point	Source
Python library	`GPTResearcher` class	README.md
REST/WebSocket backend	FastAPI server (`/ws`)	README.md
Lightweight static frontend	FastAPI-served HTML/CSS/JS	frontend/README.md
NextJS frontend	Node.js app on port 3000	frontend/README.md
MCP server	`gptr-mcp` (Claude integration)	mcp-server/README.md
Multi-agent orchestrations	LangGraph & AG2 workflows	multi_agents/README.md
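
As a minimal sketch of the Python library surface, the snippet below follows the README's quickstart pattern. It assumes the gpt-researcher package is installed and that provider keys such as OPENAI_API_KEY and TAVILY_API_KEY are set. The `run_research` helper name and the example query are illustrative, not part of the project.

```python
async def run_research(query: str) -> str:
    """Run one research pass and return the report text."""
    # Imported lazily so this module still loads without the package installed.
    from gpt_researcher import GPTResearcher

    researcher = GPTResearcher(query=query, report_type="research_report")
    await researcher.conduct_research()     # plan, retrieve, scrape, curate
    return await researcher.write_report()  # draft the final report


# Usage (needs network access and valid API keys):
#   import asyncio
#   print(asyncio.run(run_research("Why is Nvidia stock going up?")))
```

The lazy import keeps the example importable in environments where the package is missing; in application code a top-level import is the more usual choice.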

High-Level Architecture

The core research loop is split into two phases: planning/scraping (a single GPTResearcher instance) and writing, optionally followed by a multi-agent review cycle for long-form outputs.

flowchart LR
    User[User Query] --> WS[/ws WebSocket/]
    WS --> GR[GPTResearcher Agent]
    GR -->|plan| Planner[Question Generator]
    GR -->|crawl| Scrapers[Scrapers / Retriever]
    Scrapers --> Ctx[(Context Store)]
    Ctx --> Writer[write_report]
    Writer -->|single-agent| Out[Markdown / PDF / DOCX]
    Writer -->|multi-agent| MA[LangGraph / AG2 Workflow]
    MA --> Editor --> Researcher --> Reviewer --> Reviser --> Publisher
    Publisher --> Out

The single-agent path lives under backend/ and uses self.context to hold aggregated research. Community issue #1572 reports that when self.context is empty ([]), write_report may still emit confident-looking but fabricated sources, so callers should guard against empty contexts before invoking the writer.
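
One way to apply that guard is sketched below. It assumes the context is a sequence of text snippets and that the researcher object exposes `context` and an async `write_report()`, as the text describes. `ResearcherLike`, `has_usable_context`, `write_report_if_grounded`, and `StubResearcher` are hypothetical names introduced only for this illustration.

```python
import asyncio
from typing import Optional, Protocol, Sequence


class ResearcherLike(Protocol):
    """Structural type for any object exposing the two members used below."""

    context: Sequence[str]

    async def write_report(self) -> str: ...


def has_usable_context(context: Sequence[str]) -> bool:
    """True when at least one context entry contains non-whitespace text."""
    return any(isinstance(item, str) and item.strip() for item in context)


async def write_report_if_grounded(researcher: ResearcherLike) -> Optional[str]:
    """Call write_report only when research actually produced context."""
    if not has_usable_context(researcher.context):
        return None  # caller decides how to surface "no sources found"
    return await researcher.write_report()


class StubResearcher:
    """Stand-in used only to exercise the guard; not part of GPT Researcher."""

    def __init__(self, context):
        self.context = context

    async def write_report(self) -> str:
        return f"report built from {len(self.context)} source snippet(s)"


print(asyncio.run(write_report_if_grounded(StubResearcher([]))))
print(asyncio.run(write_report_if_grounded(StubResearcher(["Snippet from a scraped page"]))))
```

Returning None rather than raising leaves the decision to the caller, which might retry with a different retriever or report that no sources were found.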

The deep-research variant extends the loop recursively, as shown in backend/report_type/deep_research/example.py, where deep_research() calls generate_serp_queries() and iterates with configurable breadth and depth, accumulating learnings, citations, and visited_urls across branches.

Multi-Agent Workflow

For higher-quality, longer outputs the project delegates planning, drafting, review, and publishing to a graph of cooperating agents. Per multi_agents/README.md, the pipeline is:

Browser — runs an initial GPT Researcher pass to gather raw research.
Editor — plans an outline; the prompt in multi_agents/agents/editor.py requests a JSON of title and sections based on the initial research summary.
Researcher → Reviewer → Reviser — executed in parallel per outline section. The ReviewerAgent.review_draft() returns revision notes or None once a draft is acceptable.
Writer — compiles an introduction, table of contents, conclusion, and an APA-formatted sources list using the schema in multi_agents/agents/writer.py.
Publisher — emits the report to PDF/DOCX/Markdown formats.

Shared draft state across these nodes is defined in backend/memory/draft.py as a DraftState TypedDict carrying task, topic, draft, review, and revision_notes.
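
The shape of that state and the review loop around it can be sketched as follows. The field names come from the text above; the value types, the `max_rounds` limit, and the toy reviewer and reviser functions are assumptions standing in for the LLM-backed agents.

```python
from typing import Callable, Optional, TypedDict


class DraftState(TypedDict, total=False):
    # Field names follow the description above; value types are assumptions.
    task: dict
    topic: str
    draft: str
    review: Optional[str]
    revision_notes: Optional[str]


def review_until_accepted(
    state: DraftState,
    review: Callable[[DraftState], Optional[str]],
    revise: Callable[[DraftState], str],
    max_rounds: int = 3,  # hypothetical safety limit
) -> DraftState:
    """Alternate review and revise until the reviewer returns None."""
    for _ in range(max_rounds):
        notes = review(state)
        state["review"] = notes
        if notes is None:  # reviewer accepted the draft
            break
        state["draft"] = revise(state)
        state["revision_notes"] = notes
    return state


def toy_review(state: DraftState) -> Optional[str]:
    """Toy reviewer: accept only drafts that carry a citation marker."""
    return None if "[cited]" in state["draft"] else "Add citations."


def toy_revise(state: DraftState) -> str:
    """Toy reviser: append the marker the reviewer asks for."""
    return state["draft"] + " [cited]"


final = review_until_accepted(
    {"task": {"query": "demo"}, "topic": "Intro", "draft": "First draft."},
    toy_review,
    toy_revise,
)
print(final["draft"])   # First draft. [cited]
print(final["review"])  # None
```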

An alternative AG2-based implementation is provided under multi_agents_ag2/, which mirrors the same Editor/Researcher/Reviewer/Reviser roles (see multi_agents_ag2/README.md and multi_agents_ag2/agents/editor.py).

Configuration, Frontend, and Deployment

Behavior is driven by environment variables and a task.json (for the multi-agent CLI). The task schema includes query, model, max_sections, max_plan_revisions, source (web or local, with DOC_PATH for local files), follow_guidelines, guidelines, and verbose — documented in multi_agents_ag2/README.md.
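
A small loader can catch malformed task files before a run starts. The sketch below uses only the field names listed above; the example values, the choice of which keys are treated as required, and the `load_task` helper are assumptions for illustration.

```python
import json

REQUIRED_KEYS = {"query", "model", "max_sections", "source"}  # assumed minimum
ALLOWED_SOURCES = {"web", "local"}

# Illustrative values only; adjust to your own task.
EXAMPLE_TASK_JSON = """
{
  "query": "Is AI in a hype cycle?",
  "model": "gpt-4o",
  "max_sections": 3,
  "max_plan_revisions": 1,
  "source": "web",
  "follow_guidelines": true,
  "guidelines": ["The report MUST be written in APA format"],
  "verbose": true
}
"""


def load_task(raw: str) -> dict:
    """Parse a task.json payload and apply a few basic sanity checks."""
    task = json.loads(raw)
    missing = REQUIRED_KEYS - task.keys()
    if missing:
        raise ValueError(f"task is missing keys: {sorted(missing)}")
    if task["source"] not in ALLOWED_SOURCES:
        raise ValueError("source must be 'web' or 'local'")
    if task.get("follow_guidelines") and not task.get("guidelines"):
        raise ValueError("follow_guidelines is set but guidelines is empty")
    return task


task = load_task(EXAMPLE_TASK_JSON)
print(task["query"], task["max_sections"])
```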

Two frontends are supported (frontend/README.md):

Static — uvicorn main:app on port 8000; no Node toolchain required.
NextJS — npm run dev on port 3000, paired with the FastAPI backend.

Docker Compose brings both up by default. Recent releases (v3.4.2–v3.5.0) added ModelsLab image generation and o3-mini reasoning support, fixed NameErrors in the multi-agent run_research_task, and made PyMuPDF iterate over all PDF pages.

Security and Reliability Notes

Community-reported issues that affect how the architecture should be deployed:

SSRF via /ws (#1794) — the WebSocket accepts a source_urls list with no auth or URL validation; an unauthenticated network attacker can probe internal addresses. Operate the backend behind a trusted boundary or filter URLs.
Arbitrary local PDF read (#1805) — PyMuPDFScraper treats any .pdf entry in source_urls as a local path when it is not a URL, enabling local file disclosure. Disable the PyMuPDFScraper non-URL branch or restrict source_urls to verified origins in production.
Hallucinated sources on empty context (#1572) — defensively reject empty research contexts before calling write_report.
Docs site breakage (#1807) — the marketing site may currently render a client-side exception on certain anchors; use the GitHub README for canonical installation steps.

Research Pipeline: Retrievers, Scrapers, Context, and Deep Research

Purpose and Scope

The research pipeline is the core execution loop of GPT Researcher that turns a natural-language task into a sourced, multi-page report. It is composed of modular "skills" located under gpt_researcher/skills/ and orchestrated by a master GPTResearcher agent. The pipeline is responsible for planning sub-questions, retrieving candidate sources, scraping their content, curating a focused context, and finally writing a long-form report with citations.

The README states the design goals succinctly: *"Create a task-specific agent based on a research query. Generate questions that collectively form an objective opinion on the task. Use a crawler agent for gathering information for each question. Summarize and source-track each resource. Filter and aggregate summaries into a final research report."* Source: README.md

The pipeline supports two execution modes:

Mode	Entry point	Output
Standard (detailed) report	`DetailedReport.run()`	Introduction, TOC, subtopic reports, conclusion, references
Deep research	`deep_research()` in `deep_research.py`	Tree-like iterative exploration with `learnings` and `visited_urls`

High-Level Architecture

The pipeline follows a four-stage data flow. Each stage is a separate skill module that can be swapped or extended.

flowchart LR
    A[Query] --> B[Researcher<br/>generate sub-questions]
    B --> C[Retriever<br/>SERP search]
    C --> D[Scraper<br/>fetch & extract]
    D --> E[Context Manager<br/>curate & filter]
    E --> F[Writer<br/>draft report]
    F --> G[Cited Report]

Source: gpt_researcher/skills/researcher.py, gpt_researcher/skills/context_manager.py, gpt_researcher/skills/writer.py

Retrievers and Scrapers

Retrievers

The Researcher skill handles query planning and source discovery. It calls a configurable retriever (Tavily, SerpAPI, Bing, Google, DuckDuckGo, Searx, etc., selected via the RETRIEVER environment variable) to fetch URLs relevant to each sub-question. The retrieved URLs are then deduplicated and given lower priority than any caller-supplied source_urls before scraping. Source: gpt_researcher/skills/researcher.py
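
One plausible reading of that merge step is sketched below: caller-supplied URLs come first, retrieved URLs follow, and duplicates are dropped. The `merge_urls` helper and its trailing-slash normalization are assumptions, not the project's implementation.

```python
from typing import Iterable, List


def merge_urls(source_urls: Iterable[str], retrieved_urls: Iterable[str]) -> List[str]:
    """Return caller-supplied URLs first, then retrieved ones, without duplicates."""
    seen = set()
    ordered = []
    for url in [*source_urls, *retrieved_urls]:
        cleaned = url.strip()
        key = cleaned.rstrip("/")  # crude normalization (assumption)
        if key and key not in seen:
            seen.add(key)
            ordered.append(cleaned)
    return ordered


print(merge_urls(
    ["https://a.example/x"],
    ["https://b.example/", "https://a.example/x/", "https://b.example"],
))
```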

Scrapers

Scraping is performed by the Browser skill, which dispatches to a backend chosen via the SCRAPER setting. Backends include bs (BeautifulSoup, default), browser (Selenium), nodriver, tavily_extract, firecrawl, and pymupdf for PDF files. The PyMuPDFScraper is special-cased: any URL ending in .pdf is routed through it, and its non-URL branch treats the value as a local filesystem path — a behavior that has security implications discussed below. Source: gpt_researcher/skills/browser.py, Issue #1805

Context Curation

Once scraped, raw page text is handed to the ContextManager and Curator skills. The curator groups, ranks, and trims content to fit a configurable context-window budget (controlled by TOTAL_WORDS and similar env vars), keeping only the snippets most relevant to the original sub-questions. The resulting self.context list is the sole input passed to the writer. Source: gpt_researcher/skills/context_manager.py, gpt_researcher/skills/curator.py
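
The budget-trimming idea can be illustrated with a toy curator. This sketch ranks snippets by simple keyword overlap with the question, which is only a stand-in for whatever relevance scoring the real skills use; the `curate` function and its `total_words` parameter are hypothetical.

```python
from typing import List


def curate(snippets: List[str], question: str, total_words: int) -> List[str]:
    """Keep the most question-relevant snippets that fit within a word budget."""
    terms = {w.lower() for w in question.split()}

    def overlap(snippet: str) -> int:
        words = {w.lower().strip(".,") for w in snippet.split()}
        return len(terms & words)

    kept, used = [], 0
    # sorted() is stable, so equally relevant snippets keep their original order.
    for snippet in sorted(snippets, key=overlap, reverse=True):
        size = len(snippet.split())
        if used + size <= total_words:
            kept.append(snippet)
            used += size
    return kept


snippets = [
    "Solar panels convert sunlight into electricity.",
    "The stock market closed higher today.",
    "Panel efficiency depends on temperature and sunlight angle.",
]
print(curate(snippets, "how efficient are solar panels in sunlight", total_words=14))
```

With a budget of 14 words the two energy-related snippets fit and the unrelated one is dropped.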

The detailed-report orchestrator also maintains a global_context, global_written_sections, and global_urls set so that subtopic reports stay coherent with the introduction and the final references list. Source: backend/report_type/detailed_report/detailed_report.py

Writing and Multi-Agent Coordination

The Writer skill produces the final report by combining the curated context with a system prompt and emitting Markdown that includes hyperlinks to visited_urls. In the multi-agent variant, the flow is split across an EditorAgent (outline planning), parallel ResearchAgent / ReviewerAgent / ReviserAgent runs per section, a WriterAgent for introduction/conclusion, and a Publisher for PDF/Docx/Markdown export. Source: multi_agents/agents/editor.py, multi_agents/agents/writer.py, multi_agents/README.md
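
The per-section parallelism described above maps naturally onto asyncio. The sketch below is not the LangGraph workflow itself; `research_section` is a hypothetical placeholder for one Researcher, Reviewer, and Reviser pass over a single outline section.

```python
import asyncio
from typing import Dict, List


async def research_section(topic: str) -> Dict[str, str]:
    """Placeholder for one research, review, and revise pass on a section."""
    await asyncio.sleep(0)  # yield control, as a real LLM call would
    return {"topic": topic, "draft": f"Draft for {topic}"}


async def run_sections(sections: List[str]) -> List[Dict[str, str]]:
    """Run every outline section concurrently; results keep outline order."""
    return await asyncio.gather(*(research_section(s) for s in sections))


results = asyncio.run(run_sections(["Background", "Market impact"]))
print([r["topic"] for r in results])  # ['Background', 'Market impact']
```

Because asyncio.gather returns results in the order the awaitables were passed, the writer can assemble sections in outline order regardless of which one finishes first.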

A shared ResearchState TypedDict carries task, sections, research_data, headers, and sources between agents. Source: backend/memory/research.py

Deep Research Mode

Deep Research is a recursive, tree-like extension of the standard pipeline. The DeepResearchSkill generates multiple SERP queries in parallel (breadth parameter), runs them, then for each result calls process_serp_result to extract learnings and followUpQuestions. The follow-up questions are recursively fed back into the same function up to depth levels, accumulating a shared visited_urls set and a citations map keyed by URL. Source: gpt_researcher/skills/deep_research.py, backend/report_type/deep_research/example.py
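
The recursion can be sketched in a few lines. This simplified version is sequential and takes a pluggable `search` callable instead of real SERP and LLM calls, so it shows only how breadth, depth, and the shared accumulators interact. `ResearchResult`, the tuple shape returned by `search`, and `fake_search` are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set, Tuple

# search(query) returns (url, learning, follow_up_question) tuples.
SearchFn = Callable[[str], List[Tuple[str, str, str]]]


@dataclass
class ResearchResult:
    learnings: List[str] = field(default_factory=list)
    visited_urls: Set[str] = field(default_factory=set)
    citations: Dict[str, str] = field(default_factory=dict)  # url -> learning


def deep_research(
    query: str,
    breadth: int,
    depth: int,
    search: SearchFn,
    result: Optional[ResearchResult] = None,
) -> ResearchResult:
    """Explore up to `breadth` results per query, recursing `depth` levels."""
    if result is None:
        result = ResearchResult()
    if depth <= 0:
        return result
    for url, learning, follow_up in search(query)[:breadth]:
        if url in result.visited_urls:
            continue  # skip pages already seen on another branch
        result.visited_urls.add(url)
        result.learnings.append(learning)
        result.citations[url] = learning
        deep_research(follow_up, breadth, depth - 1, search, result)
    return result


def fake_search(query: str) -> List[Tuple[str, str, str]]:
    """Deterministic stand-in for a SERP query plus insight extraction."""
    return [
        (f"https://example.com/{query}/{i}", f"learning about {query} #{i}", f"{query}.{i}")
        for i in range(3)
    ]


out = deep_research("q", breadth=2, depth=2, search=fake_search)
print(len(out.visited_urls), len(out.learnings))  # 6 6
```

With breadth 2 and depth 2, the top level visits two pages and each of their follow-up questions visits two more, giving six in total.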

A reasoning model (e.g. o3-mini at ReasoningEfforts.High) is used to extract insights, while a cheaper model handles SERP generation. The README estimates roughly five minutes and ~$0.4 per deep-research run.

Common Failure Modes and Operational Notes

Source hallucination on empty context. When self.context ends up empty (for example because every retriever/scraper call failed), write_report still asks the LLM to produce a report, which can result in fabricated citations. Mitigation: surface empty context to the caller and skip the writing step. See Issue #1572.
SSRF via source_urls. The /ws WebSocket endpoint accepts a source_urls list with no authentication or URL validation, allowing unauthenticated network attackers to coerce the scraper into making outbound requests. See Issue #1794.
Local PDF read via PyMuPDFScraper. Combined with the SSRF issue above, a source_urls entry that ends in .pdf and is not a URL is loaded as a local file path by PyMuPDFLoader, enabling arbitrary local PDF read. See Issue #1805.
Retriever reliability. Final-report quality is bounded by the underlying SERP provider; some users have requested pluggable backends (e.g. serpbase.dev). See Issue #1797.
Scraper weight. JS-rendering backends (browser, nodriver) require a full Chromium install, which has motivated lighter-weight options. See Issue #1800.

Extensions: MCP, Multi-Agent, Image Generation, Local Documents, and LLM Providers

Overview

GPT Researcher ships as a modular research agent that can be extended along five major axes: Model Context Protocol (MCP) tooling, multi-agent orchestration, image generation, local document ingestion, and pluggable LLM providers. Each extension is implemented as a discrete module under the repository root or inside gpt_researcher/, allowing adopters to enable only the capabilities they need. As stated in the main README, the project provides "a full suite of customization options to create tailor made and domain specific research agents" (Source: README.md).

flowchart LR
    U[User / Client] --> R[GPTResearcher Core]
    R --> MCP[MCP Module]
    R --> MA[Multi-Agent<br/>LangGraph / AG2]
    R --> IMG[Image Generation<br/>ModelsLab, Gemini]
    R --> LOC[Local Documents<br/>PyMuPDFScraper, DOCX]
    R --> LLM[LLM Provider<br/>OpenAI, etc.]
    MA --> R
    LLM --> R

MCP (Model Context Protocol) Extension

MCP enables GPT Researcher to connect to external tool servers via a standardized protocol. The project exposes MCP in two places:

The standalone MCP server has moved to its own repository at assafelovic/gptr-mcp. It exposes a resource (research_resource) and tools (deep_research, quick_search, write_report, get_research_sources, get_research_context) (Source: mcp-server/README.md).
An in-tree client integration lives under gpt_researcher/mcp/ and contains four cooperating components: client.py (connection management via MultiServerMCPClient), tool_selector.py (LLM-driven tool selection with a pattern-matching fallback), research.py (MCPResearchSkill, which binds selected tools to an LLM), and streaming.py (WebSocket streaming and structured logging) (Source: gpt_researcher/mcp/README.md).

The integration supports stdio, WebSocket, and HTTP transports, handles automatic cleanup, and limits the number of tools returned per query to prevent context overhead. A typical configuration passes a command and args for a local server, or a URL for remote transports (Source: gpt_researcher/mcp/README.md).
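
To make the local-versus-remote distinction concrete, the sketch below infers a transport from a config entry. The dictionary keys (`name`, `command`, `args`, `url`), the server names, and the `infer_transport` helper are hypothetical; check gpt_researcher/mcp/README.md for the keys the project actually expects.

```python
from typing import Any, Dict


def infer_transport(config: Dict[str, Any]) -> str:
    """Guess the transport for one MCP server config entry."""
    if config.get("command"):
        return "stdio"  # a local server launched as a subprocess
    url = str(config.get("url", ""))
    if url.startswith(("ws://", "wss://")):
        return "websocket"
    if url.startswith(("http://", "https://")):
        return "http"
    raise ValueError("config needs either 'command' or a ws(s)/http(s) 'url'")


# Hypothetical entries: one local server, one remote server.
local_server = {"name": "files", "command": "python", "args": ["my_mcp_server.py"]}
remote_server = {"name": "search", "url": "https://mcp.example.com/mcp"}

print(infer_transport(local_server))   # stdio
print(infer_transport(remote_server))  # http
```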

Multi-Agent Extension

The multi-agent extension implements the STORM-inspired pipeline described in the README, coordinating specialized agents rather than relying on a single researcher. Two implementations are shipped:

LangGraph Implementation (`multi_agents/`)

The LangGraph pipeline runs: Browser → Editor → (Researcher ↔ Reviewer ↔ Reviser) per section → Writer → Publisher. The Browser performs initial research, the Editor plans the outline (in editor.py), and each outline section is researched, reviewed against guidelines (reviewer.py), revised, and finally compiled into multi-format output (Source: multi_agents/README.md). The Editor generates at most max_sections headers that cover subtopics only, with no introduction, conclusion, or references (Source: multi_agents/agents/editor.py). The Reviewer returns None when the draft satisfies all guideline criteria and otherwise emits revision notes for the Reviser (Source: multi_agents/agents/reviewer.py).

AG2 Implementation (`multi_agents_ag2/`)

The AG2 port supports the same task configuration schema (query, model, source, follow_guidelines, guidelines, verbose) and adds DOC_PATH for local document research (Source: multi_agents_ag2/README.md).

Both implementations can export reports to PDF and DOCX via shared utilities in backend/utils.py (write_md_to_pdf, write_md_to_docx) and multi_agents/agents/utils/file_formats.py (Source: backend/utils.py, multi_agents/agents/utils/file_formats.py).

Image Generation Extension

Image generation is documented as a top-level feature in the README and includes two modes:

Smart image scraping and filtering for relevant visuals in the final report.
AI-generated inline images using Google Gemini (Nano Banana) for visual illustrations.

Release v3.5.0 added the ModelsLab image generation provider (Source: README.md). PDF export pre-processes Markdown image references (e.g. /outputs/images/...) into absolute file:// paths that WeasyPrint can resolve (Source: backend/utils.py).
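
The path rewriting step can be approximated as follows. This is a sketch of the general technique rather than the code in backend/utils.py; the regular expression, the `absolutize_image_paths` helper, and the /srv/app project root are assumptions, and POSIX-style paths are assumed throughout.

```python
import posixpath
import re
from urllib.parse import quote

# Matches Markdown images whose target starts with /outputs/images/.
IMAGE_REF = re.compile(r"(!\[[^\]]*\]\()(/outputs/images/[^)\s]+)(\))")


def absolutize_image_paths(markdown: str, project_root: str) -> str:
    """Rewrite /outputs/images/... references to absolute file:// URIs."""

    def replace(match: "re.Match[str]") -> str:
        absolute = posixpath.join(project_root, match.group(2).lstrip("/"))
        return f"{match.group(1)}file://{quote(absolute)}{match.group(3)}"

    return IMAGE_REF.sub(replace, markdown)


print(absolutize_image_paths("![chart](/outputs/images/chart.png)", "/srv/app"))
# ![chart](file:///srv/app/outputs/images/chart.png)
```

References that do not start with /outputs/images/, such as remote https:// images, are left untouched.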

Local Documents Extension

Local research is supported by setting source to "local" and providing a DOC_PATH environment variable (Source: multi_agents_ag2/README.md). PDF ingestion is handled by PyMuPDFScraper, which after v3.4.3 reads all PDF pages instead of only the first page (Source: v3.4.3 release notes). Community discussions (issues #1794 and #1805) highlight that the WebSocket entrypoint accepts a caller-supplied source_urls list without authentication, and that .pdf entries in that list are routed to PyMuPDFScraper's local-file branch. This is an SSRF and arbitrary local read risk that operators should mitigate when exposing the server.

LLM Providers

LLM access is abstracted through the LLM_PROVIDER and MODEL environment variables. Deep research uses a reasoning model (O3_MINI_MODEL) with a configurable ReasoningEfforts value (e.g. High) for the analysis step that converts raw context into structured learnings and follow-up questions (Source: backend/report_type/deep_research/example.py). The README lists OpenAI as the default LLM provider and Tavily as the default web retriever, while the v3.5.0 release notes confirm that additional retrievers and models are now supported (Source: README.md).

Backend Server, Frontend, Deployment, and Security

Overview

GPT Researcher is delivered as a multi-component application: a Python backend service, one or more optional frontends, an MCP server for assistant integrations, and multi-agent orchestration modules. This page documents the runtime surfaces (the FastAPI/WebSocket backend, the FastAPI-served static frontend, the NextJS frontend, and the MCP server), the supported deployment paths (local install and Docker Compose), and the security posture of the public WebSocket entrypoint as observed in the source and community reports.

Backend Server

The backend is a FastAPI + Uvicorn service that exposes both REST and WebSocket surfaces. Dependencies are listed in backend/requirements.txt, mostly as minimum-version constraints, and include fastapi>=0.104.1, uvicorn>=0.24.0, websockets>=13.1, pydantic>=2.5.1, langchain>=1.0.0, tavily-python>=0.7.12, httpx>=0.28.1, aiofiles, mistune, md2pdf, python-docx, htmldocx, and jinja2 (backend/requirements.txt). The backend is the host process for single-agent and multi-agent research runs and for the /ws WebSocket endpoint consumed by the frontends.

Report generation utilities live in backend/utils.py, which exposes helpers for converting Markdown into PDF, DOCX (via mistune → HtmlToDocx → python-docx), and other formats (backend/utils.py). The same conversion helpers are also exposed through the multi-agents module at multi_agents/agents/utils/file_formats.py, which mirrors the mistune → Document → doc.save(file_path) pipeline (multi_agents/agents/utils/file_formats.py).

Frontend Applications

The repository ships two interchangeable frontends, both described in frontend/README.md:

Static Frontend (FastAPI) — A lightweight HTML/CSS/JS UI served by FastAPI. Setup is pip install -r requirements.txt followed by python -m uvicorn main:app, listening on http://localhost:8000.
NextJS Frontend — A feature-rich React/Next.js client pinned to Node.js v18.17.0. Setup uses npm install --legacy-peer-deps and npm run dev, listening on http://localhost:3000, and requires the FastAPI backend on localhost:8000.

The top-level README.md summarizes the frontends as "lightweight (HTML/CSS/JS) and production-ready (NextJS + Tailwind) versions" and directs operators to the documentation page for setup details.

MCP Server

In addition to the two web frontends, the project exposes an MCP (Model Context Protocol) server so assistants like Claude can invoke research tools. As noted in mcp-server/README.md, the canonical home for the server has moved to assafelovic/gptr-mcp, but the original source documents the tools: deep_research, quick_search, write_report, get_research_sources, get_research_context, and the research_resource resource.

Deployment

The README documents two primary deployment paths. The first is a direct local install: clone the repo, create a virtual environment with Python 3.11+, and set OPENAI_API_KEY and TAVILY_API_KEY (with LANGCHAIN_TRACING_V2 and LANGCHAIN_API_KEY optional for LangSmith observability) (README.md).

The second is Docker Compose. Per the README, operators copy .env.example to .env, supply API keys, comment out unneeded services inside docker-compose.yml, and run docker-compose up --build (or docker compose up --build). By default, two processes are started: the Python server on localhost:8000 and the React app on localhost:3000 (README.md).

For multi-agent research, the multi_agents/ module (see multi_agents/README.md) ships its own pipeline (Browser → Editor → parallel Researcher/Reviewer/Reviser → Writer → Publisher) driven by python main.py and configured via a task.json file. An alternative orchestration is provided under multi_agents_ag2/ (see multi_agents_ag2/README.md), which accepts the same query, max_sections, source (web or local), follow_guidelines, guidelines, and verbose parameters.

flowchart LR
    User -->|HTTP/WS| FE[Frontend: FastAPI static or NextJS]
    FE -->|WS /ws| BE[FastAPI Backend: port 8000]
    BE --> GPTR[GPTResearcher core]
    GPTR -->|search| Tavily[(Tavily / search provider)]
    GPTR -->|scrape| Scrapers[bs / Selenium / Firecrawl / PyMuPDF]
    GPTR -->|LLM| LLM[(OpenAI / compatible)]
    GPTR --> BE
    BE -->|PDF/DOCX/MD| FE
    MCP[MCP Server] -->|tools| BE
    MA[multi_agents / multi_agents_ag2] --> GPTR

Security

The public-facing attack surface is concentrated at the backend's WebSocket endpoint. Community issue assafelovic/gpt-researcher#1794 reports that the /ws endpoint accepts a caller-supplied source_urls list with no authentication and no URL validation, enabling unauthenticated Server-Side Request Forgery (SSRF) against any network the backend can reach. A related report, assafelovic/gpt-researcher#1805, describes an unauthenticated arbitrary local PDF file read: any source_urls entry ending in .pdf is routed to PyMuPDFScraper, whose non-URL branch forwards the value to PyMuPDFLoader as a local filesystem path.

Operators deploying GPT Researcher on any network reachable by untrusted clients should therefore place the backend behind authentication and an egress allowlist, strip or validate source_urls, and run the service with the least filesystem privileges required. A separate content-quality concern is documented in assafelovic/gpt-researcher#1572: when no relevant context is collected, the report generator may emit plausible-but-fabricated sources, so downstream consumers should not treat citations as authoritative without verification.
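
As an illustration of the "strip or validate source_urls" advice, the sketch below filters a caller-supplied list before it reaches any scraper. It is a minimal example, not a complete SSRF defense: it does not resolve DNS names, follow redirects, or handle alternate IP encodings, and the `is_safe_source_url` and `filter_source_urls` helpers are hypothetical names.

```python
import ipaddress
from typing import Iterable, List
from urllib.parse import urlsplit

BLOCKED_HOSTNAMES = {"localhost"}


def is_safe_source_url(url: str) -> bool:
    """Accept only http(s) URLs whose host is not an internal address."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed input such as an unclosed IPv6 bracket
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False  # rejects local paths like /etc/secret.pdf and file://
    host = parts.hostname.lower()
    if host in BLOCKED_HOSTNAMES or host.endswith(".localhost"):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return True  # a DNS name; resolving and re-checking it is omitted here
    return ip.is_global  # drops loopback, private, and link-local literals


def filter_source_urls(urls: Iterable[str]) -> List[str]:
    """Keep only the entries that pass is_safe_source_url."""
    return [u for u in urls if is_safe_source_url(u)]


print(filter_source_urls([
    "https://example.com/paper.pdf",
    "http://169.254.169.254/latest/meta-data",
    "/etc/secret.pdf",
    "http://10.0.0.5/internal.pdf",
]))
# ['https://example.com/paper.pdf']
```

A production deployment would pair a filter like this with authentication and network-level egress controls, since a hostname that passes the check can still resolve to an internal address.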

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 13 structured pitfall item(s), including 1 high/blocking item(s). Top priority: Security or permission risk - Security or permission risk requires verification.

1. Security or permission risk: Security or permission risk requires verification

Severity: high
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/assafelovic/gpt-researcher/issues/1794

2. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.host_targets | https://github.com/assafelovic/gpt-researcher

3. Configuration risk: Configuration risk requires verification

Severity: medium
Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/assafelovic/gpt-researcher/issues/1797

4. Capability evidence risk: Capability evidence risk requires verification

Severity: medium
Finding: README/documentation is current enough for a first validation pass.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: capability.assumptions | https://github.com/assafelovic/gpt-researcher

5. Maintenance risk: Maintenance risk requires verification

Severity: medium
Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/assafelovic/gpt-researcher/issues/1807

6. Maintenance risk: Maintenance risk requires verification

Severity: medium
Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/assafelovic/gpt-researcher

7. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: downstream_validation.risk_items | https://github.com/assafelovic/gpt-researcher

8. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: no_demo
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: risks.scoring_risks | https://github.com/assafelovic/gpt-researcher

9. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/assafelovic/gpt-researcher/issues/1800

10. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/assafelovic/gpt-researcher/issues/1805

11. Security or permission risk: Security or permission risk requires verification

Severity: medium
Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: community_evidence:github | https://github.com/assafelovic/gpt-researcher/issues/1801

12. Maintenance risk: Maintenance risk requires verification

Severity: low
Finding: issue_or_pr_quality=unknown.
User impact: May increase setup, validation, or first-run risk for the user.
Recommended check: Reproduce the official install and quickstart path in an isolated environment.
Evidence: evidence.maintainer_signals | https://github.com/assafelovic/gpt-researcher

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 10 project-level external discussion links are exposed on this manual page. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using gpt-researcher with real data or production workflows.

Report Generator Halluzinates Sources - github / github_issue
Website is broken - github / github_issue
[[Security] Unauthenticated Server-Side Request Forgery (SSRF)](https://github.com/assafelovic/gpt-researcher/issues/1794) - github / github_issue
Unauthenticated arbitrary local PDF file read via source_urls and the Py - github / github_issue
🌐 Runtime Open Federation — federate GPT-Researcher as a research node, - github / github_issue
Feature: Add persistent behavioral memory via Agent Magnet integration - github / github_issue
Add Obscura as a scraper backend - github / github_issue
Enhancement: Consider serpbase.dev as an alternative web search provider - github / github_issue
v3.5.0 - github / github_release
Configuration risk requires verification - GitHub / issue

Source: Project Pack community evidence and pitfall evidence
