Doramagic Project Pack · Human Manual

yacy_search_server

Distributed Peer-to-Peer Web Search Engine and Intranet Search Appliance

YaCy Overview & System Architecture

Related topics: Installation, Deployment & Configuration, Core Subsystems: Crawler, Index, Search & AI/LLM, HTTP APIs, Administration UI & Common Operations

Purpose and Scope

YaCy is a full peer-to-peer (P2P) search engine application. According to the project's README, it bundles three primary capabilities in a single deployable artifact: a search server hosting an index, a web-based front end for both searching and index administration, and a production-ready web crawler with a scheduler that keeps the index fresh. Source: README.md.

YaCy peers can operate either as a standalone instance (intranet, enterprise replacement for commercial search appliances) or as participants in a global cluster where index entries are exchanged with other peers over a built-in P2P protocol. Source: README.md. The "operation mode" is selectable from the web interface, and individual searches can opt out of network lookups to keep them local-only — a design choice explicitly motivated by privacy. Source: README.md.

The codebase is mostly GPLv2+ Java, with some LGPL-licensed components. Source: README.md. The standard entry point for the application is net.yacy.yacy. The StartFromJava helper, used to launch the server from an IDE, does not invoke that class directly; it shells out to the start script: private String cmdStart = "./startYACY.sh". Source: source/net/yacy/utils/StartFromJava.java:18.

High-Level Architecture

YaCy follows a layered architecture: an embedded HTTP server fronts a set of serverObjects–driven servlets that expose both human pages and machine APIs, while background workers handle crawling, indexing, peer-to-peer exchange, and UPnP mapping.

flowchart TB
    subgraph Client["External Clients / Peers"]
        U[Browser User]
        API[API Consumer]
        Peer[Other YaCy Peers]
    end
    subgraph Edge["Network Layer"]
        UPnP["UPnP Mapping<br/>HTTP / HTTPS"]
    end
    subgraph Server["YaCy Server Core"]
        HTTP["Embedded HTTP Server<br/>(TemplateEngine + serverObjects)"]
        SB["Switchboard<br/>(orchestrator)"]
        Crawler["Web Crawler + Scheduler"]
        Index["Search Index / RWI Store"]
        BM["BookmarksDB<br/>(xbel export)"]
        Wiki["Wiki Code Parser"]
    end
    U --> HTTP
    API --> HTTP
    Peer <--> SB
    UPnP --> HTTP
    HTTP --> SB
    SB --> Crawler
    SB --> Index
    SB --> BM
    SB --> Wiki
    HTTP -.templates.-> Template["htroot/**/*.html"]

Key architectural elements visible in the source tree:

Request Lifecycle and Servlets

A typical YaCy request flows from the embedded HTTP server into a servlet named after the request path. Each servlet's respond(RequestHeader, serverObjects post, serverSwitch env) method produces a serverObjects map that the template engine substitutes into the matching htroot/*.html file.

The serverObjects helper not only stores key/value pairs but also performs content-type–aware encoding (HTML escaping, wiki parsing, byte handling) before they reach the renderer. For example, putWiki(key, wikiCode) routes content through Switchboard.wikiParser.transform(...), so any text the UI displays as wiki markup is parsed server-side before substitution. Source: source/net/yacy/server/serverObjects.java.

This pattern is consistent across feature servlets:

| Servlet (Java) | Template (HTML) | Purpose | Auth gate |
|---|---|---|---|
| net.yacy.htroot.api.yacydoc | htroot/api/yacydoc.html | Document metadata (Dublin Core + YaCy fields) | Not required |
| net.yacy.htroot.api.share | (POST handler) | Receive binaries shared by other peers | Peer-to-peer trust |
| net.yacy.htroot.api.bookmarks.xbel.xbel | xbel output | Export bookmarks as XBEL XML | verifyAuthentication |
| net.yacy.htroot.api.bookmarks.posts.get | get output | List bookmark posts by tag/date | verifyAuthentication |
| net.yacy.htroot.api.bookmarks.posts.delete_p | delete_p output | Delete a bookmark by URL or hash | verifyAuthentication |

Sources: source/net/yacy/htroot/api/yacydoc.java, source/net/yacy/htroot/api/share.java, source/net/yacy/htroot/api/bookmarks/xbel/xbel.java, source/net/yacy/htroot/api/bookmarks/posts/get.java, source/net/yacy/htroot/api/bookmarks/posts/delete_p.java.
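
The naming pattern in the table (a servlet class named after the request path) can be sketched as a small mapping function. This is an illustration only: the function name and the PACKAGE_ROOT constant are assumptions made for the example, and YaCy's real dispatcher may resolve classes differently.

```python
from pathlib import PurePosixPath

# Assumption for illustration: servlet classes live under this package root,
# mirroring the htroot/ directory layout shown in the table.
PACKAGE_ROOT = "net.yacy.htroot"


def servlet_class_for(request_path: str) -> str:
    """Map a request path such as /api/yacydoc.html to a servlet class name."""
    path = PurePosixPath(request_path)
    # The suffix (.html, .xml, .json, ...) selects the template, not the servlet,
    # so only the stem of the last segment is kept.
    parts = list(path.parent.parts) + [path.stem]
    # PurePosixPath reports the leading "/" as its own part; skip it.
    segments = [p for p in parts if p != "/"]
    return ".".join([PACKAGE_ROOT, *segments])


print(servlet_class_for("/api/yacydoc.html"))
print(servlet_class_for("/api/bookmarks/posts/delete_p.xml"))
```

Both calls reproduce class names listed in the table, which is the only evidence this sketch relies on.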

Authentication is centralized: servlets that mutate user state call switchboard.verifyAuthentication(header) and, on failure, signal the template via prop.authenticationRequired(). Source: source/net/yacy/htroot/api/bookmarks/posts/delete_p.java:18. The citation page at htroot/api/citation.html illustrates the UI's toggleable filter pattern (#(filter)#::Cited#(/filter)#) that the template engine resolves against serverObjects. Source: htroot/api/citation.html.

Crawler Control and External Surface

The crawler is exposed through the web UI (e.g., the "Crawl Start" panel referenced in the README's screenshots) and produces the RWI (reverse word index) entries that feed the search API. YaCy's startup scripts live alongside the source tree: StartFromJava shells out to ./startYACY.sh and ./stopYACY.sh, reflecting the conventional Linux deployment path. Source: source/net/yacy/utils/StartFromJava.java:18.

External integration points include:

  • Robots-style access control. RobotsTxtConfig defines named servlet families (WIKI, BLOG, BOOKMARKS, HOMEPAGE, FILESHARE, SURFTIPS, NEWS, STATUS, LOCKED, DIRS, NETWORK, PROFILE, ALL) and a default-deny list (e.g., lockedDisallowed = true, dirsDisallowed = true, profileDisallowed = true). The class is constructed from a String[] active array passed in by the server switch. Source: source/net/yacy/server/http/RobotsTxtConfig.java.
  • Translation pipeline. GenerateMasterXliff reads YaCy's custom *.lng files (default folder locales/) and emits a master XLIFF file for translators. Source: source/net/yacy/utils/translation/GenerateMasterXliff.java:38.
  • API discovery. The README documents that any page marked with the orange "API" icon exposes an XML/JSON twin, and that the scripts under /bin are simple wrappers around those HTTP endpoints — making shell scripting a first-class integration path. Source: README.md.

Known Operational Pain Points

The community context surfaces several recurring architectural concerns that map directly to components described above:

  • Port configuration (issue #315, #791). Users repeatedly request separating the peer-to-peer data exchange port from the web UI port and supporting a public port that differs from the listening port (e.g., behind Traefik). Source: community context for issue #315 and issue #791. UPnP.java currently opens only the listening port via the IGD gateway. Source: source/net/yacy/utils/upnp/UPnP.java.
  • Version string in containers (issue #796). When .git metadata is missing, the build emits a broken yacy_v1.941_fatal: not a git repository... banner; this affects only the visible version, not core functionality. Source: community context for issue #796.
  • Build issues (issue #788). Some users hit Ivy-resolution failures during ant builds; the build.xml references install-ivy to bootstrap. Source: community context for issue #788.
  • Authentication flakiness (issue #798). Sessions occasionally drop, with users reporting simultaneous "failed RSS" popups — relevant to the verifyAuthentication flow used in bookmark and admin servlets. Source: community context for issue #798.
  • Environment-variable configuration (issue #794). YACY_-prefixed environment variables bind to the configuration store but only when the underlying key matches the variable naming convention, so camelCase keys such as staticIP or browserPopUpTrigger are not picked up. Source: community context for issue #794.
  • Usability for non-technical users (issue #350). Long-running concern that the same UI must serve both novice and advanced operators, motivating requests for a simpler companion app. Source: community context for issue #350.

Installation, Deployment & Configuration

Related topics: YaCy Overview & System Architecture, HTTP APIs, Administration UI & Common Operations

YaCy is delivered as a full search engine application composed of an embedded HTTP server, a peer-to-peer network layer, a crawler/scheduler, and a browser-based admin UI. The repository contains the GPLv2+ source under /source, HTML templates and servlets under /htroot, helper shell scripts under /bin, and build descriptors for Apache Ant plus Apache Ivy dependency resolution.

This page documents how the application is built, packaged, configured, and deployed, with a focus on patterns evidenced directly in the source tree and on recurring deployment scenarios discussed by the community.

1. Build, Packaging, and First Run

YaCy uses Apache Ant with Apache Ivy as the dependency manager. The build pipeline produces a runnable distribution containing the compiled classes, the /htroot template tree, locale files, and starter scripts. Source: README.md.

# Typical build flow (Ant + Ivy), run from the repository root
ant clean            # remove previous build output
ant                  # resolve dependencies via Ivy and compile
ant dist             # assemble the distribution package

Once built, the distribution is launched through the start and stop scripts in the distribution root (startYACY.sh, stopYACY.sh), the same ./startYACY.sh and ./stopYACY.sh that StartFromJava invokes. These scripts wrap the JVM startup and pass configuration parameters to the main net.yacy.yacy entry point. The scripts under /bin are wrappers around the HTTP API rather than launchers. Community users have reported build failures on fresh checkouts (see Issue #788), typically caused by missing Ivy cache entries or by running the build outside the repository root.

The web interface listens on a configurable port (default 8090) and exposes both human-facing pages and the XML/JSON APIs. The README notes: "Click it [the orange API icon], and you will see the XML/JSON version of the respective webpage." Source: README.md.

2. Runtime Configuration

YaCy persists configuration in a key-value store (the DATA/ directory) and supports override through Java system properties, environment variables prefixed YACY_, and command-line arguments. The community has documented that environment-variable overrides work for simple keys like port or upnp.enabled, but fail for camelCase keys such as staticIP or browserPopUpTrigger because the variable-name normalization step only lowercases and does not split on case boundaries (Issue #794).

# Example: override the listening port via environment variable
export YACY_port=8090
export YACY_upnp_enabled=true
./startYACY.sh
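
The camelCase limitation reported in Issue #794 can be modeled with a few lines. This is a hypothetical model of the override step, written to match the behavior described above; it is not YaCy's implementation. The function names, the underscore-to-dot mapping, and the sample address are assumptions.

```python
# Hypothetical model of the override step described above; the real
# implementation in YaCy may differ. Only the key names come from the text.
PREFIX = "YACY_"


def env_to_config_key(env_name: str) -> str:
    """Strip the prefix, lowercase, and turn underscores into dots."""
    return env_name[len(PREFIX):].lower().replace("_", ".")


def apply_overrides(config: dict, environ: dict) -> list:
    """Apply matching overrides in place; return env names that matched nothing."""
    ignored = []
    for name, value in environ.items():
        if not name.startswith(PREFIX):
            continue
        key = env_to_config_key(name)
        if key in config:
            config[key] = value
        else:
            ignored.append(name)  # dropped without an error in the reported behavior
    return ignored


config = {"port": "8090", "upnp.enabled": "false", "staticIP": ""}
environ = {
    "YACY_port": "8091",
    "YACY_upnp_enabled": "true",
    "YACY_staticIP": "203.0.113.7",  # documentation-range address, for illustration
}
ignored = apply_overrides(config, environ)
print(config)
print(ignored)
```

In this model, lowercasing turns YACY_staticIP into staticip, which never matches the stored key staticIP, so the override is lost while port and upnp.enabled bind as expected.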

2.1 Network and Port Configuration

The peer-to-peer port and the web interface port share a single configurable listener. This coupling is the root of a long-standing feature request — see Issue #315 — and continues to surface when operators deploy behind reverse proxies (Issue #791). When YaCy is fronted by Traefik, NGINX, or a Kubernetes Ingress, the operator must set staticIP to the externally reachable host so that the peer announces the correct address to the DHT.

flowchart LR
    Client[Browser / API client] --> Proxy[Reverse proxy :443]
    Proxy --> Yacy[YaCy peer :8090]
    Yacy <-->|P2P protocol| Peers[Other YaCy peers]
    Yacy --> Crawler[Local crawler + index]

2.2 UPnP Configuration

For deployments on consumer networks, YaCy can request automatic port forwarding through UPnP gateways. The configuration is modeled in source/net/yacy/utils/upnp/UPnP.java, with protocol-specific mappings defined in UPnPMappingType.java (an enum currently containing only HTTP and HTTPS). The concrete mapping holder, UPnPMapping.java, stores the CONFIG_PORT_KEY and CONFIG_ENABLED_KEY strings used to look up the corresponding entries in yacy.conf. The discovery itself relies on the weupnp library and is initialized lazily during switchboard startup.

2.3 robots.txt and Crawler Politeness

The web-facing surface exposes a virtual /robots.txt whose directives are derived from the RobotsTxtConfig class. Source: source/net/yacy/server/http/RobotsTxtConfig.java. Each protected area (WIKI, BLOG, BOOKMARKS, HOMEPAGE, FILESHARE, SURFTIPS, NEWS, STATUS, LOCKED, DIRS, NETWORK, PROFILE) has a default boolean; for instance, lockedDisallowed and dirsDisallowed default to true, while wikiDisallowed and blogDisallowed default to false. Operators can opt in or out by toggling these flags at startup.

2.4 Locale and Translation Files

Custom translations live in YaCy's .lng format and are merged into XLIFF master files via GenerateMasterXliff.java. The tool walks a locales directory, parses each .lng file, and updates the master XLIFF so that translators can pick up new keys without losing existing translations.

3. Deployment Patterns

3.1 Bare-Metal and Local Installation

The recommended path for end users is the pre-built installer referenced from the README's badges. It unpacks a self-contained Java runtime plus YaCy distribution and registers the start scripts as services (or user-level launchers). On Windows, an admin install is required because the package writes to Program Files; this requirement has tripped up several first-time users (Issue #790).

3.2 Containerized Deployment

A container image (published alongside the repository) runs YaCy with a single network namespace. Two recurring issues have been observed:

  1. Broken version string — the in-image git describe invocation fails because the working copy does not contain a .git directory, producing strings like yacy_v1.941_fatal: not a git repository .... See Issue #796. The fix is to inject the version explicitly at image build time (e.g. via ARG YACY_VERSION).
  2. Reverse-proxy port mismatch — operators must set staticIP and the externally visible port; otherwise the peer advertises the internal container port (Issue #791).
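
The version-string workaround in item 1 amounts to validating the git describe output and falling back to an injected value. The sketch below is illustrative only: version_string, its parameters, and the sample suffix are hypothetical, and the real build computes the banner inside the Ant build rather than in Python.

```python
import re

# Hypothetical helper: injecting a version at image build time is the
# workaround named above; the function and its parameters are illustrative.
VALID_SUFFIX = re.compile(r"^[0-9A-Za-z._-]+$")


def version_string(base: str, git_describe_output: str, injected: str = "") -> str:
    """Build a banner such as yacy_v1.941_<suffix> without leaking git errors."""
    suffix = git_describe_output.strip()
    if not VALID_SUFFIX.match(suffix):
        # e.g. "fatal: not a git repository ..." when .git is absent
        suffix = injected.strip()
    return f"{base}_{suffix}" if suffix else base


broken = "fatal: not a git repository (or any of the parent directories): .git"
print(version_string("yacy_v1.941", broken, injected="20240501-abc1234"))
print(version_string("yacy_v1.941", broken))
```

With an injected value the banner stays well-formed; without one, the sketch drops the suffix instead of embedding the error text.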

3.3 API-Only / Headless Deployment

Several servlets under htroot/api/ (e.g. getpageinfo_p, linkstructure, share, push_p) are usable programmatically without a browser session. The share servlet accepts binary uploads through multipart form data and writes them under a controlled directory tree — see source/net/yacy/htroot/api/share.java. The corresponding UI is rendered from htroot/api/push_p.html and htroot/api/yacydoc.html. These endpoints are convenient for integrating YaCy with content management systems that push newly changed documents directly into the index.

4. Common Failure Modes

| Symptom | Likely cause | Reference |
|---|---|---|
| Build fails with Ivy errors on first run | Missing dependency cache or offline mode | Issue #788 |
| Environment variable override silently ignored | camelCase key not normalized | Issue #794 |
| Peer announces internal port | staticIP not set behind reverse proxy | Issue #791 |
| Version string contains "fatal: not a git repository" | Container image lacks .git directory | Issue #796 |
| Sessions disappear / RSS feed fails | Cookie or Tor browser interaction; check ConfigBasic_authentication | Issue #798 |
| UPnP port mapping rejected | Gateway does not support weupnp discovery | source/net/yacy/utils/upnp/UPnP.java |

For operators who want a deployment with separate ports for peer-to-peer traffic and the web UI, Issue #315 tracks the design discussion; until that lands, the recommended workaround is to run two YaCy instances with different roles (one Network.unit and one Intranet) or to terminate the P2P port at a forwarding proxy.

See Also

  • Architecture Overview — high-level component map
  • API Reference — full servlet list under /htroot/api/
  • P2P Network Protocol — peer ping, DHT, and message exchange
  • Crawler & Indexing — crawl profiles and scheduler

Source: https://github.com/yacy/yacy_search_server / Human Manual

Core Subsystems: Crawler, Index, Search & AI/LLM

Related topics: YaCy Overview & System Architecture, HTTP APIs, Administration UI & Common Operations

Overview

YaCy is a peer-to-peer search engine composed of four tightly coupled subsystems: the Crawler, which fetches documents from HTTP, FTP, SMB, file, and RSS sources; the Index, which stores parsed content, metadata, and the hyperlink graph; the Search layer, which exposes query APIs and serves both human and machine clients; and the emerging AI/LLM layer, which augments search with citation, summarization, and document-relationship analysis. Each subsystem is rooted in dedicated Java classes under net.yacy.crawler, net.yacy.search, and net.yacy.htroot.api, and is orchestrated by the central Switchboard described in the README.md.

Two community threads show how the crawler and index boundaries are felt at deployment time: crawler performance tuning (issue #799) and the request to separate the peer data exchange port from the web interface port (issue #315).

Crawler Subsystem

The crawler is anchored by CrawlSwitchboard, which schedules and dispatches URL fetches. Each protocol has a dedicated loader, all residing under source/net/yacy/crawler/retrieval/:

| Loader | Protocol | Purpose |
|---|---|---|
| HTTPLoader | HTTP/HTTPS | Primary web crawler |
| FTPLoader | FTP | Legacy file transfer indexing |
| SMBLoader | SMB/CIFS | Intranet file shares (replacement for enterprise search) |
| FileLoader | file:// | Local filesystem ingestion |
| RSSLoader | RSS/Atom | Feed-driven re-crawling; tied to the RSS authorization bug discussed in issue #798 |

Source: README.md — confirms the multi-protocol coverage ("a network scanner makes it easy to discover all available HTTP, FTP and SMB servers").

RobotsTxtConfig governs the robots.txt that the peer itself serves to external crawlers; it is separate from the rules YaCy's own crawler obeys on remote sites. It defines per-section flags (wiki, blog, bookmarks, homepage, fileshare, surftips, news, status, locked, dirs, network, profile, all). Sections that default to disallowed (locked, dirs, profile) ask visiting crawlers to stay away from the peer's sensitive administrative surfaces. Source: source/net/yacy/server/http/RobotsTxtConfig.java.

Crawl results, including the in-progress RowHandleSet referenced in status_p, are streamed back into the index pipeline. Crawl profiles (see the CrawlProfile import in status_p) persist operator-supplied depth and freshness rules.

flowchart LR
    A[Scheduler / CrawlProfile] --> B[CrawlSwitchboard]
    B --> C{HTTPLoader}
    B --> D{FTPLoader}
    B --> E{SMBLoader}
    B --> F{FileLoader}
    B --> G{RSSLoader}
    C --> H[Index: kelondro + HyperlinkGraph + Fulltext]
    D --> H
    E --> H
    F --> H
    G --> H
    H --> I[Search / API layer]
    I --> J[AI/LLM augmentation]

Index Subsystem

The index is built on the kelondro low-level row-store library, with three higher-level abstractions:

  • Word/reference index — the canonical full-text index backing the public search.
  • Hyperlink graph — modeled by HyperlinkEdge and HyperlinkGraph, enumerated by HyperlinkType. The linkstructure API walks this graph for a given URL.
  • Fulltext — accessed as sb.index.fulltext() inside linkstructure.

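Source: source/net/yacy/htroot/api/linkstructure.java. The servlet builds HyperlinkGraph hlg = new HyperlinkGraph() and traverses it within a time budget and a node budget read from post.getInt("maxtime", 60000) and post.getInt("maxnodes", 10000). Both budgets are capped, and the caps depend on the caller: authenticated ? 300000 : 1000 for time and authenticated ? 10000000 : 100 for nodes. Because the defaults exceed the unauthenticated caps, an anonymous caller is effectively held to the lower values, while an authenticated admin can traverse far more of the graph. This makes the endpoint a useful probe for graph health and a frequent target for performance questions like issue #799.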
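
The budget arithmetic can be worked through in a short sketch. The numbers come from the text; the function name is hypothetical, and applying each cap as a minimum over the requested value is an assumption about how the servlet combines the two expressions.

```python
def traversal_budget(post: dict, authenticated: bool) -> tuple:
    """Return (maxtime, maxnodes) after applying the per-role caps."""
    time_cap = 300000 if authenticated else 1000
    node_cap = 10000000 if authenticated else 100
    # Defaults mirror post.getInt("maxtime", 60000) and post.getInt("maxnodes", 10000).
    requested_time = int(post.get("maxtime", 60000))
    requested_nodes = int(post.get("maxnodes", 10000))
    # Assumption: each cap is applied as a minimum over the requested value.
    return min(requested_time, time_cap), min(requested_nodes, node_cap)


print(traversal_budget({}, authenticated=False))
print(traversal_budget({}, authenticated=True))
print(traversal_budget({"maxtime": "900000"}, authenticated=True))
```

Under that assumption, an anonymous request with no parameters runs with (1000, 100), and an authenticated one with (60000, 10000).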

Search Subsystem & AI/LLM Hooks

The search layer is exposed through servlet endpoints in net.yacy.htroot.api and rendered via HTML templates in htroot/api/. Key endpoints:

  • yacydoc — retrieves a single indexed document by URL hash and fills the yacydoc.html template (#[yacy_words]#, #[yacy_inbound]#, #[yacy_outbound]#, #[yacy_citations]#).
  • getpageinfo_p — the supported replacement for the deprecated getpageinfo; it parses the result DOM and extracts microdata/RDFa.
  • linkstructure — emits XML/JSON of inbound/outbound neighbors.
  • citation.html — renders cited sentences extracted from indexed pages, the seed for citation-aware features.

The serverObjects helper class standardizes template substitution and includes putWiki(...) variants that pipe content through Switchboard.wikiParser — a critical seam for transforming user-supplied wiki markup before it reaches the AI summarization or LLM-prompt templating path. Source: source/net/yacy/server/serverObjects.java.

AI/LLM augmentation is not exposed as a dedicated class in the indexed source, but the citation, linkstructure, and wiki-parsing primitives are the documented building blocks that any integrated model would consume. Operators running containerized peers should note the broken version string (issue #796), which can confuse LLM-driven health probes that key off the version banner.

Common Failure Modes

  • CPU under-utilization / slow indexing — see issue #799; operators raise busy/match/httpcrawler worker counts in CrawlSwitchboard.
  • RSS auth dropping — see issue #798; RSS feeds may require re-authentication when the session cookie is not propagated to RSSLoader.
  • Port confusion behind reverse proxies — see issue #791; peer-to-peer traffic and the user-facing API currently share the single port configuration key.

HTTP APIs, Administration UI & Common Operations

Related topics: YaCy Overview & System Architecture, Installation, Deployment & Configuration, Core Subsystems: Crawler, Index, Search & AI/LLM

1. Purpose and Scope

YaCy exposes its full feature surface through an HTTP-based control plane that simultaneously serves three audiences: peer-to-peer crawlers exchanging index data, the operator administering the local peer through a browser, and external clients (including the bundled shell scripts under /bin) that integrate with YaCy programmatically. The README states that all built-in interfaces are based on HTTP/XML and HTTP/JSON, and that every web page in the YaCy web interface has an XML or JSON equivalent accessible via the orange "API" icon (README.md).

This page documents how that control plane is built: the servlet invocation model, the request/response objects, the template engine that powers the admin UI, and the most commonly used API endpoints for status, bookmarks, citation lookup, document metadata, link structure, and content push.

2. HTTP API Architecture

2.1 Servlet Routing and Response Model

YaCy uses a uniform servlet pattern in which every API entry point is a Java class with a static respond(RequestHeader, serverObjects, serverSwitch) method that returns a serverObjects instance populated with template variables (source/net/yacy/htroot/api/status_p.java, source/net/yacy/htroot/api/linkstructure.java). The serverObjects class is a key-value map that supports typed accessors (getInt, getBoolean), counter increments (inc), and wiki-aware text insertion via putWiki (source/net/yacy/server/serverObjects.java). Forms and query strings are decoded through getParams, which strips the UTF-8 byte order mark and returns String[] for repeated fields.
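
The shape of this pattern can be sketched with a toy stand-in for serverObjects. This is an analogy, not the Java API: the class name, the Python method names, the set of strings treated as true, and the verbose parameter are all assumptions made for the example.

```python
class ServerObjectsSketch(dict):
    """Toy stand-in for serverObjects: a string map with typed accessors."""

    def get_int(self, key, default):
        try:
            return int(self[key])
        except (KeyError, ValueError):
            return default

    def get_boolean(self, key):
        # Assumption: these spellings count as true.
        return str(self.get(key, "")).lower() in ("true", "on", "1")

    def inc(self, key):
        """Increment a counter stored as a string and return the new value."""
        value = self.get_int(key, 0) + 1
        self[key] = str(value)
        return value


def respond(header, post, env):
    """Shape of a servlet: read request parameters, return template variables."""
    prop = ServerObjectsSketch()
    prop["maxnodes"] = str(post.get_int("maxnodes", 10000))
    prop["verbose"] = "1" if post.get_boolean("verbose") else "0"  # hypothetical parameter
    prop.inc("requests")
    return prop


post = ServerObjectsSketch({"maxnodes": "50", "verbose": "on"})
print(respond({}, post, None))
```

The point of the sketch is the contract: the servlet receives decoded parameters in one map and hands back another map whose keys the template refers to.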

2.2 Content Negotiation via File Suffix

The same Java servlet is reused for HTML, JSON, and XML output by selecting the matching template file under htroot/. For example, the bookmarks folder API is wired to get_folders.html, get_folders.json, get_folders.xml, and get_folders.rss (source/net/yacy/htroot/api/bookmarks/get_folders.java). The bookmarks listing API honors a display parameter that switches between flexigrid, xbel, and rss payloads in addition to the default XML (source/net/yacy/htroot/api/bookmarks/get_bookmarks.java).
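
From a client's point of view, suffix-based negotiation means the output format is chosen in the URL. The helper below is a sketch: api_url is a hypothetical name, the host and port assume a local peer on the default port 8090, and the URL path is assumed to mirror the htroot/ layout.

```python
from urllib.parse import urlencode

# Variants named above for get_folders; other servlets may offer fewer.
SUPPORTED = {"html", "json", "xml", "rss"}


def api_url(base: str, servlet_path: str, fmt: str = "xml", **params) -> str:
    """Build an endpoint URL whose suffix selects the output template."""
    if fmt not in SUPPORTED:
        raise ValueError(f"unsupported format: {fmt}")
    url = f"{base.rstrip('/')}/{servlet_path.strip('/')}.{fmt}"
    return f"{url}?{urlencode(params)}" if params else url


print(api_url("http://localhost:8090", "api/bookmarks/get_folders", "json"))
print(api_url("http://localhost:8090", "api/bookmarks/get_bookmarks", "xml", display="xbel"))
```

The second call combines both mechanisms described above: the suffix picks the template family and the display parameter picks the payload variant.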

2.3 HTTP Proxy Handler

For outbound traffic, HTTPDProxyHandler provides a transparent servlet that proxies GET/HEAD/POST requests, normalizes headers, and disables encodings (such as gzip) that interfere with the indexer (source/net/yacy/server/http/HTTPDProxyHandler.java). The class handles connection failures (BindException, ConnectException, NoRouteToHostException, SocketTimeoutException, UnknownHostException) so the crawler can degrade gracefully when remote hosts are unreachable.

3. Administration UI and Templating

3.1 Template Engine

The admin UI is rendered by a server-side template engine (source/net/yacy/server/http/TemplateEngine.java). Templates live as plain HTML in htroot/ and use YaCy's #[key]#, #{list}#…#{/list}#, and #(if)#…#(/if)# syntax to bind values from the serverObjects map returned by the matching servlet. Wiki code (e.g., for shared HTML fragments) is transformed at write time through Switchboard.wikiParser, which lets developers embed reusable snippets without duplicating markup (source/net/yacy/server/serverObjects.java).
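
A minimal sketch of two of these constructs may help when reading templates. It is a simplified re-implementation for illustration, not TemplateEngine itself: it assumes the #(key)# block selects one of its ::-separated alternatives by the integer value of the key, it omits #{list}# loops and nesting, and the HTML escaping is a choice made for the example.

```python
import html
import re

VALUE = re.compile(r"#\[(\w+)\]#")
ALTERNATIVE = re.compile(r"#\((\w+)\)#(.*?)#\(/\1\)#", re.DOTALL)


def render(template: str, props: dict) -> str:
    """Resolve #(key)#a::b#(/key)# alternatives, then #[key]# values."""

    def pick(match):
        options = match.group(2).split("::")
        index = int(props.get(match.group(1), 0))
        # Assumption: an out-of-range index renders as empty text.
        return options[index] if 0 <= index < len(options) else ""

    def value(match):
        return html.escape(str(props.get(match.group(1), "")))

    return VALUE.sub(value, ALTERNATIVE.sub(pick, template))


page = "<h1>#[title]#</h1>#(filter)#::Cited#(/filter)#"
print(render(page, {"title": "a < b", "filter": 1}))
print(render(page, {"title": "a < b", "filter": 0}))
```

The sample template reuses the #(filter)#::Cited#(/filter)# fragment from htroot/api/citation.html: with filter set to 1 the word Cited appears, and with 0 the empty first alternative is chosen.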

3.2 Robots.txt Policy

The peer's web-facing sections are covered by a configurable robots.txt policy in RobotsTxtConfig. The policy is advisory: it asks well-behaved crawlers to stay out and is not an access control. By default, crawler access is denied to profile, dirs, and locked, while wiki, blog, bookmarks, fileshare, homepage, surftips, news, status, and network remain reachable (source/net/yacy/server/http/RobotsTxtConfig.java). Operators can flip a global allDisallowed flag or toggle individual sections without restarting the server.
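
How such flags translate into a robots.txt body can be sketched as follows. The default values follow the description above; the URL prefixes in PATHS are placeholders invented for the example, since the text does not list the paths YaCy actually emits.

```python
# Defaults as described above. The URL prefixes are hypothetical placeholders,
# not the paths YaCy actually writes into its robots.txt.
DEFAULTS = {
    "wiki": False, "blog": False, "bookmarks": False, "homepage": False,
    "fileshare": False, "surftips": False, "news": False, "status": False,
    "network": False, "locked": True, "dirs": True, "profile": True,
}
PATHS = {name: f"/{name}/" for name in DEFAULTS}


def robots_txt(flags: dict, all_disallowed: bool = False) -> str:
    """Render a robots.txt body from per-section disallow flags."""
    lines = ["User-agent: *"]
    if all_disallowed:
        lines.append("Disallow: /")
    else:
        lines += [f"Disallow: {PATHS[name]}" for name in sorted(flags) if flags[name]]
    return "\n".join(lines) + "\n"


print(robots_txt(DEFAULTS))
print(robots_txt(DEFAULTS, all_disallowed=True))
```

With the defaults, only the dirs, locked, and profile sections produce Disallow lines; the global flag collapses everything into a single rule.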

3.3 UPnP and NAT Traversal

The administration UI exposes port-mapping controls through the UPnP utility, which discovers gateways, lists existing PortMappingEntry objects, and adds/removes mappings using the bitlet weupnp library (source/net/yacy/utils/upnp/UPnP.java). This is the surface referenced by the community issue requesting a configurable public port distinct from the listening port (community issue #791, community issue #315).

3.4 Translation Workflow

The admin UI supports locale files in a custom .lng format. GenerateMasterXliff merges these files into a single master XLIFF document for translators, updating any existing master file in place rather than overwriting it (source/net/yacy/utils/translation/GenerateMasterXliff.java). Operators regenerate the master file after adding new translatable keys.

4. Common Operations

4.1 Status and Version

status_p returns peer health, memory consumption, crawl queues, and indexing statistics in a single payload (source/net/yacy/htroot/api/status_p.java). Container deployments have observed that the version string is derived from a git describe invocation; when the container lacks .git, the suffix becomes fatal: not a git repository, surfacing verbatim in the peer announcement (community issue #796).

4.2 Bookmarks, Folders, and XBEL

The bookmarks API is split across four servlets. get_folders enumerates bookmark folders and resolves the caller's identity through the admin path or UserDB (source/net/yacy/htroot/api/bookmarks/get_folders.java). get_bookmarks paginates results and renders them as flexigrid, XBEL, RSS, or XML (source/net/yacy/htroot/api/bookmarks/get_bookmarks.java). posts/get filters bookmarks by tag and date, optionally emitting an extended XML payload (source/net/yacy/htroot/api/bookmarks/posts/get.java). xbel provides a standalone XBEL serializer used by browser integrations (source/net/yacy/htroot/api/bookmarks/xbel/xbel.java).

4.3 Document Metadata, Citations, and Link Structure

yacydoc renders Dublin Core metadata (dc_title, dc_creator, dc_description, dc_subject, dc_publisher, dc_date, dc_type) along with YaCy-specific fields (yacy_urlhash, yacy_words, yacy_inbound, yacy_outbound, yacy_citations) and a clickable OpenStreetMap preview (htroot/api/yacydoc.html). citation lists the sentences in a document that are cited by other indexed pages, with a toggle button to filter out non-cited sentences (htroot/api/citation.html). linkstructure queries the hyperlink graph to return inbound and outbound anchors encoded as Base64-ordered edge descriptors (source/net/yacy/htroot/api/linkstructure.java).

4.4 Page Info and Content Push

getpageinfo is marked @Deprecated in favor of the new getpageinfo_p endpoint (source/net/yacy/htroot/api/getpageinfo.java). push_p accepts multipart/form-data uploads with paired data-N and url-N fields, supporting synchronous and commit flags for CMS-style direct indexing (htroot/api/push_p.html).
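
The paired-field layout of a push_p upload can be sketched as a function that prepares the form fields. Only the data-N and url-N names and the synchronous and commit flags come from the text; the function name, zero-based numbering, and the "true"/"false" encoding are assumptions, and the real form may require further fields (for example a document count), so check htroot/api/push_p.html before relying on this.

```python
def push_fields(documents, synchronous=False, commit=False):
    """Return (text_fields, file_fields) for a multipart/form-data POST.

    documents: list of (url, content_bytes) pairs.
    """
    text_fields = {
        "synchronous": "true" if synchronous else "false",  # assumed encoding
        "commit": "true" if commit else "false",            # assumed encoding
    }
    file_fields = {}
    # Assumption: N counts up from 0, one url-N/data-N pair per document.
    for n, (url, content) in enumerate(documents):
        text_fields[f"url-{n}"] = url
        file_fields[f"data-{n}"] = content
    return text_fields, file_fields


text, files = push_fields(
    [("http://example.org/a.html", b"<html>a</html>")],
    commit=True,
)
print(text)
print(sorted(files))
```

Keeping text fields and file parts in separate maps matches how most HTTP clients expect multipart bodies to be described.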

flowchart LR
    Client[HTTP Client / Browser] --> HTTPD[YaCy HTTPD]
    HTTPD --> Servlet[API Servlet\nrespond header, post, env]
    Servlet --> SO[serverObjects]
    SO --> TE[TemplateEngine]
    TE --> HTTPD
    HTTPD --> Client
    Servlet -->|outbound fetch| Proxy[HTTPDProxyHandler]
    Proxy --> Remote[(Remote Host)]
    Admin[Admin UI / API] -->|robots.txt| Robots[RobotsTxtConfig]
    Admin -->|port mapping| UPnP[UPnP]
    Admin -->|locale build| Xliff[GenerateMasterXliff]

4.5 Configuration Caveats

Operators configuring YaCy through environment variables should be aware that some camelCase settings (such as staticIP and browserPopUpTrigger) do not currently bind from YACY_* exports, even though simple keys like port and upnp.enabled do (community issue #794). The Kubernetes Helm chart published under charts/ documents the supported values, including image repository and tag overrides, and explains how to build local container images for development (charts/README.md). Operators running YaCy behind Tor have reported transient authentication drops followed by a "failed RSS" popup, suggesting session cookie handling may need to be revisited when Tor Browser is in use (community issue #798).

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 15 structured pitfall item(s), including 5 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.

1. Installation risk: Installation risk requires verification

  • Severity: high
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/yacy/yacy_search_server/issues/788

2. Installation risk: Installation risk requires verification

  • Severity: high
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/yacy/yacy_search_server/issues/791

3. Configuration risk: Configuration risk requires verification

  • Severity: high
  • Finding: Project evidence flags a configuration risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/yacy/yacy_search_server/issues/794

4. Security or permission risk: Security or permission risk requires verification

  • Severity: high
  • Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/yacy/yacy_search_server/issues/735

5. Security or permission risk: Security or permission risk requires verification

  • Severity: high
  • Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/yacy/yacy_search_server/issues/315

6. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: identity.distribution | https://github.com/yacy/yacy_search_server

7. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/yacy/yacy_search_server/issues/796

8. Capability evidence risk: Capability evidence risk requires verification

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: capability.assumptions | https://github.com/yacy/yacy_search_server

9. Maintenance risk: Maintenance risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | https://github.com/yacy/yacy_search_server

10. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: downstream_validation.risk_items | https://github.com/yacy/yacy_search_server

11. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: risks.scoring_risks | https://github.com/yacy/yacy_search_server

12. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/yacy/yacy_search_server/issues/790

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 11

Count of project-level external discussion links exposed on this manual page.

Use: Review before install

Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using yacy_search_server with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence