# https://github.com/PaddlePaddle/PaddleOCR Project Manual

Generated at: 2026-06-22 10:38:14 UTC

## Table of Contents

- [Repository Overview and System Architecture](#page-overview)
- [Core Pipelines and Models (PP-OCR, PP-StructureV3, PaddleOCR-VL)](#page-pipelines)
- [Deployment, SDKs, and Integrations](#page-deployment)
- [Configuration, Training, and Customization](#page-training)

<a id='page-overview'></a>

## Repository Overview and System Architecture

### Related Pages

Related topics: [Core Pipelines and Models (PP-OCR, PP-StructureV3, PaddleOCR-VL)](#page-pipelines), [Deployment, SDKs, and Integrations](#page-deployment)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/README.md)
- [api_sdk/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/api_sdk/README.md)
- [api_sdk/typescript/package.json](https://github.com/PaddlePaddle/PaddleOCR/blob/main/api_sdk/typescript/package.json)
- [paddleocr-js/package.json](https://github.com/PaddlePaddle/PaddleOCR/blob/main/paddleocr-js/package.json)
- [deploy/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/README.md)
- [deploy/lite/readme.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/lite/readme.md)
- [deploy/cpp_infer/src/configs/OCR.yaml](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/cpp_infer/src/configs/OCR.yaml)
- [ppstructure/layout/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/layout/README.md)
- [ppstructure/recovery/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/recovery/README.md)
- [ppstructure/kie/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/kie/README.md)
</details>

# Repository Overview and System Architecture

## 1. Purpose and Scope

PaddleOCR is a multilingual, production-grade OCR and document-parsing toolkit built on top of PaddlePaddle. As described in the top-level [README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/README.md), the project "converts PDF documents and images into structured, LLM-ready data (JSON/Markdown) with industry-leading accuracy," and is positioned as the bedrock for RAG and Agentic applications, with 70k+ stars and adoption by projects such as Dify, RAGFlow, and Cherry Studio.

The repository organizes the system into three concentric layers:

1. **Algorithm core** — PP-OCR family, PP-Structure, and the PaddleOCR-VL vision-language models.
2. **Engine layer** — Python, C++, Paddle-Lite, ONNX, and PaddleServing inference stacks.
3. **SDK layer** — first-party clients for Python, TypeScript, Go, and a browser bundle (`paddleocr-js`).

This separation allows the same model zoo to be reused across research notebooks, server-side services, and edge devices.

## 2. Capability Pillars

### 2.1 Scene OCR (PP-OCRv6)

The PP-OCR series is the global multilingual text-spotting flagship. According to the release notes embedded in [README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/README.md), **PP-OCRv6** "supports 50 languages with a single unified" model and is the default `text_type: general` pipeline. The C++ reference configuration in [deploy/cpp_infer/src/configs/OCR.yaml](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/cpp_infer/src/configs/OCR.yaml) shows the canonical end-to-end recipe: a `DocPreprocessor` sub-pipeline (orientation + unwarping) followed by `TextDetection` (PP-OCRv6_medium_det), `TextLineOrientation`, and `TextRecognition` (PP-OCRv6_medium_rec).

### 2.2 Intelligent Document Parsing (PP-Structure & PaddleOCR-VL)

The repository exposes two complementary document-parsing approaches, both summarized in [README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/README.md):

- **PP-StructureV3** — structure-aware conversion into Markdown or JSON, preserving cell-level coordinates.
- **PaddleOCR-VL-1.6 (0.9B)** — a NaViT-style dynamic-resolution VLM fused with ERNIE-4.5-0.3B, achieving 96.3% accuracy on OmniDocBench v1.6 and supporting 109–111 languages depending on minor version.

The structural sub-modules live under `ppstructure/`:

| Sub-module | Path | Purpose | Source |
| --- | --- | --- | --- |
| Layout analysis | `ppstructure/layout/` | Region segmentation (text/title/figure/table) via PP-PicoDet | [ppstructure/layout/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/layout/README.md) |
| Layout recovery | `ppstructure/recovery/` | Restore images/PDFs into editable Word files | [ppstructure/recovery/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/recovery/README.md) |
| KIE | `ppstructure/kie/` | Key Information Extraction via VI-LayoutXLM (SER + RE) | [ppstructure/kie/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/kie/README.md) |

### 2.3 Community-driven demand

Community issues show which capabilities users push hardest on. Issue [#1048 "Multilingual OCR Development Plan"](https://github.com/PaddlePaddle/PaddleOCR/issues/1048) (72 comments) drove the consolidation toward a single multi-language model in PP-OCRv6. Issue [#1663](https://github.com/PaddlePaddle/PaddleOCR/issues/1663) discusses the padding left around cropped text-detection regions, the concern that motivates the `limit_side_len`, `max_side_limit`, and unclip parameters that appear in the C++ YAML above.

## 3. Deployment Topology

The [deploy/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/README.md) lists five official deployment schemes: Python inference, C++ inference, PaddleServing, Paddle-Lite (ARM CPU/OpenCL ARM GPU), and Paddle2ONNX. Each scheme consumes the same YAML-driven pipeline definition (see the `pipeline_name: OCR` example in [deploy/cpp_infer/src/configs/OCR.yaml](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/cpp_infer/src/configs/OCR.yaml)) but is compiled against a different runtime:

- **C++ Inference**: fastest server-side path, uses Paddle Inference + TensorRT.
- **Paddle-Lite**: mobile/IoT path documented in [deploy/lite/readme.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/lite/readme.md), which targets ARMv7/ARMv8 phones and depends on cross-compilation toolchains.
- **Paddle2ONNX**: produces interoperable ONNX models for non-Paddle runtimes.

```mermaid
flowchart LR
    A[Image / PDF] --> B[DocPreprocessor]
    B --> C[TextDetection]
    C --> D[TextLineOrientation]
    D --> E[TextRecognition]
    E --> F[Structured Output]
    subgraph "Optional post-processing"
    F --> G[PP-StructureV3]
    F --> H[PaddleOCR-VL]
    F --> I[KIE: SER + RE]
    end
```

## 4. SDK and Multi-Language Bindings

The `api_sdk/` directory contains the official server/client SDKs that wrap the HTTP inference API. Per [api_sdk/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/api_sdk/README.md), the supported languages and their source locations are:

| Language | Source location | Notes |
| --- | --- | --- |
| Python | `paddleocr/` | Reference SDK, tested via `pytest tests/api_client/` |
| TypeScript | `api_sdk/typescript/` | Node ≥ 18, built with `tsup`, tested with `vitest` ([package.json](https://github.com/PaddlePaddle/PaddleOCR/blob/main/api_sdk/typescript/package.json)) |
| Go | `api_sdk/go/` | Tested via `go test ./...` |

A separate browser-oriented bundle lives in `paddleocr-js/`. Its [package.json](https://github.com/PaddlePaddle/PaddleOCR/blob/main/paddleocr-js/package.json) declares `vitest`, `eslint`, and `prettier` tooling, which points to a maintained library intended for front-end integration. The tooling alone does not establish whether the package calls the official API endpoint or runs inference in the browser.

The SDK layer intentionally decouples clients from model evolution: the server may upgrade from PP-OCRv5 to PP-OCRv6 (as it did between v3.6 and v3.7.0) without breaking the TypeScript or Go clients as long as the JSON contract is preserved. Recent issues such as [#18194](https://github.com/PaddlePaddle/PaddleOCR/issues/18194) (PaddleOCR-VL HPS — `returnMarkdownImages=false` ineffective with default PaddleX 3.6 SDK) confirm that contract drift between the PaddleX inference SDK and the hosted API is an active integration risk worth tracking when pinning versions.
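The decoupling argument can be made concrete with a client-side parser that reads only the fields it needs and ignores everything else. The sketch below is not the SDK's code: the response shape (`result`, `texts`, `scores`) and the `parse_ocr_response` helper are hypothetical, chosen only to illustrate a tolerant reader.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class OcrLine:
    text: str
    score: float


def parse_ocr_response(payload: str) -> List[OcrLine]:
    """Parse a hypothetical OCR JSON response, ignoring unknown fields."""
    data = json.loads(payload)
    result = data.get("result", {})
    texts = result.get("texts", [])
    scores = result.get("scores", [])
    # zip() stops at the shorter list, so uneven lists do not raise here.
    return [OcrLine(text=t, score=float(s)) for t, s in zip(texts, scores)]


# A payload carrying a field the client has never seen still parses.
sample = json.dumps({
    "result": {"texts": ["Invoice", "Total"], "scores": [0.98, 0.91]},
    "modelVersion": "hypothetical-extra-field",
})
lines = parse_ocr_response(sample)
```

A reader written this way keeps working when the server adds fields, but it still breaks if an existing field is renamed or changes type, which is the kind of drift worth testing for when pinning versions.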

## 5. Configuration Reference (PP-OCR C++ pipeline)

The single most representative configuration in the repo is the C++ inference YAML for PP-OCR. Excerpted from [deploy/cpp_infer/src/configs/OCR.yaml](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/cpp_infer/src/configs/OCR.yaml):

| Section | Key | Value | Purpose |
| --- | --- | --- | --- |
| Top level | `text_type` | `general` | Selects the OCR pipeline |
| `DocPreprocessor` | `use_doc_orientation_classify` | `True` | Enables PP-LCNet_x1_0_doc_ori |
| `DocPreprocessor` | `use_doc_unwarping` | `True` | Enables UVDoc |
| `TextDetection` | `model_name` | `PP-OCRv6_medium_det` | Default detector |
| `TextDetection` | `limit_side_len` / `max_side_limit` | `64` / `4000` | Bound the image side lengths used when resizing before detection; related to the cropping-padding concern raised in [#1663](https://github.com/PaddlePaddle/PaddleOCR/issues/1663) |
| `TextDetection` | `thresh` / `box_thresh` / `unclip_ratio` | `0.3` / `0.6` / `1.5` | DB-style post-processing: binarization threshold, minimum box score, and box expansion ratio |
| `TextRecognition` | `model_name` | `PP-OCRv6_medium_rec` | Default recognizer |
| `TextRecognition` | `batch_size` | `6` | Throughput knob |
| `TextRecognition` | `score_thresh` | `0.0` | Minimum recognition score for keeping a text line; at `0.0` no result is filtered out |

Every `model_dir: null` entry means PaddleX will resolve the artifact from its model zoo at first run, which is the convention all other YAMLs in the project follow.
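For readers working from Python, the table can be mirrored as a nested dictionary with a small override helper. This is a sketch for illustration only: the values are copied from the table above, while the dictionary nesting and the `apply_overrides` helper are assumptions and not the PaddleX configuration loader.

```python
import copy
from typing import Any, Dict

# Values copied from the table above; the nesting is an illustrative assumption.
DEFAULT_OCR_CONFIG: Dict[str, Any] = {
    "pipeline_name": "OCR",
    "text_type": "general",
    "DocPreprocessor": {
        "use_doc_orientation_classify": True,
        "use_doc_unwarping": True,
    },
    "TextDetection": {
        "model_name": "PP-OCRv6_medium_det",
        "model_dir": None,  # None mirrors `model_dir: null`
        "limit_side_len": 64,
        "max_side_limit": 4000,
        "thresh": 0.3,
        "box_thresh": 0.6,
        "unclip_ratio": 1.5,
    },
    "TextRecognition": {
        "model_name": "PP-OCRv6_medium_rec",
        "model_dir": None,
        "batch_size": 6,
        "score_thresh": 0.0,
    },
}


def apply_overrides(config: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a deep copy of `config` with dotted-key overrides applied."""
    merged = copy.deepcopy(config)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = merged
        for name in parents:
            node = node[name]  # raises KeyError for an unknown section
        if leaf not in node:
            raise KeyError(f"unknown key: {dotted_key}")
        node[leaf] = value
    return merged


# Hypothetical tuning: accept lower-scoring boxes without touching the defaults.
tuned = apply_overrides(DEFAULT_OCR_CONFIG, {"TextDetection.box_thresh": 0.5})
```

Rejecting unknown keys is a deliberate choice in this sketch: a misspelled parameter fails loudly instead of being silently ignored.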

## See Also

- [PP-OCRv6 Release Notes (v3.7.0)](https://github.com/PaddlePaddle/PaddleOCR/blob/main/README.md) — accuracy and multilingual details.
- [PP-StructureV3 & PaddleOCR-VL-1.6](https://github.com/PaddlePaddle/PaddleOCR/blob/main/README.md) — document-parsing flagship models.
- [Deployment Overview](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/README.md) — Python, C++, Serving, Lite, and ONNX paths.
- [Official API SDKs](https://github.com/PaddlePaddle/PaddleOCR/blob/main/api_sdk/README.md) — Python / TypeScript / Go clients.
- [KIE Guide](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/kie/README.md) — SER + RE for form understanding.

---

<a id='page-pipelines'></a>

## Core Pipelines and Models (PP-OCR, PP-StructureV3, PaddleOCR-VL)

### Related Pages

Related topics: [Repository Overview and System Architecture](#page-overview), [Deployment, SDKs, and Integrations](#page-deployment), [Configuration, Training, and Customization](#page-training)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [paddleocr/_pipelines/ocr.py](https://github.com/PaddlePaddle/PaddleOCR/blob/main/paddleocr/_pipelines/ocr.py)
- [paddleocr/_pipelines/pp_structurev3.py](https://github.com/PaddlePaddle/PaddleOCR/blob/main/paddleocr/_pipelines/pp_structurev3.py)
- [paddleocr/_pipelines/paddleocr_vl.py](https://github.com/PaddlePaddle/PaddleOCR/blob/main/paddleocr/_pipelines/paddleocr_vl.py)
- [paddleocr/_pipelines/doc_understanding.py](https://github.com/PaddlePaddle/PaddleOCR/blob/main/paddleocr/_pipelines/doc_understanding.py)
- [paddleocr/_pipelines/formula_recognition.py](https://github.com/PaddlePaddle/PaddleOCR/blob/main/paddleocr/_pipelines/formula_recognition.py)
- [paddleocr/_pipelines/seal_recognition.py](https://github.com/PaddlePaddle/PaddleOCR/blob/main/paddleocr/_pipelines/seal_recognition.py)
- [README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/README.md)
- [deploy/cpp_infer/src/configs/OCR.yaml](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/cpp_infer/src/configs/OCR.yaml)
- [ppstructure/layout/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/layout/README.md)
- [ppstructure/table/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/table/README.md)
- [ppstructure/kie/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/kie/README.md)
- [ppstructure/recovery/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/recovery/README.md)
- [deploy/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/README.md)
- [api_sdk/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/api_sdk/README.md)
</details>

# Core Pipelines and Models (PP-OCR, PP-StructureV3, PaddleOCR-VL)

## 1. Purpose and Scope

PaddleOCR exposes three primary solution families through its Python pipeline layer, each addressing a different class of document-understanding workload:

- **PP-OCRv6**: a fast, multilingual scene-text spotting stack with roughly 34.5M parameters that covers 50+ languages in a single unified model (Source: [README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/README.md)).
- **PP-StructureV3**: a structure-aware converter that turns complex PDFs and images into Markdown or JSON with fine-grained coordinates (table cells, text blocks) (Source: [README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/README.md)).
- **PaddleOCR-VL (0.9B)**: the flagship vision-language model for document parsing, achieving 96.3% on OmniDocBench v1.6 with structured Markdown/JSON output (Source: [README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/README.md)).

The implementation surface is the `paddleocr/_pipelines/` package, which contains thin orchestration classes — `ocr.py`, `pp_structurev3.py`, `paddleocr_vl.py`, plus auxiliary modules such as `doc_understanding.py`, `formula_recognition.py`, and `seal_recognition.py`. These pipelines share a common input/output contract so users can switch between them without rewriting client code.

## 2. Pipeline Architecture

The three pipelines are complementary rather than overlapping. The relationship is shown below.

```mermaid
graph LR
    A[Image / PDF Input] --> B{Use case}
    B -->|Scene text| C[PP-OCRv6]
    B -->|Structured PDF / layout| D[PP-StructureV3]
    B -->|VLM parsing| E[PaddleOCR-VL]
    C --> F[Text + boxes]
    D --> G[Markdown / JSON + cells]
    E --> H[Markdown / JSON elements]
    D -. layout .-> I[ppstructure/layout]
    D -. table .-> J[ppstructure/table]
    D -. recovery .-> K[ppstructure/recovery]
    D -. KIE .-> L[ppstructure/kie]
```

`PP-OCRv6` is the high-throughput path for plain text extraction. `PP-StructureV3` composes four PP-Structure subsystems to produce document-level Markdown/JSON with explicit cell and block coordinates: layout analysis (Source: [ppstructure/layout/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/layout/README.md)), table recognition (Source: [ppstructure/table/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/table/README.md)), layout recovery (Source: [ppstructure/recovery/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/recovery/README.md)), and Key Information Extraction (Source: [ppstructure/kie/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/kie/README.md)). `PaddleOCR-VL` collapses detection, recognition, layout, table, and formula tasks into a single end-to-end model when maximum accuracy on irregular layouts is required.

## 3. PP-OCRv6 Configuration

The canonical C++ configuration mirrors the Python pipeline and exposes every module name as a tunable parameter (Source: [deploy/cpp_infer/src/configs/OCR.yaml](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/cpp_infer/src/configs/OCR.yaml)).

| Module | Default model | Key knobs |
| --- | --- | --- |
| DocOrientationClassify | `PP-LCNet_x1_0_doc_ori` | toggled via `use_doc_preprocessor` |
| DocUnwarping | `UVDoc` | toggled via `use_doc_preprocessor` |
| TextDetection | `PP-OCRv6_medium_det` | `thresh`, `box_thresh`, `unclip_ratio`, `limit_side_len` |
| TextLineOrientation | `PP-LCNet_x1_0_textline_ori` | `use_textline_orientation`, `batch_size` |
| TextRecognition | `PP-OCRv6_medium_rec` | `score_thresh`, `batch_size` |

Two top-level flags control the doc preprocessor and textline orientation stages, so a deployment can disable orientation handling for already-clean scans without removing the YAML keys. The same composition is reflected in `paddleocr/_pipelines/ocr.py`, which is the Python entry point exposed to users (Source: [paddleocr/_pipelines/ocr.py](https://github.com/PaddlePaddle/PaddleOCR/blob/main/paddleocr/_pipelines/ocr.py)). A common production failure mode reported by the community is empty recognition output when the preprocessor strips content that the detector expects. Adjusting `box_thresh` and `unclip_ratio` (a stricter `box_thresh` discards more candidate boxes), or disabling `use_textline_orientation`, is the documented workaround (cf. community issue "图片识别没有文字输出" ("Image recognition produces no text output"), [#17974](https://github.com/PaddlePaddle/PaddleOCR/issues/17974)).
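The effect of the two flags on stage order can be sketched with placeholder stages. This is a toy model under stated assumptions, not the code in `paddleocr/_pipelines/ocr.py`: the flag and stage names come from the table above, while `make_stage`, `build_stages`, and `run` are hypothetical helpers that only record which stages executed.

```python
from typing import Callable, Dict, List, Optional

State = Dict[str, list]
Stage = Callable[[State], State]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that only records its name in the trace."""
    def stage(state: State) -> State:
        return {**state, "trace": state["trace"] + [name]}
    return stage


def build_stages(use_doc_preprocessor: bool = True,
                 use_textline_orientation: bool = True) -> List[Stage]:
    """Assemble the stage list; detection and recognition always run."""
    stages: List[Stage] = []
    if use_doc_preprocessor:
        stages.append(make_stage("DocPreprocessor"))
    stages.append(make_stage("TextDetection"))
    if use_textline_orientation:
        stages.append(make_stage("TextLineOrientation"))
    stages.append(make_stage("TextRecognition"))
    return stages


def run(stages: List[Stage], state: Optional[State] = None) -> State:
    """Feed the state through each stage in order."""
    state = {"trace": []} if state is None else state
    for stage in stages:
        state = stage(state)
    return state


full = run(build_stages())["trace"]
clean_scan = run(build_stages(False, False))["trace"]
```

With both flags off, only detection and recognition remain, which matches the clean-scan case described above.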

## 4. PP-StructureV3 and Auxiliary Pipelines

`pp_structurev3.py` is the orchestrator that wires the four `ppstructure/*` submodules into a single end-to-end document-parsing call (Source: [paddleocr/_pipelines/pp_structurev3.py](https://github.com/PaddlePaddle/PaddleOCR/blob/main/paddleocr/_pipelines/pp_structurev3.py)). Its main inputs are an image or PDF directory, model directories for layout/table/KIE, and dictionary paths; outputs are Markdown plus an HTML table string and per-element JSON (Source: [ppstructure/recovery/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/recovery/README.md)).

Specialized pipelines complement it:

- `doc_understanding.py`: language-model-based semantic parsing of detected regions.
- `formula_recognition.py`: converts mathematical expressions to LaTeX.
- `seal_recognition.py`: handles stamp / seal text extraction, a capability highlighted in the v3.4.0 release notes (Source: [README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/README.md)).

KIE is built on top of LayoutXLM and VI-LayoutXLM, supporting Semantic Entity Recognition (SER) and Relation Extraction (RE), and integrates the PP-OCR inference engine for OCR preprocessing (Source: [ppstructure/kie/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/kie/README.md)). On the Chinese XFUND benchmark, `VI-LayoutXLM` reaches 93.19% Hmean at 15.49 ms / image (Source: [ppstructure/kie/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/kie/README.md)).
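Hmean in figures like the one above is the harmonic mean of precision and recall, the same quantity as the F1 score. A minimal sketch of the computation follows; the precision and recall values in the usage line are made-up inputs, not numbers from the KIE README.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (equivalent to F1)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * precision * recall / (precision + recall)


example = hmean(0.5, 1.0)  # 2 * 0.5 * 1.0 / 1.5
```

Because the harmonic mean is pulled toward the smaller input, a model cannot reach a high Hmean by trading recall for precision or the reverse.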

## 5. PaddleOCR-VL and the v3.7.0 Stack

`paddleocr_vl.py` wraps the PaddleOCR-VL-0.9B model, which combines a NaViT-style dynamic-resolution visual encoder with the ERNIE-4.5-0.3B language model to handle text, tables, formulas, and charts in 109+ languages (Source: [README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/README.md)). The model is the recommended default when users need unified element recognition without per-task model switching.

The v3.7.0 release notes (June 2026) highlight that **PP-OCRv6** now achieves +4.6% detection and +5.1% recognition improvements over PP-OCRv5_server while "surpassing mainstream VLMs (Qwen3-VL-235B, GPT-5.5) with only 34.5M parameters", a positioning explicitly aimed at users who previously assumed VLMs were always superior (Source: [README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/README.md)). A known incompatibility is that `returnMarkdownImages=false` is currently ineffective under the default PaddleX 3.6 SDK (community issue [#18194](https://github.com/PaddlePaddle/PaddleOCR/issues/18194)), so callers relying on HPS output must pin a compatible SDK version until the issue is closed.

## 6. Deployment and SDK Surface

PaddleOCR ships multiple runtime targets so the same pipeline can be reached from different stacks (Source: [deploy/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/README.md)):

- **Python inference** via the `paddleocr` package.
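- **C++ inference** configured through `deploy/cpp_infer/src/configs/OCR.yaml` (Source: [deploy/cpp_infer/src/configs/OCR.yaml](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/cpp_infer/src/configs/OCR.yaml)).
- **HubServing** exposing nine service modules on ports 8865–8872, including `ocr_det`, `ocr_cls`, `ocr_rec`, `ocr_system`, `structure_table`, `structure_system`, `structure_layout`, `kie_ser`, and `kie_ser_re` (Source: [deploy/hubserving/readme.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/hubserving/readme.md)).
- **Official API SDKs** in Python, TypeScript (`api_sdk/typescript/package.json`, Node ≥ 18, Apache-2.0), and Go (Source: [api_sdk/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/api_sdk/README.md) and [api_sdk/typescript/package.json](https://github.com/PaddlePaddle/PaddleOCR/blob/main/api_sdk/typescript/package.json)).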

When choosing among the three core pipelines, the practical rule of thumb is: use **PP-OCRv6** when speed and language breadth matter most; use **PP-StructureV3** when downstream consumers need cell-level coordinates, recoverable Word output, or KIE; use **PaddleOCR-VL** when document layouts are highly irregular (skewed, warped, photographed) and structured Markdown is the primary deliverable (Source: [README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/README.md)).
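The rule of thumb above can be written as a small decision function. This is a sketch, not project code: `choose_pipeline` and its parameter names are hypothetical, and the precedence (irregular layout first, then structural needs, then the fast default) is an assumption about how to break ties when several conditions hold.

```python
def choose_pipeline(needs_cell_coordinates: bool = False,
                    needs_word_recovery_or_kie: bool = False,
                    irregular_layout: bool = False) -> str:
    """Map workload traits to one of the three core pipelines."""
    # Assumed precedence: irregular layouts outweigh the other traits.
    if irregular_layout:
        return "PaddleOCR-VL"
    if needs_cell_coordinates or needs_word_recovery_or_kie:
        return "PP-StructureV3"
    # Default: speed and language breadth matter most.
    return "PP-OCRv6"


receipt_photo = choose_pipeline(irregular_layout=True)
financial_table = choose_pipeline(needs_cell_coordinates=True)
street_sign = choose_pipeline()
```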

## See Also

- PP-OCR Deployment Guide: [deploy/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/README.md)
- Layout Analysis Module: [ppstructure/layout/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/layout/README.md)
- Table Recognition Module: [ppstructure/table/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/table/README.md)
- Layout Recovery Module: [ppstructure/recovery/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/recovery/README.md)
- Key Information Extraction: [ppstructure/kie/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/kie/README.md)
- Official API SDKs: [api_sdk/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/api_sdk/README.md)
- HubServing Module Reference: [deploy/hubserving/readme.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/hubserving/readme.md)

---

<a id='page-deployment'></a>

## Deployment, SDKs, and Integrations

### Related Pages

Related topics: [Repository Overview and System Architecture](#page-overview), [Core Pipelines and Models (PP-OCR, PP-StructureV3, PaddleOCR-VL)](#page-pipelines)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/README.md)
- [deploy/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/README.md)
- [deploy/lite/readme.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/lite/readme.md)
- [api_sdk/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/api_sdk/README.md)
- [api_sdk/typescript/package.json](https://github.com/PaddlePaddle/PaddleOCR/blob/main/api_sdk/typescript/package.json)
- [paddleocr-js/package.json](https://github.com/PaddlePaddle/PaddleOCR/blob/main/paddleocr-js/package.json)
- [ppstructure/layout/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/layout/README.md)
- [ppstructure/kie/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/kie/README.md)
- [ppstructure/recovery/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/recovery/README.md)
- [deploy/android_demo/app/src/main/cpp/ocr_clipper.hpp](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/android_demo/app/src/main/cpp/ocr_clipper.hpp)
- [deploy/android_demo/app/src/main/cpp/ocr_clipper.cpp](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/android_demo/app/src/main/cpp/ocr_clipper.cpp)
</details>

# Deployment, SDKs, and Integrations

## 1. Overview

PaddleOCR is a multilingual, document-parsing OCR toolkit that converts PDFs and images into structured, LLM-ready Markdown or JSON. Beyond its core inference engines, the project ships a layered deployment and integration surface that targets three audiences: server-side integrators who need REST or gRPC serving, application developers who consume Python/TypeScript/Go/JavaScript SDKs, and edge/mobile teams that deploy via Paddle-Lite or native Android. Source: [README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/README.md).

The repository organizes this surface into five concrete sub-trees: `deploy/` for native and serving targets, `api_sdk/` for the official PaddleOCR Cloud API client packages, `paddleocr-js/` for the browser-oriented JavaScript client, `ppstructure/` for downstream document-AI modules, and `deploy/android_demo/` for the on-device Android sample.

## 2. Server-Side Deployment

### 2.1 Paddle Deployment Matrix

PaddleOCR supports a range of server-side deployment options through the `deploy/` directory. According to [deploy/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/README.md), the supported schemes are:

| Deployment Target | Use Case | Source Path |
| --- | --- | --- |
| Python inference | Quick prototyping, batch scripts | `doc/doc_en/inference_ppocr_en.md` |
| C++ inference | High-throughput production servers | `deploy/cpp_infer/readme.md` |
| Paddle Serving (Python/C++) | REST/gRPC microservice | `deploy/pdserving/README.md` |
| Paddle2ONNX | Export to ONNX for cross-framework use | `deploy/paddle2onnx/readme.md` |
| Paddle-Lite | ARM CPU / OpenCL ARM GPU | `deploy/lite/readme.md` |

The deployment overview explicitly notes that the PaddlePaddle runtime "provides a variety of deployment schemes to meet the deployment requirements of different scenarios" and refers users to the diagram at `../doc/deployment_en.png` for selection guidance. Source: [deploy/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/README.md).

### 2.2 Paddle-Lite Mobile Path

For on-device deployment, [deploy/lite/readme.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/lite/readme.md) describes a two-phase flow: (1) prepare a cross-compilation environment (Docker, Linux, or other supported toolchains) and a Paddle-Lite toolchain, then (2) optimize the inference model with Paddle-Lite's converter and run the resulting model on an ARMv7/ARMv8 phone. Paddle-Lite itself is positioned as "a lightweight inference engine for PaddlePaddle" that targets mobile and IoT form factors, supporting cross-platform hardware acceleration. Source: [deploy/lite/readme.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/lite/readme.md).

## 3. Official API SDKs

### 3.1 Multi-Language Client Packages

The `api_sdk/` directory hosts the first-party SDKs that wrap the hosted PaddleOCR Cloud API. The package locations are summarized in [api_sdk/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/api_sdk/README.md):

| Language | Source Location | User Docs |
| --- | --- | --- |
| Python | `../paddleocr` | `docs/version3.x/inference_deployment/serving/paddleocr_official_api/python.md` |
| TypeScript | `api_sdk/typescript` | `docs/version3.x/inference_deployment/serving/paddleocr_official_api/typescript.md` |
| Go | `api_sdk/go` | `docs/version3.x/inference_deployment/serving/paddleocr_official_api/go.md` |

Each language binding is validated through its own test runner: `python -m pytest tests/api_client/`, `npm run lint && npm test` for TypeScript, and `go test ./...` for Go. Source: [api_sdk/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/api_sdk/README.md).

### 3.2 TypeScript and JavaScript Build Profiles

The TypeScript SDK is built with `tsup` and typed against `@types/node ^25.9.1` on Node `>=18`, with `vitest` as its test runner. Its package metadata lists the keywords `ocr`, `document-parsing`, `api-sdk`, `typescript`, and `official-api` alongside `paddleocr`. Source: [api_sdk/typescript/package.json](https://github.com/PaddlePaddle/PaddleOCR/blob/main/api_sdk/typescript/package.json).

The browser-oriented `paddleocr-js/` package uses `vitest ^3.2.4` for testing, `eslint` with `typescript-eslint ^8.57.2` for linting, and `prettier ^3.8.1` for formatting, with `lint-staged` configured to run `eslint --fix` and `prettier --write` on staged files. Source: [paddleocr-js/package.json](https://github.com/PaddlePaddle/PaddleOCR/blob/main/paddleocr-js/package.json).

## 4. Edge and Mobile: Android Demo

The Android sample under `deploy/android_demo/` ships a native C++ pipeline that performs polygon clipping for text-region processing. The C++ source wraps a translated Delphi Clipper library, exposed via `ocr_clipper.hpp` with the namespace `ClipperLib` and version string `CLIPPER_VERSION "6.4.2"`. Source: [deploy/android_demo/app/src/main/cpp/ocr_clipper.hpp](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/android_demo/app/src/main/cpp/ocr_clipper.hpp).

The companion `ocr_clipper.cpp` defines the supporting scanline data structures (`TEdge`, `IntPoint`), clip types for boolean operations (`ctIntersection`, `ctUnion`, `ctDifference`, `ctXor`), and constants such as `pi = 3.141592653589793238` and `def_arc_tolerance = 0.25`. These primitives are the geometric foundation that the on-device pipeline uses to merge, intersect, or offset text polygons before recognition. Source: [deploy/android_demo/app/src/main/cpp/ocr_clipper.cpp](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/android_demo/app/src/main/cpp/ocr_clipper.cpp).
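To illustrate what offsetting a text polygon involves, the sketch below computes the expansion distance used by DB-style post-processing: polygon area times the unclip ratio, divided by the perimeter. It is written in Python for readability and assumes the Android C++ path applies the same rule, which the cited sources do not confirm. The actual offsetting is left to a Clipper-style library; the helper names here are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(points: List[Point]) -> float:
    """Absolute polygon area via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points: List[Point]) -> float:
    """Sum of edge lengths, including the closing edge."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:] + points[:1]))


def unclip_distance(points: List[Point], unclip_ratio: float = 1.5) -> float:
    """Offset distance for expanding a shrunk text polygon (DB-style rule)."""
    perimeter = polygon_perimeter(points)
    if perimeter == 0:
        return 0.0  # degenerate polygon, nothing to expand
    return polygon_area(points) * unclip_ratio / perimeter


# A 100 x 20 text box: area 2000, perimeter 240.
box = [(0.0, 0.0), (100.0, 0.0), (100.0, 20.0), (0.0, 20.0)]
distance = unclip_distance(box)  # 2000 * 1.5 / 240 = 12.5
```

The distance scales with the box's area-to-perimeter ratio, so thin text lines are expanded less than tall ones for the same `unclip_ratio`.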

## 5. PP-Structure Downstream Modules

The `ppstructure/` tree extends PaddleOCR into document-AI workflows and is tightly coupled to deployment, since the same pipelines can be served through the Python inference or C++ paths.

- **Layout analysis** provides Chinese, English, and table-region detection built on PaddleDetection's PP-PicoDet. Models are available in `ppstructure/docs/models_list_en.md`, and the README documents the PubLayNet and CDLA pre-training data download commands. Source: [ppstructure/layout/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/layout/README.md).
- **Key Information Extraction (KIE)** combines text detection, text recognition, semantic entity recognition (SER), and optional relation extraction (RE) on top of the VI-LayoutXLM backbone, with pretrained models published in `configs/kie/layoutlm_series/`. Source: [ppstructure/kie/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/kie/README.md).
- **Layout recovery** offers two strategies for restoring an editable Word file: a `pdf2docx`-based path for standard PDFs and an image-format PDF path that combines layout analysis, table recognition, and rule-based parsing. Source: [ppstructure/recovery/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/recovery/README.md).

## 6. Ecosystem Integrations

PaddleOCR is consumed by several top-tier open-source projects; the README badges list RAGFlow (deep document understanding), Pathway (real-time analytics and LLM pipelines), MinerU (multi-type document to Markdown), Umi-OCR (batch offline OCR), Cherry Studio (multi-LLM desktop client), and Haystack (deepset's RAG framework). These integrations typically consume the Python wheel directly or the PaddleOCR-VL/PP-OCRv6 model checkpoints, depending on the host project's deployment shape. Source: [README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/README.md).

## 7. Common Failure Modes

Community-reported issues that intersect with the deployment and SDK surface include:

- **PaddleOCR-VL HPS option ignored on PaddleX 3.6**: `returnMarkdownImages=false` does not take effect with the default PaddleX 3.6 SDK, requiring either an SDK upgrade or a workaround. Source: [Issue #18194](https://github.com/PaddlePaddle/PaddleOCR/issues/18194).
- **No text output for image input**: Symptom of misconfigured detection or recognition parameters at the SDK or serving layer. Source: [Issue #17974](https://github.com/PaddlePaddle/PaddleOCR/issues/17974).
- **Windows + torch compatibility**: `OSError [WinError 127]` when installing torch on Windows, which is a prerequisite for some PaddleOCR-VL pipelines. Source: [Issue #14979](https://github.com/PaddlePaddle/PaddleOCR/issues/14979).
- **Detection crop padding sensitivity**: Long detection crops with large surrounding padding (≈5 px) degrade recognition; a tighter 1–2 px bounding box via OpenCV post-processing is the community-recommended workaround. Source: [Issue #1663](https://github.com/PaddlePaddle/PaddleOCR/issues/1663).
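
The tighter-box workaround from the last item can be sketched without OpenCV: find the bounding box of foreground pixels in a binarized crop and keep only a small margin around it. This is a pure-Python stand-in under stated assumptions. The `tight_box` helper and the list-of-lists mask format are hypothetical, and the code discussed in the issue may differ.

```python
from typing import List, Optional, Tuple

# (left, top, right, bottom); right and bottom are exclusive.
Box = Tuple[int, int, int, int]


def tight_box(mask: List[List[int]], margin: int = 1) -> Optional[Box]:
    """Bounding box of nonzero pixels, grown by `margin` and clamped to the mask."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None  # no foreground pixels in the crop
    height, width = len(mask), len(mask[0])
    cols = [x for x in range(width) if any(row[x] for row in mask)]
    left = max(cols[0] - margin, 0)
    top = max(rows[0] - margin, 0)
    right = min(cols[-1] + 1 + margin, width)
    bottom = min(rows[-1] + 1 + margin, height)
    return (left, top, right, bottom)


# A crop of 7 rows by 9 columns with foreground on row 3, columns 4 and 5.
mask = [[0] * 9 for _ in range(7)]
mask[3][4] = mask[3][5] = 1
box = tight_box(mask, margin=1)  # (3, 2, 7, 5)
```

Cropping with `mask[top:bottom]` and `row[left:right]` then yields the text with a one-pixel border instead of the wider padding described in the issue.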

## 8. See Also

- [PP-OCRv6 / PaddleOCR-VL Model Overview](https://github.com/PaddlePaddle/PaddleOCR/blob/main/README.md)
- [PP-Structure Layout Analysis](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/layout/README.md)
- [PP-Structure KIE](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/kie/README.md)
- [PP-Structure Layout Recovery](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/recovery/README.md)
- [PaddleOCR TypeScript SDK](https://github.com/PaddlePaddle/PaddleOCR/blob/main/api_sdk/typescript/package.json)

---

<a id='page-training'></a>

## Configuration, Training, and Customization

### Related Pages

Related topics: [Core Pipelines and Models (PP-OCR, PP-StructureV3, PaddleOCR-VL)](#page-pipelines), [Deployment, SDKs, and Integrations](#page-deployment)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/README.md)
- [deploy/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/README.md)
- [deploy/cpp_infer/src/configs/OCR.yaml](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/cpp_infer/src/configs/OCR.yaml)
- [deploy/lite/readme.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/lite/readme.md)
- [ppstructure/kie/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/kie/README.md)
- [ppstructure/layout/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/layout/README.md)
- [ppstructure/recovery/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/recovery/README.md)
- [api_sdk/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/api_sdk/README.md)
- [api_sdk/typescript/package.json](https://github.com/PaddlePaddle/PaddleOCR/blob/main/api_sdk/typescript/package.json)
- [paddleocr-js/package.json](https://github.com/PaddlePaddle/PaddleOCR/blob/main/paddleocr-js/package.json)
</details>

# Configuration, Training, and Customization

## Overview and Scope

PaddleOCR is a multilingual OCR and document-parsing toolkit that ships a layered configuration and training system. Users can adopt pretrained models out of the box, or retrain and customize virtually every component — text detection, recognition, layout analysis, table recognition, key information extraction (KIE), and VLM-based parsing — to fit domain-specific data. The customization surface is exposed through three primary channels: YAML pipeline definitions, configuration files for individual modules, and per-language scripts under `ppstructure/` [Source: [README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/README.md)].

The project supports PP-OCRv6, PaddleOCR-VL, and PP-StructureV3 as headline models, and provides unified configuration paths for them. Customization typically follows a "config first, then train, then deploy" pattern.

## Pipeline Configuration

PaddleOCR's production pipeline is described by a single YAML file that maps model names, module names, and hyperparameters. The canonical example is the C++ inference configuration [Source: [deploy/cpp_infer/src/configs/OCR.yaml](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/cpp_infer/src/configs/OCR.yaml)]:

```yaml
pipeline_name: OCR
text_type: general
use_doc_preprocessor: True
use_textline_orientation: True

SubPipelines:
  DocPreprocessor:
    pipeline_name: doc_preprocessor
    use_doc_orientation_classify: True
    use_doc_unwarping: True
    SubModules:
      DocOrientationClassify:
        module_name: doc_text_orientation
        model_name: PP-LCNet_x1_0_doc_ori
      DocUnwarping:
        module_name: image_unwarping
        model_name: UVDoc

SubModules:
  TextDetection:
    module_name: text_detection
    model_name: PP-OCRv6_medium_det
    limit_side_len: 64
    limit_type: min
    thresh: 0.3
    box_thresh: 0.6
    unclip_ratio: 1.5
  TextRecognition:
    module_name: text_recognition
    model_name: PP-OCRv6_medium_rec
    batch_size: 6
    score_thresh: 0.0
```

Key configuration patterns observed in the YAML:

| Field | Purpose | Example Value |
| --- | --- | --- |
| `pipeline_name` | Declares the high-level pipeline | `OCR`, `doc_preprocessor` |
| `use_doc_preprocessor` | Toggles orientation classification + unwarping | `True` |
| `model_name` | Selects a pretrained model checkpoint | `PP-OCRv6_medium_det` |
| `module_name` | Maps a model to its runtime module | `text_detection` |
| `limit_side_len` / `thresh` / `box_thresh` | Detection hyper-parameters | `64`, `0.3`, `0.6` |
| `unclip_ratio` | Expansion ratio for detected polygons | `1.5` |
| `batch_size` / `score_thresh` | Recognition throughput and confidence gate | `6`, `0.0` |
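
To make the detection fields more concrete, the sketch below shows how they are commonly interpreted in DB-style text detection. It is an illustration under stated assumptions rather than the runtime's actual code: the function names are hypothetical, the resize rule ignores any rounding to a stride multiple, and the expansion formula (`area * unclip_ratio / perimeter`) is the one conventionally used in DB post-processing.

```python
def resize_ratio(height: int, width: int,
                 limit_side_len: int = 64, limit_type: str = "min") -> float:
    """Scale factor applied before detection (assumed semantics).

    "min": upscale so the shorter side is at least `limit_side_len`.
    "max": downscale so the longer side is at most `limit_side_len`.
    """
    if limit_type == "min":
        short = min(height, width)
        return limit_side_len / short if short < limit_side_len else 1.0
    if limit_type == "max":
        long_side = max(height, width)
        return limit_side_len / long_side if long_side > limit_side_len else 1.0
    raise ValueError(f"unsupported limit_type: {limit_type!r}")


def unclip_distance(area: float, perimeter: float, unclip_ratio: float = 1.5) -> float:
    """Offset (in pixels) by which a detected polygon is expanded outward."""
    return area * unclip_ratio / perimeter


def keep_box(box_score: float, box_thresh: float = 0.6) -> bool:
    """A candidate box survives only if its mean score reaches `box_thresh`."""
    return box_score >= box_thresh


# A 100x20 px text box: area 2000, perimeter 240 -> expanded by 12.5 px per side.
offset = unclip_distance(area=100 * 20, perimeter=2 * (100 + 20))
```

Under these assumptions, `limit_side_len: 64` with `limit_type: min` leaves most images untouched and only upscales very small inputs, while a larger `unclip_ratio` yields looser boxes around each text line.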

Swapping `model_name` is the primary way to switch between server, mobile, and multilingual variants. Each module can also carry a `model_dir` field, which is omitted from the excerpt above: `model_dir: null` defers model resolution to the runtime, while a populated `model_dir` points the module at a local model and overrides the default download [Source: [deploy/cpp_infer/src/configs/OCR.yaml:1-39](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/cpp_infer/src/configs/OCR.yaml)].

```mermaid
flowchart LR
    A[YAML Pipeline] --> B[DocPreprocessor]
    B --> C[TextDetection]
    C --> D[TextLineOrientation]
    D --> E[TextRecognition]
    E --> F[Structured Output]
    G[Custom model_dir] --> C
    G --> E
```
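
The effect of a custom `model_dir` on the detection and recognition modules can be sketched with a plain dictionary that mirrors the YAML layout. This is a hedged illustration only: the helper `with_module_overrides`, the model name `my_finetuned_det`, and the path `./output/det_infer` are hypothetical, and a real setup would load and save the YAML file rather than hard-code it.

```python
import copy

# Mirrors the relevant part of the YAML above; `model_dir: None` stands for `null`.
BASE_CONFIG = {
    "pipeline_name": "OCR",
    "SubModules": {
        "TextDetection": {
            "module_name": "text_detection",
            "model_name": "PP-OCRv6_medium_det",
            "model_dir": None,
        },
        "TextRecognition": {
            "module_name": "text_recognition",
            "model_name": "PP-OCRv6_medium_rec",
            "model_dir": None,
        },
    },
}


def with_module_overrides(config: dict, module: str, **fields) -> dict:
    """Return a copy of `config` with `fields` applied to one sub-module."""
    updated = copy.deepcopy(config)  # never mutate the shared base config
    try:
        target = updated["SubModules"][module]
    except KeyError:
        raise KeyError(f"unknown sub-module: {module}") from None
    target.update(fields)
    return updated


# Hypothetical fine-tuned detector exported to a local directory.
custom = with_module_overrides(
    BASE_CONFIG,
    "TextDetection",
    model_name="my_finetuned_det",
    model_dir="./output/det_infer",
)
```

Copying the base configuration before editing keeps the default pipeline available for comparison against the customized one.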

## Training Workflows

PaddleOCR exposes a uniform "download pretrained weights → prepare data → train → export → infer" loop. Each sub-module follows it.

**Layout Analysis.** Training relies on the PP-PicoDet detector from PaddleDetection. The repository documents pretrained downloads such as `picodet_lcnet_x1_0_fgd_layout.pdparams` for the PubLayNet dataset, and notes that Chinese CDLA and table-specific variants exist for other document types. FGD distillation is supported for accuracy improvements [Source: [ppstructure/layout/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/layout/README.md)].

**Key Information Extraction (KIE).** The KIE pipeline covers two tasks: semantic entity recognition (SER) and relation extraction (RE). The repository ships LayoutXLM and VI-LayoutXLM configurations under `configs/kie/`, with a `re_layoutxlm_xfund_zh.yml` example reported at 74.83% Hmean. Customization paths include UDML knowledge distillation and textline sorting to fit reading order [Source: [ppstructure/kie/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/kie/README.md)].
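
The KIE README reports its SER and RE scores as Hmean, the harmonic mean of precision and recall. The short sketch below shows the computation; the function name and the sample precision/recall values are illustrative and are not taken from the repository.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (equivalent to the F1 score)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing was predicted or matched
    return 2 * precision * recall / (precision + recall)


# Illustrative values only: precision 0.80 and recall 0.70 give roughly 0.747.
score = hmean(0.80, 0.70)
```

Because the harmonic mean is dominated by the smaller of the two inputs, a model cannot reach a high Hmean by trading recall for precision or vice versa.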

**Layout Recovery.** For PDF-to-Word recovery, two custom strategies are available: a rule-based `pdf2docx` path for standard PDFs, and an image-driven path that combines layout analysis, table recognition, and unwarping for image-based PDFs. Users can choose between them based on input format [Source: [ppstructure/recovery/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/ppstructure/recovery/README.md)].

## Customization and Deployment Surfaces

Beyond core training, PaddleOCR is customizable along several axes:

- **Multilingual switching.** A single PP-OCRv6 model supports 50 languages (including Chinese, English, Japanese, and 46 Latin-script languages), removing the need to swap checkpoints per locale [Source: [README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/README.md)].
- **VLM parsing.** PaddleOCR-VL integrates a NaViT-style visual encoder with ERNIE-4.5-0.3B. PaddleOCR-VL-1.5 reaches 94.5% on OmniDocBench, supports 111 languages, and adds PP-DocLayoutV3 for irregular layouts (skew, warping, scanning, illumination, screen photography).
- **Deployment targets.** Customization extends to deployment: Python inference, C++ inference (`deploy/cpp_infer`), Paddle Serving, Paddle-Lite for ARM/OpenCL, and Paddle2ONNX for cross-framework export [Source: [deploy/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/README.md)].
- **Mobile deployment.** Paddle-Lite deployment proceeds in three steps: set up a cross-compilation toolchain, run Paddle-Lite's model optimization, and execute the optimized model with a phone-side runner [Source: [deploy/lite/readme.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/lite/readme.md)].
- **API SDKs.** Official SDKs in Python, TypeScript, and Go enable service integration. The TypeScript SDK requires Node ≥ 18 and bundles tsup/vitest tooling [Source: [api_sdk/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/api_sdk/README.md), [api_sdk/typescript/package.json](https://github.com/PaddlePaddle/PaddleOCR/blob/main/api_sdk/typescript/package.json)].

## Common Failure Modes from the Community

Three patterns from community discussions are worth flagging when customizing:

- **Border/whitespace sensitivity in recognition.** Issue #1663 reports that when detection crops carry wide (≈5px) borders, recognition accuracy degrades noticeably compared to tight 1–2px crops, because training data was synthesized with tight borders. The proposed mitigation is to post-process detected crops (e.g., re-crop to a tight bounding rectangle) before recognition.
- **Silent recognition failures.** Issue #17974 documents cases where images yield no text output, often traced to pipeline configuration (e.g., `use_textline_orientation` disabled, aggressive `score_thresh`, or an inappropriate `limit_side_len` for tiny text). Verifying the YAML and lowering thresholds typically restores output.
- **SDK/HPS parameter drift.** Issue #18194 reports that the PaddleOCR-VL HPS option `returnMarkdownImages=false` is ignored under the default PaddleX 3.6 SDK, illustrating that SDK-side configuration must be validated against the installed runtime, not just the latest docs.
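
When an image yields no text, a quick pass over the pipeline configuration can rule out the settings named above. The sketch below is a hypothetical helper, not part of PaddleOCR: the function name `lint_ocr_config` and every numeric cutoff in it are assumptions chosen for illustration, and the input is a dictionary that mirrors the YAML structure shown earlier on this page.

```python
from typing import List


def lint_ocr_config(config: dict) -> List[str]:
    """Return human-readable warnings for settings that often suppress output."""
    warnings: List[str] = []
    modules = config.get("SubModules", {})
    det = modules.get("TextDetection", {})
    rec = modules.get("TextRecognition", {})

    if not config.get("use_textline_orientation", False):
        warnings.append("use_textline_orientation is disabled; rotated lines may be misread")

    # Cutoffs below are illustrative assumptions, not documented limits.
    if rec.get("score_thresh", 0.0) > 0.5:
        warnings.append("score_thresh is high; low-confidence text will be dropped")
    if det.get("box_thresh", 0.6) > 0.8:
        warnings.append("box_thresh is high; faint text regions may be filtered out")
    if det.get("limit_type") == "max" and det.get("limit_side_len", 0) < 320:
        warnings.append("limit_side_len caps the long side very low; tiny text may vanish")
    return warnings


sample = {
    "use_textline_orientation": False,
    "SubModules": {
        "TextDetection": {"limit_side_len": 64, "limit_type": "min", "box_thresh": 0.6},
        "TextRecognition": {"score_thresh": 0.9},
    },
}
for message in lint_ocr_config(sample):
    print("warning:", message)
```

A check like this only narrows the search; the SDK and serving layer should still be validated against the installed runtime, as the third item notes.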

## See Also

- PaddleOCR-VL and PaddleOCR-VL-1.5 release notes — flagship VLM-based document parsing
- PP-OCRv6 architecture — unified multilingual OCR engine
- PP-StructureV3 — structure-aware Markdown/JSON conversion with cell-level coordinates
- [deploy/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/deploy/README.md) — deployment options matrix
- [api_sdk/README.md](https://github.com/PaddlePaddle/PaddleOCR/blob/main/api_sdk/README.md) — multi-language SDK layout

---


## Pitfall Log

Project: PaddlePaddle/PaddleOCR

Summary: Found 14 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.

## 1. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this installation risk before relying on the project: Link Checker Report
- User impact: Developers may fail before the first successful local run: Link Checker Report
- Evidence: failure_mode_cluster:github_issue | https://github.com/PaddlePaddle/PaddleOCR/issues/18134, failure_mode_cluster:github_issue | https://github.com/PaddlePaddle/PaddleOCR/issues/18131, failure_mode_cluster:github_issue | https://github.com/PaddlePaddle/PaddleOCR/issues/18126, failure_mode_cluster:github_issue | https://github.com/PaddlePaddle/PaddleOCR/issues/18122, failure_mode_cluster:github_issue | https://github.com/PaddlePaddle/PaddleOCR/issues/18103

## 2. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: failure_mode_cluster:github_issue | https://github.com/PaddlePaddle/PaddleOCR/issues/17974

## 3. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/PaddlePaddle/PaddleOCR/issues/18157

## 4. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/PaddlePaddle/PaddleOCR/issues/18194

## 5. Installation risk - Installation risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags an installation risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/PaddlePaddle/PaddleOCR/issues/17974

## 6. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: Link Checker Report
- User impact: Developers may misconfigure credentials, environment, or host setup: Link Checker Report
- Evidence: failure_mode_cluster:github_issue | https://github.com/PaddlePaddle/PaddleOCR/issues/18157

## 7. Configuration risk - Configuration risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Developers should check this configuration risk before relying on the project: PaddleOCR-VL HPS: returnMarkdownImages=false is ineffective with default PaddleX 3.6 SDK
- User impact: Developers may misconfigure credentials, environment, or host setup: PaddleOCR-VL HPS: returnMarkdownImages=false is ineffective with default PaddleX 3.6 SDK
- Evidence: failure_mode_cluster:github_issue | https://github.com/PaddlePaddle/PaddleOCR/issues/18194

## 8. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | https://github.com/PaddlePaddle/PaddleOCR

## 9. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/PaddlePaddle/PaddleOCR

## 10. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | https://github.com/PaddlePaddle/PaddleOCR

## 11. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | https://github.com/PaddlePaddle/PaddleOCR

## 12. Runtime risk - Runtime risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: Developers should check this performance risk before relying on the project: v3.7.0
- User impact: Upgrade or migration may change expected behavior: v3.7.0
- Evidence: failure_mode_cluster:github_release | https://github.com/PaddlePaddle/PaddleOCR/releases/tag/v3.7.0

## 13. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/PaddlePaddle/PaddleOCR

## 14. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | https://github.com/PaddlePaddle/PaddleOCR

<!-- canonical_name: PaddlePaddle/PaddleOCR; human_manual_source: deepwiki_human_wiki -->
