PaddleOCR-VL

Vision-language OCR with handwriting support, table-to-HTML, and prompt-based extraction

PaddleOCR-VL is a vision-language OCR model with prompting capabilities. It handles handwriting, old documents, and complex layouts, converts tables and charts to HTML, and extracts embedded images directly.

When to use:

  • Documents with handwriting mixed with printed text
  • Tables and structured forms that need HTML-format output
  • Prompt-guided extraction of specific fields from documents

Input: Image or document file (PNG, JPG, PDF, TIFF) + optional fine-tuned checkpoint

Output: Extracted text, formatted output, and metadata

Model Settings

Output Format (default: markdown, required, options: markdown / json / html) Format of the OCR output.

  • markdown: Clean text with structure — best for NLP pipelines
  • json: Structured key-value output — best for programmatic field extraction
  • html: Full layout-preserving HTML — best for visual rendering

Detect Handwriting (default: true) Enable specialized handwriting recognition.

  • Enable for any document that may contain handwritten text
  • Disable for purely printed documents to improve speed

Convert Tables to HTML (default: true) Convert detected tables into HTML table elements.

  • Enable when table structure needs to be preserved
  • Disable for plain text extraction where table structure is not needed
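
The three settings above can be pictured as a small configuration object. The following is only a minimal sketch: the class name `OcrSettings` and the field names `output_format`, `detect_handwriting`, and `convert_tables_to_html` are hypothetical and chosen for illustration; they are not confirmed identifiers of this product or of PaddleOCR-VL. Only the defaults and the allowed format values are taken from the descriptions above.

```python
from dataclasses import dataclass, asdict

# Allowed values as listed under "Output Format" above.
ALLOWED_FORMATS = ("markdown", "json", "html")


@dataclass(frozen=True)
class OcrSettings:
    """Hypothetical container for the three model settings (names are assumptions)."""

    output_format: str = "markdown"       # documented default: markdown
    detect_handwriting: bool = True       # documented default: true
    convert_tables_to_html: bool = True   # documented default: true

    def __post_init__(self):
        # Reject any format that is not one of the documented options.
        if self.output_format not in ALLOWED_FORMATS:
            raise ValueError(
                f"output_format must be one of {ALLOWED_FORMATS}, "
                f"got {self.output_format!r}"
            )


# Defaults: suited to mixed documents where handwriting or tables may occur.
defaults = OcrSettings()

# Purely printed documents, plain text only: both options off to improve speed.
fast_printed = OcrSettings(detect_handwriting=False, convert_tables_to_html=False)

# Programmatic field extraction: structured key-value output.
field_extraction = OcrSettings(output_format="json")

print(asdict(defaults))
```

Validating the format up front means a misspelled option fails when the settings are built rather than after a document has been processed.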
