
Nanonets OCR2 3B

Advanced OCR with structured Markdown, semantic tagging, and document feature extraction

Nanonets-OCR2-3B is a 3-billion-parameter vision-language OCR model. It produces structured Markdown or HTML output with semantic tagging, handles signatures, watermarks, checkboxes, flowcharts, and handwriting, and supports multiple languages.

When to use:

  • Extracting text from invoices, forms, and scanned documents
  • Preserving document structure (headings, tables, lists) in output
  • Documents containing signatures or watermarks that need to be identified

Input: Image or document file (PNG, JPG, PDF, TIFF) + optional fine-tuned checkpoint

Output: Extracted text, formatted output (Markdown/HTML), and OCR metadata
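As a rough illustration of the input constraint, a pre-flight check could reject unsupported files before they reach the model. This is only a sketch: the helper `is_supported_input` and the constant `SUPPORTED_EXTENSIONS` are hypothetical names, not part of the product, and the extension set simply mirrors the four formats listed above (variants such as `.jpeg` or `.tif` are not listed, so they are deliberately left out here).

```python
from pathlib import Path

# Extensions mirror the formats named in the input description (PNG, JPG, PDF, TIFF).
# The set and the helper below are illustrative assumptions, not a documented API.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".pdf", ".tiff"}


def is_supported_input(path: str) -> bool:
    """Return True if the file extension matches one of the listed input formats."""
    # Compare case-insensitively so "SCAN.PDF" and "scan.pdf" are treated alike.
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

A caller might use this to filter a batch, for example `[p for p in paths if is_supported_input(p)]`, and report the skipped files separately.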

Model Settings

Output Format (default: markdown, required, options: markdown / html) Format of the structured output.

  • markdown: Best for downstream text processing and LLM pipelines
  • html: Best for rendering in a browser or preserving visual layout

Extract Signatures (default: true) Detect and tag signatures in the document.

  • Enable when processing contracts or signed forms
  • Disable if signatures are not relevant to save processing time

Extract Watermarks (default: true) Detect and tag watermarks.

  • Enable for compliance or verification use cases
  • Disable for clean text extraction where watermarks are noise
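The three settings above could be captured in a small configuration object so that invalid values are caught early. The sketch below is an assumption about how a client might model them: the class name `OcrSettings` and its field names are hypothetical, while the defaults and allowed values are taken from the descriptions above.

```python
from dataclasses import dataclass

# Allowed values come from the "Output Format" setting described above.
ALLOWED_OUTPUT_FORMATS = ("markdown", "html")


@dataclass(frozen=True)
class OcrSettings:
    """Hypothetical container for the documented model settings."""

    output_format: str = "markdown"   # documented default: markdown
    extract_signatures: bool = True   # documented default: true
    extract_watermarks: bool = True   # documented default: true

    def __post_init__(self) -> None:
        # Reject anything outside the two documented output formats.
        if self.output_format not in ALLOWED_OUTPUT_FORMATS:
            raise ValueError(
                f"output_format must be one of {ALLOWED_OUTPUT_FORMATS}, "
                f"got {self.output_format!r}"
            )


# Example: clean text extraction for an LLM pipeline, treating watermarks as noise.
clean_text_settings = OcrSettings(output_format="markdown", extract_watermarks=False)
```

Making the object immutable (`frozen=True`) is a design choice for this sketch, so that a settings instance cannot change after validation.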
