[P] Nanonets-OCR2: An Open-Source Image-to-Markdown Model with LaTeX, Tables, flowcharts, handwritten docs, checkboxes & More

By skyforbes, Nov 28, 2025

We're excited to share Nanonets-OCR2, a state-of-the-art suite of models designed for advanced image-to-markdown conversion and Visual Question Answering (VQA).

🔍 Key Features:

LaTeX Equation Recognition: Automatically converts mathematical equations and formulas into properly formatted LaTeX syntax. It distinguishes between inline ($...$) and display ($$...$$) equations.
Intelligent Image Description: Describes images within documents using structured <img> tags, making them digestible for LLM processing. It can describe various image types, including logos, charts, and graphs, detailing their content, style, and context.
Signature Detection & Isolation: Identifies and isolates signatures from other text, outputting them within a <signature> tag. This is crucial for processing legal and business documents.
Watermark Extraction: Detects and extracts watermark text from documents, placing it within a <watermark> tag.
Smart Checkbox Handling: Converts form checkboxes and radio buttons into standardized Unicode symbols (☐, ☑, ☒) for consistent and reliable processing.
Complex Table Extraction: Accurately extracts complex tables from documents and converts them into both markdown and HTML table formats.
Flow charts & Organisational charts: Extracts flow charts and organisational charts as Mermaid code.
Handwritten Documents: The model is trained on handwritten documents across multiple languages.
Multilingual: The model is trained on documents in multiple languages, including English, Chinese, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Arabic, and many more.
Visual Question Answering (VQA): The model is designed to provide the answer directly if it is present in the document; otherwise, it responds with "Not mentioned."
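
To give a feel for how the tagged output described above might be consumed downstream, here is a minimal Python sketch. It does not call the model. The `SAMPLE` string is a hand-written, hypothetical example of what the output could look like, and the closing-tag layout (`</signature>`, `</watermark>`, `</img>`), the helper names, and the meaning assigned to each checkbox symbol are assumptions for illustration, not something confirmed by the model's documentation.

```python
import re

# Hypothetical model output. The exact tag layout is an assumption based on
# the feature list above, not a captured response from Nanonets-OCR2.
SAMPLE = """\
# Service Agreement
<watermark>CONFIDENTIAL</watermark>
☑ I accept the terms
☐ Subscribe to newsletter
☒ Opt out of data sharing
<img>Company logo: a blue hexagon</img>
<signature>Jane Doe</signature>
"""

# Assumed interpretation of the three Unicode symbols listed above.
CHECKBOX_STATES = {"☐": "unchecked", "☑": "checked", "☒": "crossed"}


def extract_tag(text, tag):
    """Return the stripped contents of every <tag>...</tag> pair in text."""
    name = re.escape(tag)
    pattern = re.compile(rf"<{name}>(.*?)</{name}>", re.DOTALL)
    return [match.strip() for match in pattern.findall(text)]


def extract_checkboxes(text):
    """Return (state, label) pairs for lines that start with a checkbox symbol."""
    results = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and stripped[0] in CHECKBOX_STATES:
            state = CHECKBOX_STATES[stripped[0]]
            results.append((state, stripped[1:].strip()))
    return results


if __name__ == "__main__":
    print("signatures:", extract_tag(SAMPLE, "signature"))
    print("watermarks:", extract_tag(SAMPLE, "watermark"))
    print("images:", extract_tag(SAMPLE, "img"))
    for state, label in extract_checkboxes(SAMPLE):
        print(f"{state}: {label}")
```

Plain regular expressions are enough here because the tags in the sample are flat and not nested. If real output nests tags or adds attributes, a proper HTML parser would be the safer choice.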

🖥️ Live Demo

📢 Blog

⌨️ GitHub

🤗 Huggingface models

Document with equation

Document with complex checkboxes

Quarterly Report (Please use Markdown (Financial Docs) for best results in the docstrange demo)

Signatures

Mermaid code for flowchart

Visual Question Answering

Feel free to try it out and share your feedback.
