# OCR Model Benchmarks
Last updated: April 16, 2026
## Reports
- Hard English Benchmark — 17 models tested on 9 difficult English pages (tables, infographics, KPI tables, diagrams). Winner: LightOnOCR-2 (1B) at 92.8/100, 2.7s/page, 2 GB VRAM.
- General Model Comparison — 20+ models benchmarked on English + Arabic documents. Covers speed, quality, stability, Arabic support, VRAM, and vLLM compatibility.
## Directory Structure
- `baselines/` — ground truth files (9 pages), created by Claude via visual inspection at 150 DPI
- `results/` — raw model outputs organized by benchmark type (`hard_english/`, `arabic_tables/`, `texts/`)
- `benchmark_hard_english.py` — main hard English benchmark script (vLLM-compatible models)
- `benchmark_got_ocr_hard.py` — standalone GOT-OCR 2.0 benchmark (HuggingFace transformers)
- `benchmark_arabic_tables.py` — Arabic table extraction benchmark
- `benchmark_table_comparison.py` — table-specific comparison across models
- `benchmark_primary_layout.py` — DotsOCR layout detection benchmark
- `benchmark_secondary_vlm.py` — secondary VLM (Qwen) benchmark for Picture crops
- `benchmark_stack_comparison.py` — full pipeline E2E comparison
- `run_hard_english_benchmark.sh` — shell runner for sequential model benchmarking
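To illustrate the overall scoring flow (model output in `results/` compared against a ground-truth page in `baselines/`), here is a minimal sketch. This is not the metric the benchmark scripts actually use — it swaps in a simple `difflib` character-similarity ratio, and the file layout in the comments is an assumption based on the directory list above:

```python
# Illustrative sketch only: score one model's OCR output against a
# ground-truth baseline page. The real benchmark scripts may use a
# different metric; paths like results/hard_english/<model>/page_01.md
# are hypothetical examples of the layout described above.
from difflib import SequenceMatcher


def page_score(model_output: str, baseline: str) -> float:
    """Return a 0-100 similarity score between OCR output and ground truth."""
    return 100.0 * SequenceMatcher(None, model_output, baseline).ratio()


# Example with inline strings; a real run would read the two files
# from results/ and baselines/ respectively.
truth = "Revenue grew 12% year over year."
ocr = "Revenue grew 12% year over year"
print(f"{page_score(ocr, truth):.1f}")
```

Averaging such per-page scores across the 9 baseline pages would yield a single per-model number comparable to the /100 scores quoted in the reports, though the actual grading rubric may differ.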