MinerU 2.5 Document Parsing: What It Is and Why It Matters
The previous article introduced MonkeyOCR. In this part, we focus on MinerU 2.5, a document parsing tool/model designed to convert complex documents (especially PDFs) into structured, LLM-ready outputs such as Markdown/JSON, with strong layout understanding for multi-column text, tables, and formulas.
MinerU 2.5 is an official release from the MinerU project: a 1.2B-parameter vision-language model for document parsing. Its release materials report strong results on document benchmarks such as OmniDocBench.
Note on model comparisons: MinerU’s release notes report benchmark advantages over several mainstream VLMs and specialized tools on OmniDocBench, but other independent model releases/papers may report different outcomes under different settings. Treat “best” as benchmark- and setup-dependent.
MinerU 2.5 is especially useful for practical workflows such as building RAG knowledge bases and large-scale document extraction, where preserving layout, tables, and formulas matters more than plain OCR text.
Quick Comparison: MinerU vs OCR-Only Tools
MinerU is not just “OCR.” It is closer to a document parsing pipeline that emphasizes structure + layout + export formats (Markdown/JSON/PDF-like reconstruction), which tends to matter more for downstream search, retrieval, and RAG ingestion than raw text alone.
Step 1: Environment Preparation
1.1 Check CUDA and GPU (Optional but Recommended)
# Check the installed CUDA toolkit version (requires CUDA 11.8 or higher)
nvcc --version
# Check GPU status, memory, and the highest CUDA version the driver supports
nvidia-smi
If you do not have an NVIDIA GPU, you can still run MinerU on CPU, but it will be slower.
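The same check can be scripted so that later steps pick a device automatically. The following is a minimal sketch, not part of MinerU: the function name detect_device is hypothetical, and it assumes that a working nvidia-smi on the PATH is a good enough signal that a CUDA device is usable.

```python
import shutil
import subprocess


def detect_device(which=shutil.which, run=subprocess.run):
    """Return "cuda" if nvidia-smi is on the PATH and exits cleanly, else "cpu".

    `which` and `run` are injectable so the logic can be exercised
    without a GPU; by default they are the standard-library functions.
    """
    if which("nvidia-smi") is None:
        return "cpu"
    try:
        result = run(["nvidia-smi"], capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        # The binary exists but could not be executed or hung.
        return "cpu"
    return "cuda" if result.returncode == 0 else "cpu"
```

The returned string matches the values passed to --device in Step 4. A driver that is installed but mismatched with PyTorch can still fail later, so treat the result as a hint rather than a guarantee.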
1.2 Create a Dedicated Conda Environment (Python 3.10)
Option 1: Default path
conda create -n mineru python=3.10
conda activate mineru
Option 2: Custom path (Windows example)
conda create --prefix=D:\Computer\Anaconda\envs\mineru python=3.10
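conda activate D:\Computer\Anaconda\envs\mineru
(An environment created with --prefix is activated by its path. Activating by name only works if the parent folder is one of conda's configured envs directories.)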
Step 2: Install MinerU (GPU/CPU)
2.1 Install MinerU Core
pip install uv
pip uninstall mineru -y
uv pip install -U "mineru[core]" -i https://mirrors.aliyun.com/pypi/simple
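MinerU is available as a PyPI package. The -i option above points at the Aliyun PyPI mirror; omit it to install from the default PyPI index. The uninstall step simply clears any previously installed version before the fresh install.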
2.2 Install PyTorch (GPU users)
Install the PyTorch build that matches your CUDA version. Example for CUDA 12.1:
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu121
(If you are on a different CUDA version, use the corresponding PyTorch index URL.)
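To make the "corresponding index URL" concrete, here is a small sketch that derives the URL from a CUDA version string. The helper name pytorch_index_url is hypothetical. It only follows the URL pattern used in the example above and assumes PyTorch actually publishes wheels for that CUDA version, which is not true for every version, so confirm against the PyTorch install page.

```python
def pytorch_index_url(cuda_version: str) -> str:
    """Map a CUDA version such as "12.1" to a PyTorch wheel index URL.

    Only MAJOR.MINOR is used; a patch component like "12.1.105" is ignored.
    This does not check that wheels exist for the resulting tag.
    """
    parts = cuda_version.strip().split(".")
    if len(parts) < 2 or not all(part.isdigit() for part in parts[:2]):
        raise ValueError(f"expected MAJOR.MINOR, got {cuda_version!r}")
    major, minor = parts[0], parts[1]
    return f"https://download.pytorch.org/whl/cu{major}{minor}"


# Example: the CUDA 12.1 case from the text.
print(pytorch_index_url("12.1"))
```

The resulting string is what goes after --index-url in the pip command.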
2.3 Verify Installation
mineru --version
mineru --help
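Beyond the two CLI checks, it can help to confirm from inside the environment which package versions were actually installed. This is a sketch using only the standard library; the function names are hypothetical and the package list simply mirrors what this guide installs.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("mineru", "torch", "torchvision")):
    """Collect versions for the given packages into a dict."""
    return {name: installed_version(name) for name in packages}


# Prints one line per package; "None" means it is not installed here.
for name, version in environment_report().items():
    print(f"{name}: {version}")
```

If mineru shows None, the environment that is active is probably not the one the install ran in.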
Step 3: Download Model Files
MinerU needs its model files on disk before it can parse anything. The simplest approach is to download every model set in one go.
mineru-models-download --model_type all
(The download can be large. If it fails partway, rerun the same command.)
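The retry advice can be automated. The sketch below wraps the download command from this step in a retry loop with exponential backoff. The function name and the retry parameters are assumptions for illustration, not MinerU features, and the command runs with the terminal attached so any prompts the downloader shows remain visible.

```python
import subprocess
import time


def download_models(max_attempts=3, base_delay=5.0,
                    run=subprocess.run, sleep=time.sleep):
    """Run the model download, retrying on a non-zero exit code.

    Returns the attempt number that succeeded. Waits base_delay,
    then 2x, 4x, ... seconds between attempts.
    """
    command = ["mineru-models-download", "--model_type", "all"]
    for attempt in range(1, max_attempts + 1):
        result = run(command)
        if result.returncode == 0:
            return attempt
        if attempt < max_attempts:
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"model download failed after {max_attempts} attempts")
```

Calling download_models() with the defaults makes up to three attempts. Whether a rerun resumes or restarts a partial download depends on the downloader, so the sketch makes no assumption about that.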
Step 4: Run a Functional Test (Pipeline vs VLM)
4.1 Prepare Test Folders
mkdir test_pdfs
mkdir test_output
Put PDFs into test_pdfs/.
4.2 Pipeline Mode (Faster, Good Default)
# Parse a single PDF
mineru -p ./test_pdfs/your_file.pdf -o ./test_output/ --backend pipeline
# GPU acceleration (if available)
mineru -p ./test_pdfs/your_file.pdf -o ./test_output/ --backend pipeline --device cuda
4.3 VLM Mode (Higher Precision, Slower)
# VLM mode on GPU
mineru -p ./test_pdfs/your_file.pdf -o ./test_output/ --backend vlm-transformers --device cuda
# VLM mode on CPU
mineru -p ./test_pdfs/your_file.pdf -o ./test_output/ --backend vlm-transformers --device cpu
When to choose which
- Pipeline mode: batch parsing, speed-first, stable default.
- VLM mode: maximum fidelity on complex layouts, tables, and formulas, at the cost of slower runtime.
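For automation, the choice between the two modes can be reduced to a small command builder. This is a sketch with hypothetical function names; it only assembles the flags already shown in 4.2 and 4.3 (-p, -o, --backend, --device) and does not run anything.

```python
BACKENDS = ("pipeline", "vlm-transformers")


def choose_backend(needs_max_fidelity: bool) -> str:
    """Pick VLM mode for fidelity-first jobs, pipeline mode otherwise."""
    return "vlm-transformers" if needs_max_fidelity else "pipeline"


def build_mineru_command(pdf, output_dir, backend="pipeline", device=None):
    """Return the mineru CLI call as an argument list for subprocess.run."""
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend {backend!r}; expected one of {BACKENDS}")
    command = ["mineru", "-p", str(pdf), "-o", str(output_dir), "--backend", backend]
    if device is not None:
        command += ["--device", device]
    return command


# Example: a formula-heavy paper parsed on GPU.
print(build_mineru_command("./test_pdfs/your_file.pdf", "./test_output/",
                           choose_backend(True), "cuda"))
```

Returning a list rather than a string avoids shell quoting problems with file names that contain spaces.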
Parsed output commonly includes a .md (Markdown) file representing the extracted structure.
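Since the Markdown file is what usually feeds a RAG index, a common next step is to locate it and split it into sections. The sketch below is one possible approach, not MinerU's own tooling: it assumes the output uses ATX-style "#" headings, and it does not special-case "#" lines inside code blocks.

```python
import re
from pathlib import Path

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def find_markdown(output_dir):
    """Return every .md file under output_dir, sorted by path."""
    return sorted(Path(output_dir).rglob("*.md"))


def split_by_heading(markdown_text):
    """Split Markdown into (heading, body) pairs.

    Text before the first heading is returned with an empty heading.
    Empty leading sections are skipped.
    """
    sections = []
    heading, body = "", []

    def flush():
        if heading or any(line.strip() for line in body):
            sections.append((heading, "\n".join(body).strip()))

    for line in markdown_text.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            heading, body = match.group(2).strip(), []
        else:
            body.append(line)
    flush()
    return sections
```

A typical use is to loop over find_markdown("./test_output"), read each file as UTF-8 text, and pass the text to split_by_heading to get chunks keyed by section title.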
4.4 Batch Processing a Folder
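mineru -p ./test_pdfs -o ./test_output/ --backend pipeline --batch-size 8
(Passing a folder to -p parses the PDFs inside it. If your installed version does not list --batch-size under mineru --help, drop that flag.)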
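An alternative to handing MinerU the whole folder is to call it once per file, so one bad PDF does not obscure which files succeeded. This is a sketch under that assumption; parse_many is a hypothetical helper, and it is slower than the folder form because each call starts a new process.

```python
import subprocess
from pathlib import Path


def parse_many(pdfs, output_dir, backend="pipeline", run=subprocess.run):
    """Run mineru once per PDF and return (succeeded, failed) file names.

    `run` is injectable for testing; by default it is subprocess.run.
    """
    succeeded, failed = [], []
    for pdf in pdfs:
        command = ["mineru", "-p", str(pdf), "-o", str(output_dir),
                   "--backend", backend]
        result = run(command)
        bucket = succeeded if result.returncode == 0 else failed
        bucket.append(Path(pdf).name)
    return succeeded, failed
```

For the test folder from 4.1, the input list would be sorted(Path("test_pdfs").glob("*.pdf")) and the output directory "test_output". The failed list can then be retried in VLM mode or inspected by hand.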
Step 5: Launch the Web Interface (Gradio)
MinerU provides a browser UI for interactive parsing and review.
5.1 Start the Web Service
conda activate mineru
mineru-gradio --server-port 8080
5.2 Open in Browser
Visit:
http://localhost:8080/
5.3 Troubleshooting
Change the port (if 8080 is already in use)
mineru-gradio --server-port 7860
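Before switching ports by trial and error, it is possible to check which port is free. The sketch below uses only the standard socket module; the function names and the candidate list are assumptions, with 8080 and 7860 taken from the commands above.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(candidates=(8080, 7860, 7861)):
    """Return the first bindable port from candidates, or None."""
    for port in candidates:
        if port_is_free(port):
            return port
    return None
```

The value returned by first_free_port() can be passed to --server-port. Another process could still grab the port between the check and the launch, so this narrows the search rather than reserving the port.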
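Windows network reset (admin CMD; a last resort, since these commands reset the Winsock catalog and TCP/IP settings and flush the DNS cache)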
netsh winsock reset
netsh int ip reset
ipconfig /flushdns
Reboot and retry.
Step 6: Online Experience
If you want a quick trial without any local setup, MinerU also offers an official online version that runs in the browser.
Key Usage Reminders
- Activate the environment
  conda activate mineru
- Two usage methods
  - CLI mode: best for batch processing and automation
  - Web UI mode: best for daily use and manual review
- Performance tips
  - Prefer GPU if available
  - Prefer pipeline for speed and scale
  - Use VLM for maximum accuracy on complex documents