🔗 Codebase
S1-Parser is a highly efficient multimodal text parsing tool designed for accurate and efficient parsing of complex documents. Rather than relying solely on static fine-tuning or single-stage optimization, it follows a two-stage strategy: Supervised Fine-Tuning (SFT) followed by Reinforcement Learning (RL). This fine-tunes the model on critical aspects such as formula syntax correctness, symbol integrity, and structural soundness, balancing parsing precision and efficiency across diverse document types.
- [2025/10/28] We release the Code for S1-Parser.
- 🧩 Supervised Fine-Tuning with a task-oriented tag (`[Parse Target: Scientific Equations]`) to sharpen domain adaptation.
- 🎯 Multi-stage RL to refine, stabilize, and accelerate the learning of strategic behaviors.
- 📊 Benchmarked on Scientific Literature Dataset: SCI_LLM
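The task-oriented tag above can be pictured as a prefix prepended to each SFT instruction. Below is a minimal sketch of such a sample builder; the tag usage, field names (`image`, `prompt`, `target`), and instruction wording are illustrative assumptions, not the repository's actual data schema.

```python
# Hypothetical SFT sample construction with a task-oriented parse tag.
# Field names and instruction text are assumptions for illustration only.

PARSE_TAG = "[Parse Target: Scientific Equations]"

def build_sft_sample(image_path: str, latex_target: str) -> dict:
    """Pair a document image with a tagged instruction and its LaTeX label."""
    return {
        "image": image_path,
        "prompt": f"{PARSE_TAG} Transcribe the formula in this image to LaTeX.",
        "target": latex_target,
    }

sample = build_sft_sample("figs/eq_001.png", r"E = mc^2")
```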
We recommend using Python 3.10 and PyTorch ≥ 2.7.
Install the environment:
```shell
# Recommend Python 3.10.18
git clone https://github.com/ScienceOne-AI/S1-Parser.git
cd S1-Parser
pip install -r requirements.txt
```

S1-Parser training proceeds in two stages with different designs:
```shell
# Stage 1: Supervised Fine-Tuning (SFT) to acquire fundamental LaTeX OCR ability.
bash scripts/run_train_ocr_sft_model.sh

# Stage 2: GRPO training to optimize LaTeX formula syntax, symbols, and structure.
bash scripts/run_train_ocr_grpo_model.sh
```
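The GRPO stage optimizes formula syntax, symbol integrity, and structure, which suggests a rule-based reward over the generated LaTeX. The sketch below shows what such a reward *could* look like; the specific checks, regexes, and 0.5/0.5 weighting are illustrative assumptions and not the repository's actual reward function.

```python
import re

def brackets_balanced(latex: str) -> bool:
    """Check that curly braces and \\left/\\right pairs are balanced."""
    depth = 0
    for ch in latex:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:  # closing brace with no opener
                return False
    lefts = len(re.findall(r"\\left\b", latex))
    rights = len(re.findall(r"\\right\b", latex))
    return depth == 0 and lefts == rights

def formula_reward(pred: str, ref: str) -> float:
    """Toy reward: syntax validity plus LaTeX-command overlap with the reference.

    Weights and checks are hypothetical placeholders.
    """
    syntax = 1.0 if brackets_balanced(pred) else 0.0
    pred_cmds = set(re.findall(r"\\[A-Za-z]+", pred))
    ref_cmds = set(re.findall(r"\\[A-Za-z]+", ref))
    overlap = len(pred_cmds & ref_cmds) / len(ref_cmds) if ref_cmds else 1.0
    return 0.5 * syntax + 0.5 * overlap
```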
Make sure to configure your model paths and data in `scripts/run_train_ocr_*.sh`.
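As a sketch, the paths at the top of each training script might be set like this; the variable names below are illustrative assumptions, so check the actual scripts for the real ones.

```shell
# Illustrative configuration only; the real variable names may differ.
MODEL_PATH=/path/to/base_model        # base multimodal checkpoint
TRAIN_DATA=/path/to/train_data.jsonl  # training set for this stage
OUTPUT_DIR=./checkpoints/sft          # where checkpoints are written
```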
We build on and reference the following open-source projects, and thank their authors for their contributions to the open-source community:
