From Method Text to Editable SVG
AutoFigure-edit is the next version of AutoFigure. It turns paper method sections into fully editable SVG figures and lets you refine them in an embedded SVG editor.
Quick Start • Web Interface • How It Works • Configuration • Citation
| Feature | Description |
|---|---|
| 📝 Text-to-Figure | Generate a draft figure directly from method text. |
| 🧠 SAM3 Icon Detection | Detect icon regions from multiple prompts and merge overlaps. |
| 🎯 Labeled Placeholders | Insert consistent AF-style placeholders for reliable SVG mapping. |
| 🧩 SVG Generation | Produce an editable SVG template aligned to the figure. |
| 🖥️ Embedded Editor | Edit the SVG in-browser using the bundled svg-edit. |
| 📦 Artifact Outputs | Save PNG/SVG outputs and icon crops per run. |
AutoFigure-edit introduces two breakthrough capabilities:
- Fully Editable SVGs (Pure Code Implementation): Unlike raster images, our outputs are structured Scalable Vector Graphics (SVG). Every component is editable: text, shapes, and layout can be modified losslessly.
- Style Transfer: The system can mimic the artistic style of reference images provided by the user.
Below are 9 examples covering 3 different papers, each generated in 3 different reference styles. (Each image shows: Left = AutoFigure Generation | Right = Vectorized Editable SVG)
| Paper & Style Transfer Demonstration |
|---|
| CycleResearcher / Style 1 |
| CycleResearcher / Style 2 |
| CycleResearcher / Style 3 |
| DeepReviewer / Style 1 |
| DeepReviewer / Style 2 |
| DeepReviewer / Style 3 |
| DeepScientist / Style 1 |
| DeepScientist / Style 2 |
| DeepScientist / Style 3 |
The AutoFigure-edit pipeline transforms a raw generation into an editable SVG in four distinct stages:
- Generation (`figure.png`): The LLM generates a raster draft based on the method text.
- Segmentation (`sam.png`): SAM3 detects and segments distinct icons and text regions.
- Templating (`template.svg`): The system constructs a structural SVG wireframe using placeholders.
- Assembly (`final.svg`): High-quality cropped icons and vectorized text are injected into the template.
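Taken together, a single run's output directory ends up looking roughly like this (filenames are taken from the pipeline description below; the exact layout may differ between versions):

```
outputs/demo/
├── figure.png       # raster draft from the LLM
├── samed.png        # labeled mask overlay from SAM3
├── boxlib.json      # box coordinates, scores, and prompt sources
├── icons/           # cropped icon assets (*_nobg.png after RMBG-2.0)
├── template.svg     # placeholder SVG wireframe
└── final.svg        # assembled, editable result
```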
View Detailed Technical Pipeline
The AutoFigure-edit pipeline starts from the paper's method text and first calls a text-to-image LLM to render a journal-style schematic, saved as `figure.png`. The system then runs SAM3 segmentation on that image using one or more text prompts (e.g., "icon, diagram, arrow"), merges overlapping detections by an IoU-like threshold, and draws gray-filled, black-outlined labeled boxes on the original; this produces both `samed.png` (the labeled mask overlay) and a structured `boxlib.json` with coordinates, scores, and prompt sources.
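The overlap-merging step is conceptually simple. Here is a minimal, self-contained sketch of one plausible strategy (the actual logic in `autofigure2.py` may differ; the threshold corresponds to the `--merge_threshold` flag described under Configuration):

```python
def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) form."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def merge_boxes(boxes, threshold=0.5):
    """Greedily fold overlapping detections into their bounding hull."""
    merged = []
    for box in sorted(boxes, key=lambda r: -(r[2] - r[0]) * (r[3] - r[1])):
        for i, kept in enumerate(merged):
            if iou(box, kept) >= threshold:
                merged[i] = (min(kept[0], box[0]), min(kept[1], box[1]),
                             max(kept[2], box[2]), max(kept[3], box[3]))
                break
        else:
            merged.append(tuple(box))
    return merged
```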
Next, each box is cropped from the original figure and passed through RMBG-2.0 for background removal, yielding transparent icon assets under `icons/*.png` and `*_nobg.png`. With `figure.png`, `samed.png`, and `boxlib.json` as multimodal inputs, the LLM generates a placeholder-style SVG (`template.svg`) whose boxes match the labeled regions.
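The cropping half of this step is easy to picture. A minimal sketch using Pillow (the `boxlib.json` schema shown here, a list of objects with `label` and `box` keys, is an assumption; RMBG-2.0 background removal then runs on each crop to produce the `*_nobg.png` variants):

```python
import json
from pathlib import Path
from PIL import Image

def crop_icons(figure_path="figure.png", boxlib_path="boxlib.json", out_dir="icons"):
    """Crop every detected region out of the draft figure."""
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    figure = Image.open(figure_path)
    with open(boxlib_path) as f:
        entries = json.load(f)  # assumed: [{"label": ..., "box": [x1, y1, x2, y2]}, ...]
    for entry in entries:
        x1, y1, x2, y2 = entry["box"]
        figure.crop((x1, y1, x2, y2)).save(out / f"{entry['label']}.png")
```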
Optionally, the SVG is iteratively refined by an LLM optimizer to better align strokes, layouts, and styles, resulting in `optimized_template.svg` (or the original template if optimization is skipped). The system then compares the SVG dimensions with the original figure to compute scale factors and aligns coordinate systems. Finally, it replaces each placeholder in the SVG with the corresponding transparent icon (matched by label/ID), producing the assembled `final.svg`.
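Conceptually, the placeholder swap amounts to rewriting one SVG element per icon. A minimal standard-library sketch, assuming placeholders are `<rect>` elements matched by `id` (the real template's matching scheme may differ, and the real pipeline also rescales pixel coordinates into SVG units as described above):

```python
import base64
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

def inject_icon(svg_path, label, icon_path, out_path="final.svg"):
    """Turn the placeholder rect whose id matches `label` into an embedded image."""
    ET.register_namespace("", SVG_NS)
    tree = ET.parse(svg_path)
    for el in tree.getroot().iter(f"{{{SVG_NS}}}rect"):
        if el.get("id") == label:
            with open(icon_path, "rb") as f:
                data = base64.b64encode(f.read()).decode()
            el.tag = f"{{{SVG_NS}}}image"  # x/y/width/height carry over unchanged
            for attr in ("fill", "stroke", "stroke-width"):
                el.attrib.pop(attr, None)
            el.set("href", f"data:image/png;base64,{data}")
    tree.write(out_path)
```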
Key configuration details:
- Placeholder Mode: Controls how icon boxes are encoded in the prompt (`label`, `box`, or `none`).
- Optimization: `optimize_iterations=0` allows skipping the refinement step to use the raw structure directly.
```bash
# 1) Install dependencies
pip install -r requirements.txt

# 2) Install SAM3 separately (not vendored in this repo)
git clone https://github.com/facebookresearch/sam3.git
cd sam3
pip install -e .
```

Run:

```bash
python autofigure2.py \
  --method_file paper.txt \
  --output_dir outputs/demo \
  --provider bianxie \
  --api_key YOUR_KEY
```

To launch the web interface instead, start the server:

```bash
python server.py
```

Then open http://localhost:8000.
AutoFigure-edit provides a visual web interface designed for seamless generation and editing.
On the start page, paste your paper's method text on the left. On the right, configure your generation settings:
- Provider: Select your LLM provider (OpenRouter or Bianxie).
- Optimize: Set SVG template refinement iterations (recommend `0` for standard use).
- Reference Image: Upload a target image to enable style transfer.
- SAM3 Backend: Choose local SAM3 or the fal.ai API (API key optional).
The generation result loads directly into an integrated SVG-Edit canvas, allowing for full vector editing.
- Status & Logs: Check real-time progress (top-left) and view detailed execution logs (top-right button).
- Artifacts Drawer: Click the floating button (bottom-right) to expand the Artifacts Panel. This contains all intermediate outputs (icons, SVG templates, etc.). You can drag and drop any artifact directly onto the canvas for custom composition.
AutoFigure-edit depends on SAM3 but does not vendor it. Please follow the official SAM3 installation guide and prerequisites. The upstream repo currently targets Python 3.12+, PyTorch 2.7+, and CUDA 12.6 for GPU builds.
SAM3 checkpoints are hosted on Hugging Face and may require you to request access and authenticate (e.g., `huggingface-cli login`) before download.
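For example:

```bash
# One-time authentication so gated checkpoint downloads succeed
huggingface-cli login
# or non-interactively, with a token from https://huggingface.co/settings/tokens
huggingface-cli login --token "$HF_TOKEN"
```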
- SAM3 repo: https://github.com/facebookresearch/sam3
- SAM3 Hugging Face: https://huggingface.co/facebook/sam3
If you prefer not to install SAM3 locally, you can use an API backend (also supported in the Web demo). We recommend using Roboflow as it is free to use.
Option A: fal.ai

```bash
export FAL_KEY="your-fal-key"
python autofigure2.py \
  --method_file paper.txt \
  --output_dir outputs/demo \
  --provider bianxie \
  --api_key YOUR_KEY \
  --sam_backend fal
```

Option B: Roboflow

```bash
export ROBOFLOW_API_KEY="your-roboflow-key"
python autofigure2.py \
  --method_file paper.txt \
  --output_dir outputs/demo \
  --provider bianxie \
  --api_key YOUR_KEY \
  --sam_backend roboflow
```

Optional CLI flags (API):
- `--sam_api_key` (overrides `FAL_KEY`/`ROBOFLOW_API_KEY`)
- `--sam_max_masks` (default: 32, fal.ai only)
| Provider | Base URL | Notes |
|---|---|---|
| OpenRouter | `openrouter.ai/api/v1` | Supports Gemini/Claude/others |
| Bianxie | `api.bianxie.ai/v1` | OpenAI-compatible API |
Common CLI flags:
- `--provider` (openrouter | bianxie)
- `--image_model`, `--svg_model`
- `--sam_prompt` (comma-separated prompts)
- `--sam_backend` (local | fal | roboflow | api)
- `--sam_api_key` (API key override; falls back to `FAL_KEY` or `ROBOFLOW_API_KEY`)
- `--sam_max_masks` (fal.ai max masks, default 32)
- `--merge_threshold` (0 disables merging)
- `--optimize_iterations` (0 disables optimization)
- `--reference_image_path` (optional)
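For example, an illustrative invocation combining several of these flags (all values are placeholders):

```bash
python autofigure2.py \
  --method_file paper.txt \
  --output_dir outputs/demo \
  --provider openrouter \
  --api_key YOUR_KEY \
  --sam_backend local \
  --sam_prompt "icon,diagram,arrow" \
  --merge_threshold 0.5 \
  --optimize_iterations 2 \
  --reference_image_path style.png
```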
Click to expand directory tree
```
AutoFigure-edit/
├── autofigure2.py            # Main pipeline
├── server.py                 # FastAPI backend
├── requirements.txt
├── web/                      # Static frontend
│   ├── index.html
│   ├── canvas.html
│   ├── styles.css
│   ├── app.js
│   └── vendor/svg-edit/      # Embedded SVG editor
└── img/                      # README assets
```
WeChat Discussion Group
Scan the QR code to join our community. If the code is expired, please add WeChat ID nauhcutnil or contact tuchuan@mail.hfut.edu.cn.
If you find AutoFigure or FigureBench helpful, please cite:
```bibtex
@inproceedings{zhu2026autofigure,
  title={AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations},
  author={Minjun Zhu and Zhen Lin and Yixuan Weng and Panzhong Lu and Qiujie Xie and Yifan Wei and Sifan Liu and QiYao Sun and Yue Zhang},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=5N3z9JQJKq}
}

@dataset{figurebench2025,
  title = {FigureBench: A Benchmark for Automated Scientific Illustration Generation},
  author = {WestlakeNLP},
  year = {2025},
  url = {https://huggingface.co/datasets/WestlakeNLP/FigureBench}
}
```

This project is licensed under the MIT License - see LICENSE for details.














