🏆 Accepted to ACM MM 2025 Dataset Track!

UniSVG has been officially accepted to the ACM Multimedia 2025 Dataset Track 🎉
🌐 Project Page | 🏆 Conference Website

UniSVG Dataset

UniSVG is a comprehensive dataset designed for unified SVG generation (from textual prompts and images) and SVG understanding (color, category, usage, etc.). It comprises 525k data items tailored for Multi-modal Large Language Model (MLLM) training and evaluation. You can access the dataset on Hugging Face.

UniSVG Example

🔥 Release

[2025/11/27]

  • 🔥 We are glad to announce that our UniSVG benchmark is used by Qwen3-VL!

[2025/09/22]

[2025/07/31]

[2025/06/03]

[2025/05/30]

Project Homepage

For more information, please visit the project homepage.

Dataset Summary

Unlike bitmap images, scalable vector graphics (SVG) maintain quality when scaled and are frequently employed in computer vision and artistic design, represented as SVG code. In this era of proliferating AI-powered systems, enabling AI to understand and generate SVG has become increasingly urgent. However, AI-driven SVG understanding and generation (U&G) remain significant challenges. SVG code, which is equivalent to a set of curves and lines controlled by floating-point parameters, demands high precision in SVG U&G. In addition, SVG generation operates under diverse conditional constraints, including textual prompts and visual references, which requires powerful multi-modal processing for condition-to-SVG transformation. Recently, the rapid growth of Multi-modal Large Language Models (MLLMs) has demonstrated the capability to process multi-modal inputs and generate complex vector controlling parameters, suggesting the potential to address SVG U&G tasks within a unified model. To unlock MLLMs' capabilities in the SVG area, we propose an SVG-centric dataset called UniSVG, comprising 525k data items tailored for MLLM training and evaluation. To the best of our knowledge, it is the first comprehensive dataset designed for unified SVG generation (from textual prompts and images) and SVG understanding (color, category, usage, etc.).

Usage

To access the dataset, first install the datasets library from Hugging Face:

pip install datasets

Here is an example of how to load and use the dataset:

from datasets import load_dataset

# Load the dataset (returns a DatasetDict keyed by split)
UniSVG_dataset = load_dataset("lili24/UniSVG")

# Print the first example (assuming a "train" split; check UniSVG_dataset.keys() if unsure)
print(UniSVG_dataset["train"][0])
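
If you want to see which splits and fields are available before indexing into the dataset, the following minimal sketch uses only the standard datasets API and assumes nothing about UniSVG-specific column names:

from datasets import load_dataset

UniSVG_dataset = load_dataset("lili24/UniSVG")

# List every split with its size and column schema
for split_name, split in UniSVG_dataset.items():
    print(split_name, len(split), split.features)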

Prompt examples

Data construction prompt examples

Please refer to prompts/data_construction_example.py for detailed information.

Inference prompt examples

Please refer to prompts/Inference_prompts_examples.py for detailed information.

Finetuning example

After downloading our UniSVG dataset, you can finetune your preferred models on UniSVG or a subset of it. We have tried finetuning the following MLLMs, so please feel free to use them: LLaVA 1.5, LLaVA-LLaMA, LLaVA-Next, GLM 4V, LLaMA 3.2, Qwen 2.5 VL.

Then convert your downloaded UniSVG dataset into the LLaMA-Factory format by modifying and running the following two Python scripts:

# Make sure you modify these files before using them!
python utils/transfer_to_llava.py
python utils/transfer_to_llama_factory.py

As an example, we used the LLaMA-Factory framework for finetuning. We have saved a copy of the LLaMA-Factory repo here for your convenience.

Then add the converted LLaMA-Factory UniSVG JSON to "train/qwen25_llama32/LLaMA-Factory/data", and register it in "train/qwen25_llama32/LLaMA-Factory/data/dataset_info.json" by adding the following entry (our provided repo already includes this):

"unisvg": {
  "file_name": "llama_UniSVG_train.json",
  "formatting": "sharegpt",
  "columns": {
    "messages": "messages",
    "images": "images"
  },
  "tags": {
    "role_tag": "role",
    "content_tag": "content",
    "user_tag": "user",
    "assistant_tag": "assistant"
  }
}
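
For reference, each record in llama_UniSVG_train.json is expected to follow the ShareGPT layout declared above (a messages list with role/content pairs plus an images list). The sketch below writes one such record from Python; the instruction text, SVG answer, and image path are illustrative placeholders, not actual UniSVG items:

import json

# One ShareGPT-style record matching the "unisvg" entry in dataset_info.json.
# The prompt, SVG answer, and image path below are placeholders for illustration only.
example_record = {
    "messages": [
        {"role": "user", "content": "<image>Generate SVG code that reproduces this image."},
        {"role": "assistant", "content": "<svg viewBox=\"0 0 100 100\">...</svg>"},
    ],
    "images": ["data/unisvg_images/example_0001.png"],
}

# llama_UniSVG_train.json holds a list of such records
with open("llama_UniSVG_train.json", "w") as f:
    json.dump([example_record], f, indent=2)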

Congratulations, your UniSVG dataset is now ready for finetuning! We provide an example finetuning bash script using DeepSpeed under LLaMA-Factory; please refer to /train/qwen25_llama32/train.sh.

⚠️ Special Warning: If you are interested in the multi-stage training of LLaVA 1.5, LLaVA-LLaMA, and LLaVA-Next, LLaMA-Factory may not support it; please use the LLaVA repo for finetuning instead. We also provide the saved LLaVA repo for easy use here. In particular, we modified the training scripts to make them suitable for finetuning LLaVA-LLaMA; for more information, please see the training scripts here.

Evaluation example

After finetuning, edit the inference code for your model and run inference with:

python infer.py

This produces an inference JSON file containing the model answers. Then modify and run evaluation.py to get the final score:

python evaluation.py

Acknowledgement

This repo benefits from LLaMA-Factory and LLaVA. Thanks for their great work!

Citation

If you use this dataset in your research, please cite the following paper:

@inproceedings{li2025unisvg,
  title={UniSVG: A Unified Dataset for Vector Graphic Understanding and Generation with Multimodal Large Language Models},
  author={Li, Jinke and Yu, Jiarui and Wei, Chenxing and Dong, Hande and Lin, Qiang and Yang, Liangjing and Wang, Zhicai and Hao, Yanbin},
  booktitle={Proceedings of the 33rd ACM International Conference on Multimedia},
  year={2025}
}
