
DanceSkyCode/General-Visual-Quality-RL


👀✨🖼️Disentangled Reinforcement Learning for Robust Visual Quality Assessment

Project Page
Zehui Feng, Tian Qiu, Tong Wu, Huayuan Xu, Ting Han*
Shanghai Jiao Tong University, Zhejiang University
* denotes the corresponding author

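The first no-reference quality assessment (NR-QA) model trained with reinforcement-learning-to-rank-score (RL2RS), capable of both quality reasoning and quality rating across image (IQA) and video (VQA) quality assessment tasks.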


🏁 Method Overview. (a) Existing score and ranking reward functions assign minimal differences between responses, which leads to distribution collapse or robustness failures. (b) PreResIQA-R1 focuses on fine-grained response-ranking reward balance and preference. (c) PreResIQA-R1 achieves state-of-the-art performance and stable image quality assessment with a discriminative reward. (d) Typical qualitative and quantitative comparison between VisualQuality-R1 and PreResIQA-R1, showing superior performance on image quality description and scoring.


🏁 Overall training framework of PreResIQA-R1 via reinforcement-learning-to-rank-score (RL2RS). Given an image batch with a shared text prompt, PreResIQA-R1 generates K responses. To quickly activate differences in chain-of-thought (CoT) reasoning and then assess generation stability, we introduce a response penalty and a fine-grained triplet-response balance reward. To jointly improve the robustness of ranking and scoring, we introduce a preference pairwise-and-triplet score-and-ranking reward for GRPO.
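GRPO compares the K responses sampled for the same input against each other instead of against a learned value baseline. The sketch below shows the usual group-relative normalization step only. It is a minimal illustration, not the repository's training code: the function name `group_relative_advantages`, the `eps` constant, the use of the population standard deviation, and the example reward values are all assumptions made here.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize the K rewards of one response group to zero mean and unit scale.

    `rewards` holds one scalar reward per sampled response. `eps` (an assumed
    constant) avoids division by zero when all responses get the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use the sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for K = 4 responses to the same image and prompt.
rewards = [0.2, 0.8, 0.5, 0.5]
advantages = group_relative_advantages(rewards)
```

Responses with above-average reward receive a positive advantage and are reinforced; when every response in a group receives the same reward, all advantages are zero and the group contributes no learning signal, which is one reason a discriminative reward matters.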

🏁 Pipeline of Preference-Response Disentangled Policy Optimization (PRPO), which applies a response ranking and balance reward, a preference pairwise score-and-ranking reward, and a preference triplet ranking reward to optimize group policy learning.
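To make the idea of a pairwise ranking reward concrete, here is a deliberately simplified sketch: an image's predicted score is rewarded by the fraction of other images in the batch whose ground-truth ordering it reproduces. This is not the reward defined by PRPO, whose exact formulation is given in the paper and code. The function name `pairwise_ranking_reward`, its arguments, and the example values are hypothetical.

```python
def pairwise_ranking_reward(pred_scores, mos, index):
    """Fraction of comparisons against image `index` that are ordered correctly.

    pred_scores: predicted quality scores, one per image in the batch.
    mos:         ground-truth mean opinion scores, same order as pred_scores.
    index:       position of the image whose response is being rewarded.
    """
    agree = 0
    total = 0
    for j in range(len(mos)):
        if j == index or mos[j] == mos[index]:
            continue  # skip the image itself and ground-truth ties
        total += 1
        # The pair is ordered correctly when both differences share a sign.
        same_order = (pred_scores[index] - pred_scores[j]) * (mos[index] - mos[j]) > 0
        agree += int(same_order)
    return agree / total if total else 0.0


# Hypothetical batch of three images.
pred = [3.1, 4.0, 2.0]
mos = [3.0, 4.5, 3.5]
batch_rewards = [pairwise_ranking_reward(pred, mos, i) for i in range(len(pred))]
```

A reward of this kind depends only on relative order, so it is typically paired with a score-based term that penalizes the distance between the predicted score and the MOS.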

✨ Update

[2025/10/30] 💻💻💻 We release the training and inference code of PreResVQA-R1 for video quality assessment. To extend beyond static imagery, we introduce a global-temporal and local-spatial data flow strategy. With only 28K samples, it achieves state-of-the-art performance across 5 VQA datasets while providing an interpretable CoT process.

[2025/10/27] 🤗🤗🤗 We release PreResIQA-R1-7B, fine-tuned from Qwen2.5-VL-7B-Instruct.

[2025/10/23] 💻💻💻 We release the training and inference code of PreResIQA-R1 for image quality assessment, a preference-response disentangled reinforcement learning framework that unifies score regression and ranking consistency via reasoning-driven optimization. With only 6K samples, it achieves state-of-the-art performance across 10 IQA datasets while providing an interpretable CoT process.

🔧Environment setup

Create a conda environment containing the packages needed to run our scripts on A100 and A800 GPUs:

conda create -n PreResQ python=3.11
conda activate PreResQ

bash setup.sh

🚀Quick Training and Inference

1. Quick Start: Reinforcement-Learning Fine-Tuning

For IQA task:

bash run_scripts/KADID-10K/one_node_run_KADID_PreResIQA_R1.sh \
    --model_name_or_path [Qwen2.5-VL-7B-Instruct path] \
    --image_folders [dataset images path] \
    --data_file_paths [JSON MOS_Ground_Truth file path]

For VQA task:

bash run_scripts/KADID-10K/one_node_run_LSVQ_PreResVQA_R1.sh \
    --model_name_or_path [your PreResIQA-R1 path] \
    --image_folders [dataset images path] \
    --data_file_paths [JSON MOS_Ground_Truth file path]

2. Quick Batch Sample Inference

For IQA task:

python src/inference_PreResIQA_R1.py \
    --MODEL_PATH [PreResIQA-R1_path] \
    --image_root_path [test_image_root_path] \
    --output_root_path [output_root_path]

For VQA task:

python src/inference_PreResVQA_R1.py \
    --MODEL_PATH [PreResVQA-R1_path] \
    --image_root_path [test_image_root_path] \
    --output_root_path [output_root_path]
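After batch inference, predicted scores are conventionally compared with ground-truth MOS using the Spearman rank correlation (SRCC) and the Pearson linear correlation (PLCC). This README does not state which metrics the scripts report or how the output files are laid out, so the sketch below works on two plain lists and leaves loading them to the reader. All function names and the example values are assumptions for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation (PLCC) between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)  # undefined (division by zero) for constant input


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation (SRCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical predicted scores and ground-truth MOS for four test images.
predicted = [3.2, 4.1, 2.5, 4.8]
mos = [3.0, 4.5, 2.0, 4.4]
srcc = spearman(predicted, mos)
plcc = pearson(predicted, mos)
```

SRCC measures whether the predicted ordering of images matches the ground truth, while PLCC measures how linearly the predicted scores track the MOS values.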

😺Acknowledgements

We sincerely thank the following outstanding works and contributors:

  1. Reasoning-Induced Image Quality Assessment via Reinforcement Learning to Rank. Authors: Tianhe Wu, Jian Zou, Jie Liang, Lei Zhang, Kede Ma.

  2. VLM-R1: A stable and generalizable R1-style Large Vision-Language Model. Authors: Haozhan Shen, Peng Liu, Jingcheng Li, Chunxin Fang, Yibo Ma, Jiajia Liao, Qiaoli Shen, Zilun Zhang, Kangjia Zhao, Qianqian Zhang, Ruochen Xu, Tiancheng Zhao.


🏷️ License

This repository is released under the MIT license. See LICENSE for additional details.
