mDPO: Conditional Preference Optimization for Multimodal Large Language Models (published at EMNLP 2024).
🌐 Homepage | 📖 Paper | 💻 Code | 🤗 Dataset
- 🔥 [2024-09-04] Initial release of the mDPO trainer. We are currently working on releasing the code for training and evaluating different models.
TBD
Our training data is available at this link.
To train Bunny with mDPO, use the following command:
python bunny/run_mdpo_bunny.py
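For orientation, here is a minimal sketch of the kind of objective the trainer optimizes, following the description in the paper: a standard DPO term on responses, a conditional term that contrasts the original image with a corrupted one for the same chosen response, and an anchor term that keeps the chosen response's reward above a threshold. It is not the repository's trainer. The function name `mdpo_loss`, the dictionary keys, and the default values of `beta` and `delta` are hypothetical choices made for illustration, and the sketch assumes the per-sequence log-probabilities have already been computed.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def mdpo_loss(policy_logps: dict, ref_logps: dict,
              beta: float = 0.1, delta: float = 0.0) -> float:
    """Illustrative three-term loss for one preference example.

    Both dicts hold summed sequence log-probabilities under these
    (hypothetical) keys:
      "chosen"               -> chosen response, original image
      "rejected"             -> rejected response, original image
      "chosen_corrupt_image" -> chosen response, corrupted image
    beta and delta are illustrative defaults, not the repo's settings.
    """
    def reward(key: str) -> float:
        # Implicit reward: scaled log-ratio of policy to reference model.
        return beta * (policy_logps[key] - ref_logps[key])

    r_chosen = reward("chosen")
    # Standard DPO term: prefer the chosen over the rejected response.
    dpo = -log_sigmoid(r_chosen - reward("rejected"))
    # Conditional term: prefer the original image over the corrupted one.
    conditional = -log_sigmoid(r_chosen - reward("chosen_corrupt_image"))
    # Anchor term: keep the chosen response's reward above delta.
    anchor = -log_sigmoid(r_chosen - delta)
    return dpo + conditional + anchor


if __name__ == "__main__":
    ref = {"chosen": -5.0, "rejected": -6.0, "chosen_corrupt_image": -5.5}
    policy = {"chosen": -4.0, "rejected": -7.0, "chosen_corrupt_image": -6.0}
    print(mdpo_loss(policy, ref))
```

When the policy equals the reference model, every implicit reward is zero and each term contributes log 2, which gives a quick sanity check for any real implementation.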
TBD
Please cite the following paper if you find the repo helpful:
@article{wang2024mdpo,
title={mDPO: Conditional Preference Optimization for Multimodal Large Language Models},
author={Wang, Fei and Zhou, Wenxuan and Huang, James Y and Xu, Nan and Zhang, Sheng and Poon, Hoifung and Chen, Muhao},
journal={arXiv preprint arXiv:2406.11839},
year={2024}
}