All the papers listed in this project come from my regular reading. If you come across new and interesting papers, I would appreciate it if you let me know!
A Survey on Hallucination in Large Vision-Language Models
Hallucination of Multimodal Large Language Models: A Survey
- VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models (Jun. 24, 2024)
- MOCHa (OpenCHAIR): Multi-Objective Reinforcement Mitigating Caption Hallucinations (Dec. 06, 2023)
- CCEval: HallE-Switch: Controlling Object Hallucination in Large Vision Language Models (Dec. 03, 2023)
- HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination & Visual Illusion in Large Vision-Language Models (Nov. 28, 2023) **Highly recommended**
- HaELM: Evaluation and Analysis of Hallucination in Large Vision-Language Models (Oct. 10, 2023)
- NOPE: Negative Object Presence Evaluation (NOPE) to Measure Object Hallucination in Vision-Language Models (Oct. 9, 2023)
- LRV (GAVIE): Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning (Sep. 29, 2023)
- MMHal-Bench: Aligning Large Multimodal Models with Factually Augmented RLHF (Sep. 25, 2023)
- POPE: Evaluating Object Hallucination in Large Vision-Language Models (EMNLP 2023) (the most widely used object-hallucination benchmark) **Highly recommended**
- CHAIR: Object Hallucination in Image Captioning (EMNLP 2018)
- VHTest: Visual Hallucinations of Multi-modal Large Language Models
- Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models
- PhD: A Prompted Visual Hallucination Evaluation Dataset **Highly recommended**
- THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models
- MetaToken: Detecting Hallucination in Image Descriptions by Meta Classification
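Several of the benchmarks above build on the CHAIR metric (EMNLP 2018), which counts object mentions in a generated caption that do not appear in the image. As a rough illustration only, here is a minimal sketch (the helper name and toy data are mine; the official implementation additionally maps caption words to MSCOCO object categories via synonym lists, which this sketch assumes has already been done):

```python
def chair_scores(captions):
    """Compute CHAIR_i (instance-level) and CHAIR_s (sentence-level).

    captions: list of (mentioned_objects, gt_objects) pairs, one per image,
    where both elements are sets of already-canonicalized object names.
    """
    total_mentions = 0     # all object mentions across captions
    hallucinated = 0       # mentions of objects not present in the image
    captions_with_hal = 0  # captions containing >= 1 hallucinated object

    for mentioned, gt in captions:
        hal = [obj for obj in mentioned if obj not in gt]
        total_mentions += len(mentioned)
        hallucinated += len(hal)
        captions_with_hal += bool(hal)

    chair_i = hallucinated / max(total_mentions, 1)
    chair_s = captions_with_hal / max(len(captions), 1)
    return chair_i, chair_s

# Toy example: the first caption hallucinates a "dog".
data = [({"cat", "dog"}, {"cat", "sofa"}), ({"person"}, {"person", "bike"})]
print(chair_scores(data))  # prints (0.3333333333333333, 0.5)
```

Note that CHAIR only measures object-existence errors, which is exactly the gap that attribute-, relation-, and free-form-oriented benchmarks in this list (e.g. THRONE, Hal-Eval) try to close.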
- LRV-Instruction: Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning, (Liu et al. ICLR2024)
- LURE: Analyzing and Mitigating Object Hallucination in Large Vision-Language Models, (Zhou et al. ICLR2024)
- HallE-Switch: Rethinking and Controlling Object Existence Hallucinations in Large Vision-Language Models for Detailed Caption, (Zhai et al. 2023)
- Woodpecker: Hallucination Correction for Multimodal Large Language Models, (Yin et al.)
- LLaVA-RLHF: Aligning Large Multimodal Models with Factually Augmented RLHF, (Sun et al.)
- Volcano: Mitigating Multimodal Hallucination through Self-Feedback Guided Revision, (Lee et al.)
- HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data, (Yu et al.)
- VCD: Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding, (Leng et al.) **Highly recommended**
- HA-DPO: Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
- Mitigating Hallucination in Visual Language Models with Visual Supervision, (Chen et al.)
- OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation, (Huang et al.) **Highly recommended**
- FOHE: Mitigating Fine-Grained Hallucination by Fine-Tuning Large Vision-Language Models with Caption Rewrites, (Wang et al.)
- RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
- MOCHa: Multi-Objective Reinforcement Mitigating Caption Hallucinations, (Ben-Kish et al.)
- HACL: Hallucination Augmented Contrastive Learning for Multimodal Large Language Model, (Jiang et al.)
- Silkie: Preference Distillation for Large Visual Language Models, (Li et al.)
- HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data (uses "counter-conventional" data to eliminate spurious correlations)
- MARINE: Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance
- CIEM: Contrastive Instruction Evaluation Method for Better Instruction Tuning
- EFUF: Efficient Fine-grained Unlearning Framework for Mitigating Hallucinations in Multimodal Large Language Models
- Seeing is Believing: Mitigating Hallucination in Large Vision-Language Models via CLIP-Guided Decoding
- Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models
- Aligning Modalities in Vision Large Language Models via Preference Fine-tuning
- MOF: Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs **Highly recommended** (a landmark paper that has spawned several follow-up works)
- IBD: Alleviating Hallucinations in Large Vision-Language Models via Image-Biased Decoding
- DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models
- HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective
- Number Hallucinations: Evaluating and Mitigating Number Hallucinations in Large Vision-Language Models: A Consistency Perspective
- The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models?
- Mitigating Dialogue Hallucination for Large Multi-modal Models via Adversarial Instruction Tuning
- Pensieve: Retrospect-then-Compare Mitigates Visual Hallucination
- Multi-Modal Hallucination Control by Visual Information Grounding
- What if...?: Counterfactual Inception to Mitigate Hallucination Effects in Large Multimodal Models
- Exploiting Semantic Reconstruction to Mitigate Hallucinations in Vision-Language Models
- Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding
- H2RSVLM: Towards Helpful and Honest Remote Sensing Large Vision Language Model
- FGAIF: Aligning Large Vision-Language Models with Fine-grained AI Feedback
- Joint Visual and Text Prompting for Improved Object-Centric Perception with Multimodal Large Language Models
- BRAVE: Broadening the visual encoding of vision-language models
- Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs
- LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge
- Prescribing the Right Remedy: Mitigating Hallucinations in Large Vision-Language Models via Targeted Instruction Tuning
- Fact: Teaching MLLMs with Faithful, Concise and Transferable Rationales
- Self-Supervised Visual Preference Alignment
- Exploring the Transferability of Visual Prompting for Multimodal Large Language Models
- VALOR-EVAL: Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models
- RITUAL: Random Image Transformations as a Universal Anti-hallucination Lever in LVLMs
- Mitigating Object Hallucination via Data Augmented Contrastive Tuning *
- Detecting Multimodal Situations with Insufficient Context and Abstaining from Baseless Predictions *
- Automated Multi-level Preference for MLLMs
- Alleviating Hallucinations in Large Vision-Language Models through Hallucination-Induced Optimization
- VDGD: Mitigating LVLM Hallucinations in Cognitive Prompts by Bridging the Visual Perception Gap *
- Think Before You Act: A Two-Stage Framework for Mitigating Gender Bias Towards Vision-Language Tasks
- Don't Miss the Forest for the Trees: Attentional Vision Calibration for Large Vision Language Models
- Calibrated Self-Rewarding Vision Language Models *
- VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation
- Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models https://github.com/HillZhang1999/llm-hallucination-survey
- A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions
- A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models
- A Survey of Hallucination in “Large” Foundation Models
- Contrastive Decoding: Open-ended Text Generation as Optimization
- Alleviating Hallucinations of Large Language Models through Induced Hallucinations
- Trusting Your Evidence: Hallucinate Less with Context-aware Decoding
- SH2: Self-Highlighted Hesitation Helps You Decode More Truthfully
- DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
- Discerning and Resolving Knowledge Conflicts through Adaptive Decoding with Contextual Information-Entropy Constraint
- ROSE Doesn’t Do That: Boosting the Safety of Instruction-Tuned Large Language Models with Reverse Prompt Contrastive Decoding
- Weak-to-Strong Jailbreaking on Large Language Models
- IBD: Alleviating Hallucinations in Large Vision-Language Models via Image-Biased Decoding
- HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding
- ......
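A common thread in the decoding-side papers above (Contrastive Decoding, DoLa, ICD, VCD, IBD, HALC) is contrasting two next-token distributions, one from a "trusted" view of the input and one from a view that amplifies unwanted priors. As a minimal sketch of a single VCD-style step (the `(1 + alpha) * logits_good - alpha * logits_bad` contrast and the beta-scaled adaptive plausibility cutoff follow the VCD paper; the function name and toy logits are illustrative, not from any of these codebases):

```python
import numpy as np

def contrastive_decode_step(logits_good, logits_bad, alpha=1.0, beta=0.1):
    """One greedy step of VCD-style contrastive decoding.

    logits_good: next-token logits conditioned on the real input
                 (e.g. the true image).
    logits_bad:  logits conditioned on a corrupted input (e.g. a noised
                 image), which amplifies language priors / hallucinations.
    alpha: contrast strength; beta: adaptive plausibility cutoff.
    """
    # Contrast: boost tokens the "good" view prefers over the "bad" one.
    contrast = (1 + alpha) * logits_good - alpha * logits_bad

    # Adaptive plausibility constraint: keep only tokens whose probability
    # under the good view is within a beta fraction of the top token's.
    probs_good = np.exp(logits_good - logits_good.max())
    probs_good /= probs_good.sum()
    mask = probs_good >= beta * probs_good.max()
    contrast = np.where(mask, contrast, -np.inf)
    return int(np.argmax(contrast))  # greedy pick among plausible tokens

# Toy vocabulary of 4 tokens; token 1 is slightly favored by the language
# prior alone, so the contrast flips the greedy choice to token 0.
good = np.array([2.0, 2.5, 0.1, -1.0])
bad  = np.array([0.5, 2.6, 0.0, -1.0])
print(contrastive_decode_step(good, bad))  # prints 0
```

The plausibility mask matters: without it, the subtraction can promote low-probability junk tokens that the corrupted view happens to dislike even more.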
I am deeply grateful to two projects that significantly influenced this work: awesome-Large-MultiModal-Hallucination and Awesome-MLLM-Hallucination. The dedication of their contributors, particularly xieyuquanxx and the team at Show Lab, has provided an indispensable resource for researchers and developers alike. The awesome-Large-MultiModal-Hallucination project offers a comprehensive, curated list of resources that shaped my understanding of multimodal hallucination, and the Awesome-MLLM-Hallucination repository is a treasure trove of cutting-edge techniques and methodologies for hallucination in multimodal large language models. By sharing their expertise and compiling these resources, they have not only advanced the field but also fostered a spirit of collaboration and open knowledge. Their work is the foundation on which this list builds and expands, and for that I extend my heartfelt thanks. Thank you for setting a remarkable example for the open-source and scientific communities.