Elahe Khatibi*, Ziyu Wang*, Amir M. Rahmani
University of California, Irvine
📄 arXiv:2504.12560
* Equal contribution
Retrieval-Augmented Generation (RAG) has significantly enhanced large language models (LLMs) in knowledge-intensive tasks by incorporating external knowledge retrieval. However, existing RAG frameworks primarily rely on semantic similarity and correlation-driven retrieval, limiting their ability to distinguish true causal relationships from spurious associations. This results in responses that may be factually grounded but fail to establish cause-and-effect mechanisms, leading to incomplete or misleading insights. To address this issue, we introduce Causal Dynamic Feedback for Adaptive Retrieval-Augmented Generation (CDF-RAG), a framework designed to improve causal consistency, factual accuracy, and explainability in generative reasoning. CDF-RAG iteratively refines queries, retrieves structured causal graphs, and enables multi-hop causal reasoning across interconnected knowledge sources. Additionally, it validates responses against causal pathways, ensuring logically coherent and factually grounded outputs. We evaluate CDF-RAG on four diverse datasets, demonstrating its ability to improve response accuracy and causal correctness over existing RAG-based methods.
CDF-RAG/
├── agents/ # Core modules (retrieval, rewriting, RL, hallucination detection)
├── data/ # Dataset processing scripts and Jupyter notebooks
├── eval/ # Evaluation metrics and scoring scripts
├── finetune/ # Scripts for fine-tuning LLaMA, Mistral, FLAN-T5
├── main/ # Pipeline runner and ablation experiments
├── figs/ # Key figures from paper (use .png for rendering)
├── config.py # Centralized configuration & env variables
├── requirements.txt # Python dependencies
└── README.md # You're here
git clone https://github.com/elakhatibi/CDF-RAG.git
cd CDF-RAG
pip install -r requirements.txt
Create a .env
file or export these:
OPENAI_API_KEY=your-openai-key
PINECONE_API_KEY=your-pinecone-key
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your-password
The CDF-RAG framework is composed of:
- RL-based query refinement (PPO)
- Dual-path document retrieval using:
- Semantic similarity from Pinecone
- Causal graphs from Neo4j
- Knowledge rewriting via GPT
- Hallucination detection & correction using LLM verification
We report the main ablation results of CDF-RAG below. For full quantitative results across 4 datasets and 4 LLMs, refer to the paper.
Ablation Stage | CRC | CCD | SRS | Groundedness | HR | F1 |
---|---|---|---|---|---|---|
Baseline RAG | 0.74 | 1.50 | 0.55 | 0.52 | 0.18 | 0.68 |
+ RL-based Query Refinement | 0.80 | 1.70 | 0.62 | 0.59 | 0.14 | 0.74 |
+ Causal Graph | 0.84 | 1.92 | 0.65 | 0.63 | 0.12 | 0.78 |
+ Rewriter | 0.88 | 2.00 | 0.70 | 0.68 | 0.08 | 0.82 |
+ Hallucination Correction (Ours) | 0.89 | 2.02 | 0.74 | 0.71 | 0.07 | 0.86 |
CRC
: Causal Retrieval CoverageCCD
: Causal Chain DepthSRS
: Semantic Refinement ScoreGroundedness
: Final answer ↔ retrieved docsHR
: Hallucination RateF1
,Precision
,Recall
: (if hallucination labels available)
python main/cdf_rag_pipeline.py "Why do patients miss appointments?"
python main/evaluate_batch_queries.py
python eval/evaluate_metrics.py
python main/Ablation_study.py
To fine-tune LLaMA, Mistral, or FLAN-T5:
cd finetune/
python finetune_mistral_qlora.py # or flan/llama variant
All models are instruction-tuned using the same format (instruction/input/output) on causal QA refinement data.
@misc{khatibi2025cdfrag,
title={CDF-RAG: Causal Dynamic Feedback for Adaptive Retrieval-Augmented Generation},
author={Elahe Khatibi and Ziyu Wang and Amir M. Rahmani},
year={2025},
eprint={2504.12560},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
For questions or collaborations, feel free to reach out:
- 💬 Elahe Khatibi — ekhatibi@uci.edu
- 💬 Ziyu Wang — ziyuw31@uci.edu
This work was supported by the HealthSciTech group at UC Irvine. We thank the LangGraph and open-source LLM communities for inspiration and tooling.