EXIT: Context-Aware Extractive Compression for RAG 🚀

Official implementation of "EXIT: Context-Aware Extractive Compression for Enhancing Retrieval-Augmented Generation"

Overview 📋

EXIT is a context-aware extractive compression framework that improves both the effectiveness and efficiency of Retrieval-Augmented Generation (RAG) by:

🎯 Preserving critical information while reducing context size
🔍 Considering full document context when evaluating sentence importance
⚡ Enabling parallelizable, context-aware extraction
🎚️ Adapting dynamically to query complexity
⚖️ Balancing compression ratio and answer accuracy

Installation 💻

# Clone the repository
git clone https://github.com/ThisIsHwang/EXIT.git
cd EXIT

# Create a new conda environment
conda create -n exit python=3.8
conda activate exit

# Install dependencies
pip install -r requirements.txt

# Download spaCy model
python -m spacy download en_core_web_sm

Quickstart 🚀

Here's a simple example demonstrating the EXIT RAG pipeline:

from exit_rag import ExitRAG, Document

# Initialize pipeline
rag = ExitRAG(
    retriever_model="google/gemma-2b-it",
    compression_model="doubleyyh/exit-gemma-2b",
    reader_model="meta-llama/Llama-3.1-8B-Instruct"
)

# Example query and document
query = "How do solid-state drives (SSDs) improve computer performance?"
documents = [Document(
    title="Computer Storage Technologies",
    text="""
    Solid-state drives use flash memory to store data without moving parts.
    Unlike traditional hard drives, SSDs have no mechanical components.
    The absence of physical movement allows for much faster data access speeds.
    I bought my computer last week.
    SSDs significantly reduce boot times and application loading speeds.
    They consume less power and are more reliable than mechanical drives.
    The price of SSDs has decreased significantly in recent years.
    """
)]

# Run RAG pipeline with compression
result = rag.run_rag(query, documents)

# Print results
print("\nQuery:", result["query"])
print("\nCompressed Context:", result["compressed_context"])
print("\nAnswer:", result["answer"])
print(f"\nGeneration Time: {result['generation_time']:.2f}s")

Data Preparation 📚

Download Datasets

You can download the evaluation datasets (NaturalQuestions, TriviaQA, HotpotQA, 2WikiMultiHopQA) from the CompAct repository.

Dataset Structure

Each dataset follows the format:

{
    "question": "How do solid-state drives improve computer performance?",
    "ctxs": [
        {
            "title": "Document Title",
            "text": "Document content...",
            "score": 1.0
        },
        ...
    ]
}
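
As a rough illustration of how records in this format might be read, the sketch below checks the fields shown above and returns the question with its contexts. `parse_example` and `load_examples` are hypothetical helpers, not part of the EXIT codebase, and the assumption that a dataset file is a single JSON array of such records may not match the files you download.

```python
import json
from typing import Any, Dict, List, Tuple

# Keys every entry of "ctxs" is expected to carry, per the format above.
REQUIRED_CTX_KEYS = ("title", "text", "score")


def parse_example(record: Dict[str, Any]) -> Tuple[str, List[Dict[str, Any]]]:
    """Return (question, contexts) after checking the expected fields."""
    if "question" not in record or "ctxs" not in record:
        raise ValueError("record needs 'question' and 'ctxs' keys")
    contexts = []
    for i, ctx in enumerate(record["ctxs"]):
        missing = [key for key in REQUIRED_CTX_KEYS if key not in ctx]
        if missing:
            raise ValueError(f"ctxs[{i}] is missing {missing}")
        contexts.append(
            {"title": ctx["title"], "text": ctx["text"], "score": float(ctx["score"])}
        )
    return record["question"], contexts


def load_examples(path: str) -> List[Tuple[str, List[Dict[str, Any]]]]:
    """Load a file of records. Assumption: the file is one JSON array."""
    with open(path, encoding="utf-8") as handle:
        records = json.load(handle)
    return [parse_example(record) for record in records]
```

Each returned context keeps its `title` and `text`, so it can be turned into a `Document(title=..., text=...)` as in the Quickstart.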

Model Details 🔧

Base Model: Gemma-2b-it
Training Method: PEFT/LoRA
Training Data: HotpotQA dataset with:
- Positive examples: Sentences marked as supporting facts
- Hard negatives: Sentences from the same documents that are not supporting facts
- Random negatives: Sentences from unrelated documents
Recommended Parameters:
- Compression threshold (tau): 0.5
- Cache directory: Configurable via initialization
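
The three kinds of training examples listed above can be sketched as follows. This is not the repository's training code: `build_examples` is a hypothetical helper, the record layout (`supporting_facts` as `[title, sentence_index]` pairs and `context` as `[title, [sentences]]` pairs) assumes the original HotpotQA JSON release, and the pool of unrelated sentences is assumed to be supplied by the caller.

```python
import random
from typing import Any, Dict, List, Sequence


def build_examples(
    record: Dict[str, Any],
    unrelated_sentences: Sequence[str],
    num_random: int = 1,
    seed: int = 0,
) -> List[Dict[str, Any]]:
    """Label sentences of one HotpotQA-style record for classifier training."""
    rng = random.Random(seed)
    supporting = {(title, idx) for title, idx in record["supporting_facts"]}
    gold_titles = {title for title, _ in supporting}
    examples: List[Dict[str, Any]] = []

    for title, sentences in record["context"]:
        if title not in gold_titles:
            continue  # hard negatives come only from documents with a supporting fact
        for idx, sentence in enumerate(sentences):
            is_positive = (title, idx) in supporting
            examples.append({
                "query": record["question"],
                "sentence": sentence,
                "label": 1 if is_positive else 0,
                "kind": "positive" if is_positive else "hard_negative",
            })

    # Random negatives: sentences drawn from documents unrelated to the query.
    k = min(num_random, len(unrelated_sentences))
    for sentence in rng.sample(list(unrelated_sentences), k):
        examples.append({
            "query": record["question"],
            "sentence": sentence,
            "label": 0,
            "kind": "random_negative",
        })
    return examples
```

The ratio of positives to negatives and the way unrelated documents are chosen are left open here, since the README does not specify them.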

Key Features 🌟

Document Compression

compressed_text, selections, scores = rag.compress_documents(
    query=query,
    documents=documents,
    threshold=0.5  # Adjustable compression threshold
)
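
To make the role of `threshold` concrete, here is a minimal sketch of threshold-based sentence selection on precomputed scores. It does not call EXIT: `select_sentences` is a hypothetical helper, the scores are made-up illustrative values, and treating the boundary as inclusive (`score >= tau`) is an assumption about how the comparison is done.

```python
from typing import List, Sequence, Tuple


def select_sentences(
    sentences: Sequence[str],
    scores: Sequence[float],
    tau: float = 0.5,
) -> Tuple[str, List[bool]]:
    """Keep sentences whose score reaches tau, preserving document order."""
    if len(sentences) != len(scores):
        raise ValueError("sentences and scores must have the same length")
    selections = [score >= tau for score in scores]
    kept = [sentence for sentence, keep in zip(sentences, selections) if keep]
    return " ".join(kept), selections


sentences = [
    "The absence of physical movement allows for much faster data access speeds.",
    "I bought my computer last week.",
    "SSDs significantly reduce boot times and application loading speeds.",
]
scores = [0.91, 0.03, 0.88]  # made-up values, not model output

compressed, selections = select_sentences(sentences, scores, tau=0.5)
print(selections)  # [True, False, True]
```

Raising `tau` keeps fewer sentences (more compression), while lowering it keeps more context at the cost of a longer prompt.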

Answer Generation

answer, generation_time = rag.generate_answer(
    query=query,
    context=compressed_text
)

Complete RAG Pipeline

result = rag.run_rag(
    query=query,
    documents=documents,
    compression_threshold=0.5
)

Performance 📊

According to the accompanying paper, EXIT improves on existing compression methods in:

- Token count reduction
- Answer accuracy preservation
- End-to-end latency reduction
- Multi-hop question handling
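
If you want a quick check of token count reduction on your own inputs, a sketch like the following may help. `token_reduction` is a hypothetical helper, and counting whitespace-delimited words is only a rough proxy for a model tokenizer; it is not how the paper's numbers were measured.

```python
def token_reduction(original: str, compressed: str) -> float:
    """Fraction of whitespace-delimited tokens removed by compression."""
    original_tokens = len(original.split())
    if original_tokens == 0:
        return 0.0  # nothing to compress
    compressed_tokens = len(compressed.split())
    return 1.0 - compressed_tokens / original_tokens


print(token_reduction("a b c d", "a b"))  # 0.5
```

For closer estimates, the word count could be swapped for the reader model's own tokenizer.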

Limitations ⚠️

- Currently optimized for English text only
- No support for cross-lingual compression
- Requires GPU for optimal performance

Citation 📚

If you use EXIT in your research, please cite our paper:

@article{hwang2024exit,
  title={EXIT: Context-Aware Extractive Compression for Enhancing Retrieval-Augmented Generation},
  author={Hwang, Taeho and Cho, Sukmin and Jeong, Soyeong and Song, Hoyun and Han, SeungYoon and Park, Jong C.},
  journal={arXiv preprint arXiv:2412.12559},
  year={2024}
}

License 📄

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Contact 📧

For questions or issues:

- Open an issue in this repository
- Contact: doubleyyh@kaist.ac.kr
