
RepoQA: Evaluating Long-Context Code Understanding

🚀 Installation • 🏁 Search Needle Function • 📚 Read More

🚀 Installation

# without vLLM (can run openai, anthropic, and huggingface backends)
pip install --upgrade repoqa
# with vLLM
pip install --upgrade "repoqa[vllm]"
⏬ Install nightly version
pip install --upgrade "git+https://github.com/evalplus/repoqa.git"                 # without vLLM
pip install --upgrade "repoqa[vllm] @ git+https://github.com/evalplus/repoqa@main" # with vLLM
⏬ Using RepoQA as a local repo?
git clone https://github.com/evalplus/repoqa.git
cd repoqa
export PYTHONPATH=$PYTHONPATH:$(pwd)
pip install -r requirements.txt
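With the repo on your PYTHONPATH rather than installed, the repoqa.* commands are not registered; a minimal sketch, assuming each entry point maps to a module of the same name that can be run directly:

python -m repoqa.search_needle_function --model "Qwen/CodeQwen1.5-7B-Chat" \
                                        --caching --backend vllm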

🏁 Search Needle Function

Inference with OpenAI-Compatible Servers

repoqa.search_needle_function --model "gpt4-turbo" --caching --backend openai
# 💡 If you use customized server such vLLM:
# repoqa.search_needle_function --base-url "http://url.to.vllm.server/v1" \
#                               --model "gpt4-turbo" --caching --backend openai
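The hosted OpenAI API reads its key from the environment (the official openai client checks OPENAI_API_KEY by default). For a self-hosted endpoint, one way to bring one up is vLLM's OpenAI-compatible server; the model, port, and URL below are illustrative:

export OPENAI_API_KEY="..."   # hosted API only; the openai client reads this by default
python -m vllm.entrypoints.openai.api_server --model "Qwen/CodeQwen1.5-7B-Chat" --port 8000
repoqa.search_needle_function --base-url "http://localhost:8000/v1" \
                              --model "Qwen/CodeQwen1.5-7B-Chat" --caching --backend openai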

Inference with Anthropic-Compatible Servers

repoqa.search_needle_function --model "claude-3-haiku-20240307" --caching --backend anthropic

Inference with vLLM

repoqa.search_needle_function --model "Qwen/CodeQwen1.5-7B-Chat" \
                              --caching --backend vllm
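For models too large for a single GPU, the --tensor-parallel-size flag (documented under Usage below) shards the model across devices:

repoqa.search_needle_function --model "Qwen/CodeQwen1.5-7B-Chat" \
                              --caching --backend vllm --tensor-parallel-size 2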

Inference with Hugging Face transformers

repoqa.search_needle_function --model "gpt2" "Qwen/CodeQwen1.5-7B-Chat" \
                              --caching --backend hf --trust-remote-code

Usage

Tip

  • Input (a combined example follows this list):
    • --model: Hugging Face model ID, such as ise-uiuc/Magicoder-S-DS-6.7B
    • --backend: vllm (default), openai, anthropic, or hf
    • --base-url: OpenAI API base URL (for OpenAI-compatible servers)
    • --code-context-size (default: 16384): Number of code tokens (counted with the DeepSeekCoder tokenizer) in the long context
    • --caching (default: False): if enabled, tokenization and chunking results are cached to accelerate subsequent runs
    • --max-new-tokens (default: 1024): Maximum number of new tokens to generate
    • --system-message (default: None): if given, the model is prompted with a system message (note that some models do not support system messages)
    • --tensor-parallel-size: Degree of tensor parallelism (vLLM only)
    • --languages (default: None): List of languages to evaluate (None means all)
    • --result-dir (default: "results"): Directory to save the model outputs and evaluation results
  • Output:
    • results/ntoken_{code-context-size}/{model}.jsonl: Model generated outputs
    • results/ntoken_{code-context-size}/{model}-SCORES.json: Evaluation scores (also see Compute Scores)
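Putting the flags together, a run restricted to one language with a larger context might look like this (model and values are illustrative; exact list-flag syntax follows the CLI's parser):

repoqa.search_needle_function --model "Qwen/CodeQwen1.5-7B-Chat" \
                              --backend vllm --caching \
                              --code-context-size 32768 \
                              --languages python
# outputs land in results/ntoken_32768/ per the layout above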

Compute Scores

By default, the repoqa.search_needle_function command will also compute scores after producing model outputs. However, you can also compute scores separately using the following command:

repoqa.compute_score --model-output-path={model-output}.jsonl

Tip

  • Input: Path to the model-generated outputs (.jsonl).
  • Output: The evaluation scores are stored in {model-output}-SCORES.json
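To score every output file under a result directory in one pass (the path pattern follows the default layout above):

for f in results/ntoken_16384/*.jsonl; do
  repoqa.compute_score --model-output-path "$f"
done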

📚 Read More
