## 🚀 Installation

```bash
# without vLLM (can run openai, anthropic, and huggingface backends)
pip install --upgrade repoqa
# with vLLM
pip install --upgrade "repoqa[vllm]"
```

<details><summary>⏬ Install nightly version <i>:: click to expand ::</i></summary>
<div>

```bash
pip install --upgrade "git+https://github.com/evalplus/repoqa.git" # without vLLM
pip install --upgrade "repoqa[vllm] @ git+https://github.com/evalplus/repoqa@main" # with vLLM
```

</div>
</details>

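To quickly sanity-check the install (a minimal check that only verifies the package imports; the actual CLI entry points are shown in the next section):

```bash
# Should exit silently if repoqa is installed correctly.
python -c "import repoqa"
```
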
## 🏁 Search Needle Function

### Inference with OpenAI Compatible Servers

```bash
repoqa.search_needle_function --model "gpt-4-turbo" --caching --backend openai
# 💡 If you use a customized server such as vLLM:
# repoqa.search_needle_function --base-url "http://url.to.vllm.server/v1" \
#                               --model "gpt-4-turbo" --caching --backend openai
```

### Inference with Anthropic Compatible Servers

```bash
repoqa.search_needle_function --model "claude-3-haiku-20240307" --caching --backend anthropic
```

### Inference with vLLM

```bash
repoqa.search_needle_function --model "Qwen/CodeQwen1.5-7B-Chat" \
                              --caching --backend vllm
```

### Inference with HuggingFace transformers

```bash
# e.g., --model "gpt2" or --model "Qwen/CodeQwen1.5-7B-Chat"
repoqa.search_needle_function --model "Qwen/CodeQwen1.5-7B-Chat" \
                              --caching --backend hf --trust-remote-code
```
### Usage

> [!Tip]
>
> - **Input** (see the combined example below):
>   - `--model`: Hugging Face model ID, such as `ise-uiuc/Magicoder-S-DS-6.7B`
>   - `--backend`: `vllm` (default), `openai`, `anthropic`, or `hf`
>   - `--base-url`: OpenAI API base URL
>   - `--code-context-size` (default: 16384): Number of tokens (using DeepSeekCoder tokenizer) of code in the long context
>   - `--caching` (default: False): if enabled, the tokenization and chunking results will be cached to accelerate subsequent runs
>   - `--max-new-tokens` (default: 1024): Maximum number of new tokens to generate
>   - `--system-message` (default: None): if given, the model uses a system message (but note some models don't support system messages)
>   - `--tensor-parallel-size`: Degree of tensor parallelism (only for vLLM)
>   - `--languages` (default: None): List of languages to evaluate (None means all)
>   - `--result-dir` (default: "results"): Directory to save the model outputs and evaluation results
> - **Output**:
>   - `results/ntoken_{code-context-size}/{model}.jsonl`: Model-generated outputs
>   - `results/ntoken_{code-context-size}/{model}-SCORE.json`: Evaluation scores (also see [Compute Scores](#compute-scores))
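
Putting several of these flags together, a typical vLLM run might look like the following sketch (the model, parallelism degree, and result directory are illustrative placeholders):

```bash
# Illustrative invocation combining the flags documented above.
repoqa.search_needle_function --model "Qwen/CodeQwen1.5-7B-Chat" \
    --backend vllm --tensor-parallel-size 2 \
    --code-context-size 16384 --max-new-tokens 1024 \
    --caching --result-dir results
# Outputs are written under results/ntoken_16384/ as described above.
```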
### Compute Scores

```bash
repoqa.compute_score --model-output-path={model-output}.jsonl
```

> [!Tip]
>
> - **Input**: Path to the model-generated outputs.
> - **Output**: The evaluation scores are stored in `{model-output}-SCORES.json`
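
For instance, after a run with the default 16K code context, scoring the `gpt2` outputs could look like this (a sketch; the exact filename under `results/ntoken_16384/` depends on how the model name is rendered):

```bash
# Hypothetical path following the results/ntoken_{code-context-size}/{model}.jsonl layout.
repoqa.compute_score --model-output-path=results/ntoken_16384/gpt2.jsonl
```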
## 📚 Read More

- [RepoQA Homepage](https://evalplus.github.io/repoqa.html)
- [RepoQA Dataset Curation](docs/curate_dataset.md)
- [RepoQA Development Notes](docs/dev_note.md)