Commit c0de43d

Merge pull request #275 from guoday/patch-8
Update README.md
2 parents 66ea3e1 + 4d2599e commit c0de43d

1 file changed, +10 -3 lines changed

LongCoder/README.md

Lines changed: 10 additions & 3 deletions
@@ -7,8 +7,15 @@ This repo will provide the code for reproducing the experiments on LCC datasets
 - pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
 - pip install --upgrade transformers fuzzywuzzy tree_sitter datasets
 
-## 2. Fine-Tune Setting
-Here we provide fine-tune settings for code completion on LCC datasets in C# programming language, whose results are reported in the paper.
+## 2. Dataset
+In this repo, the LCC dataset will be automatically downloaded when running the fine-tuning script. If you want to download LCC datasets by yourself, you can find them in the following links:
+```
+https://huggingface.co/datasets/microsoft/LCC_python
+https://huggingface.co/datasets/microsoft/LCC_java
+https://huggingface.co/datasets/microsoft/LCC_csharp
+```
+## 3. Fine-Tune Setting
+Here we provide fine-tune settings for code completion on LCC datasets in C# programming language, whose results are reported in the paper.
 
 Note that it requires 8 v100-32G GPUs, and you can adjust batch size or source length based on your requirements.
 
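Note on the new Dataset section: the manual download it mentions could plausibly be done with the `datasets` package that the install step already pulls in. The sketch below is not part of this commit. It assumes the Hub dataset ID is the path component of each listed URL (for example `microsoft/LCC_csharp`); the helper names `lcc_dataset_id` and `load_lcc` are hypothetical and chosen only for illustration.

```python
# Languages taken from the three URLs listed in the Dataset section.
LCC_LANGUAGES = ("python", "java", "csharp")


def lcc_dataset_id(lang: str) -> str:
    """Build the Hub dataset ID for one LCC language subset.

    Assumption: the ID equals the path part of the URL in the README,
    i.e. https://huggingface.co/datasets/microsoft/LCC_<lang>.
    """
    if lang not in LCC_LANGUAGES:
        raise ValueError(
            f"unsupported language {lang!r}; expected one of {LCC_LANGUAGES}"
        )
    return f"microsoft/LCC_{lang}"


def load_lcc(lang: str):
    """Download (or reuse the local cache of) one LCC subset with all its splits."""
    # Imported lazily so lcc_dataset_id() works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(lcc_dataset_id(lang))


# Example usage (requires network access and the `datasets` package):
#   dataset = load_lcc("csharp")
#   print(dataset)  # lists whatever splits and columns the dataset provides
```

No split or column names are hard-coded, since the diff does not state them; printing the returned object is the safest way to inspect what is actually available.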
@@ -43,7 +50,7 @@ python run.py \
 --num_train_epochs $epochs 2>&1| tee $output_dir/train.log
 ```
 
-## 3. Evaluating LongCoder
+## 4. Evaluating LongCoder
 
 ```shell
 lang=csharp #csharp, python, java
