# CERT: Continual Pre-training on Sketches for Library-oriented Code Generation
CERT's source code and our crafted evaluation benchmarks.
## Installation
### Installing the benchmarks

```
$ unzip human-eval.zip
$ pip install -e human-eval
```
### Installing the CERT runtime environment

```
$ pip install -r requirements.txt
```
## Usage
### Encoding the cleaned code corpus

- Convert each code file into multiple code blocks.
- Convert each code block into a code sketch.
- Tokenize the code and convert the text into a binary file.

```
$ bash scripts/run_encode_domain.sh
```
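The sketching step above can be illustrated with a toy example. The snippet below is a simplified, hypothetical sketcher, not the repo's actual encoding rules: it anonymizes string and number literals with placeholder tokens while keeping names and structure, preserving the library-usage skeleton that CERT pre-trains on.

```python
import io
import tokenize

# Hypothetical illustration of the "code block -> code sketch" step:
# replace user-written string and number literals with placeholders,
# leaving the API-call structure intact.
def to_sketch(code: str) -> str:
    sketch_tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(code).readline):
        if tok.type == tokenize.STRING:
            sketch_tokens.append((tokenize.STRING, '"<str>"'))
        elif tok.type == tokenize.NUMBER:
            sketch_tokens.append((tokenize.NUMBER, "<num>"))
        else:
            sketch_tokens.append((tok.type, tok.string))
    return tokenize.untokenize(sketch_tokens)

print(to_sketch("df = pd.DataFrame({'a': [1, 2]})"))
```

The real pipeline in `scripts/run_encode_domain.sh` additionally splits files into blocks and serializes the tokenized sketches to a binary file for training.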
### Training CERT
```
$ bash run_cert.sh
```
### Evaluating CERT
Our crafted PandasEval and NumpyEval benchmarks are located in `human-eval/data`.

```
$ bash run_eval_monitor.sh
```
Assign the output file path from the previous step to the `POST_PATH` variable in `run_eval_monitor_step2.sh`.

```
$ bash run_eval_monitor_step2.sh
```
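For reference, the human-eval harness these scripts build on scores completions with the unbiased pass@k estimator from the Codex paper; a minimal sketch of that formula:

```python
from math import comb

# Unbiased pass@k estimator: with n generated samples per problem,
# of which c pass the unit tests,
#   pass@k = 1 - C(n - c, k) / C(n, k).
def pass_at_k(n: int, c: int, k: int) -> float:
    if n - c < k:
        return 1.0  # every size-k subset contains at least one correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

print(pass_at_k(n=10, c=3, k=1))   # equals c/n when k = 1
print(pass_at_k(n=10, c=3, k=10))  # 1.0: drawing all samples always hits a correct one
```

The evaluation scripts report these scores per benchmark (PandasEval and NumpyEval).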
## Citation
Please cite CERT using the following BibTeX entry:

```
@inproceedings{CERT,
title={{CERT}: Continual Pre-training on Sketches for Library-oriented Code Generation},
author={Zan, Daoguang and Chen, Bei and Yang, Dejian and Lin, Zeqi and Kim, Minsu and Guan, Bei and Wang, Yongji and Chen, Weizhu and Lou, Jian-Guang},
booktitle={The 2022 International Joint Conference on Artificial Intelligence},
year={2022}
}
```