@@ -20,7 +20,7 @@ python scripts/curate/dataset_ensemble_clone.py
> [!Tip]
>
- > **Output**: `repoqa-{datetime}.json` by adding a `"content"` field (path to content) for each repo.
+ > **Output**: `repoqa-snf-{datetime}.json` by adding a `"content"` field (path to content) for each repo.


### Step 3: Dependency analysis
@@ -45,23 +45,23 @@ python scripts/curate/dep_analysis/{language}.py # python
### Step 4: Merge step 2 and step 3

```shell
- python scripts/curate/merge_dep.py --dataset-path repoqa-{datetime}.json
+ python scripts/curate/merge_dep.py --dataset-path repoqa-snf-{datetime}.json
```

> [!Tip]
>
> **Input**: Download dependency files into `scripts/curate/dep_analysis/data`.
>
- > **Output**: Update `repoqa-{datetime}.json` by adding a `"dependency"` field for each repository.
+ > **Output**: Update `repoqa-snf-{datetime}.json` by adding a `"dependency"` field for each repository.


### Step 5: Function collection with TreeSitter

```shell
# collect functions (in-place)
- python scripts/curate/function_analysis.py --dataset-path repoqa-{datetime}.json
+ python scripts/curate/function_analysis.py --dataset-path repoqa-snf-{datetime}.json
# select needles (in-place)
- python scripts/curate/needle_selection.py --dataset-path repoqa-{datetime}.json
+ python scripts/curate/needle_selection.py --dataset-path repoqa-snf-{datetime}.json
```

> [!Tip]
@@ -72,7 +72,7 @@ python scripts/curate/needle_selection.py --dataset-path repoqa-{datetime}.json
### Step 6: Annotate each function with description to make a final dataset

```shell
- python scripts/curate/needle_annotation.py --dataset-path repoqa-{datetime}.json
+ python scripts/curate/needle_annotation.py --dataset-path repoqa-snf-{datetime}.json
```

> [!Tip]
@@ -85,7 +85,7 @@ python scripts/curate/needle_annotation.py --dataset-path repoqa-{datetime}.json
### Step 7: Merge needle description to the final dataset

```shell
- python scripts/curate/merge_annotation.py --dataset-path repoqa-{datetime}.json --annotation-path {output-desc-path}.jsonl
+ python scripts/curate/merge_annotation.py --dataset-path repoqa-snf-{datetime}.json --annotation-path {output-desc-path}.jsonl
```

> [!Tip]