@@ -14,9 +14,31 @@ You can easily install model through the PIP:
pip install code2seq
```

- ## Usage
+ ## Dataset mining

- Minimal code example to run the model:
+ To prepare your own dataset in a storage format supported by this implementation, use one of the following:
+ 1. Original dataset preprocessing from the vanilla repository
+ 2. [`astminer`](https://github.com/JetBrains-Research/astminer): a tool for mining path-based representations and more, with support for multiple languages.
+ 3. [`PSIMiner`](https://github.com/JetBrains-Research/psiminer): a tool for extracting PSI trees from the IntelliJ Platform and creating datasets from them.
+
+ ## Available checkpoints
+
+ ### Method name prediction
+ | Dataset (with link) | Checkpoint | # epochs | F1-score | Precision | Recall | ChrF |
+ |---|---|---|---|---|---|---|
+ | [Java-small](https://s3.eu-west-1.amazonaws.com/datasets.ml.labs.aws.intellij.net/java-paths-methods/java-small.tar.gz) | [link](https://s3.eu-west-1.amazonaws.com/datasets.ml.labs.aws.intellij.net/checkpoints/code2seq_java_small.ckpt) | 11 | 41.49 | 54.26 | 33.59 | 30.21 |
+ | [Java-med](https://s3.eu-west-1.amazonaws.com/datasets.ml.labs.aws.intellij.net/java-paths-methods/java-med.tar.gz) | [link](https://s3.eu-west-1.amazonaws.com/datasets.ml.labs.aws.intellij.net/checkpoints/code2seq_java_med.ckpt) | 10 | 48.17 | 58.87 | 40.76 | 42.32 |
+
+ ## Configuration
+
+ The model is fully configurable via a standalone YAML file.
+ Navigate to the [config](config) directory to see example configs.
+
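+ A minimal config sketch (the field names below are illustrative, assuming an OmegaConf layout with the `data_folder`, `data`, `model`, `optimizer`, and `train` groups accessed in the training example; consult the real files in [config](config) for the exact schema):
+
+ ```yaml
+ data_folder: data/java-small   # hypothetical dataset location
+ data:
+   batch_size: 512
+ model:
+   encoder_rnn_size: 128
+ optimizer:
+   lr: 0.001
+ train:
+   n_epochs: 10
+   teacher_forcing: 1.0
+ ```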
+ ## Examples
+
+ Model training can be run via the PyTorch Lightning trainer.
+ See its [documentation](https://pytorch-lightning.readthedocs.io/en/latest/common/trainer.html) for more information.

```python
from argparse import ArgumentParser
@@ -29,20 +51,21 @@ from code2seq.model import Code2Seq


def train(config: DictConfig):
-     # Load data module
+     # Define data module
      data_module = PathContextDataModule(config.data_folder, config.data)
-     data_module.prepare_data()
-     data_module.setup()

-     # Load model
+     # Define model
      model = Code2Seq(
          config.model,
          config.optimizer,
          data_module.vocabulary,
          config.train.teacher_forcing
      )

-     trainer = Trainer(max_epochs=config.hyper_parameters.n_epochs)
+     # Define hyperparameters
+     trainer = Trainer(max_epochs=config.train.n_epochs)
+
+     # Train model
      trainer.fit(model, datamodule=data_module)


@@ -54,6 +77,3 @@ if __name__ == "__main__":
    __config = OmegaConf.load(__args.config)
    train(__config)
```
-
- Navigate to [config](config) directory to see examples of configs.
- If you have any questions, then feel free to open the issue.