Commit 26f409d

Merge pull request #118 from JetBrains-Research/readme_update: README update

2 parents d5ee4d1 + 62c95d0

File tree

1 file changed: +30 additions, -10 deletions
README.md

Lines changed: 30 additions & 10 deletions
@@ -14,9 +14,31 @@ You can easily install model through the PIP:
 pip install code2seq
 ```
 
-## Usage
+## Dataset mining
 
-Minimal code example to run the model:
+To prepare your own dataset with a storage format supported by this implementation, use one of the following:
+1. The original dataset preprocessing from the vanilla repository
+2. [`astminer`](https://github.com/JetBrains-Research/astminer):
+a tool for mining path-based representations and more, with support for multiple languages.
+3. [`PSIMiner`](https://github.com/JetBrains-Research/psiminer):
+a tool for extracting PSI trees from the IntelliJ Platform and creating datasets from them.
+## Available checkpoints
+
+### Method name prediction
+| Dataset (with link) | Checkpoint | # epochs | F1-score | Precision | Recall | ChrF  |
+|---------------------|------------|----------|----------|-----------|--------|-------|
+| [Java-small](https://s3.eu-west-1.amazonaws.com/datasets.ml.labs.aws.intellij.net/java-paths-methods/java-small.tar.gz) | [link](https://s3.eu-west-1.amazonaws.com/datasets.ml.labs.aws.intellij.net/checkpoints/code2seq_java_small.ckpt) | 11 | 41.49 | 54.26 | 33.59 | 30.21 |
+| [Java-med](https://s3.eu-west-1.amazonaws.com/datasets.ml.labs.aws.intellij.net/java-paths-methods/java-med.tar.gz) | [link](https://s3.eu-west-1.amazonaws.com/datasets.ml.labs.aws.intellij.net/checkpoints/code2seq_java_med.ckpt) | 10 | 48.17 | 58.87 | 40.76 | 42.32 |
+
+## Configuration
+
+The model is fully configurable by a standalone YAML file.
+Navigate to the [config](config) directory to see examples of configs.
+
+## Examples
+
+Model training may be done via the PyTorch Lightning trainer.
+See its [documentation](https://pytorch-lightning.readthedocs.io/en/latest/common/trainer.html) for more information.
 
 
 ```python
 from argparse import ArgumentParser
@@ -29,20 +51,21 @@ from code2seq.model import Code2Seq
 
 
 def train(config: DictConfig):
-    # Load data module
+    # Define data module
     data_module = PathContextDataModule(config.data_folder, config.data)
-    data_module.prepare_data()
-    data_module.setup()
 
-    # Load model
+    # Define model
     model = Code2Seq(
         config.model,
         config.optimizer,
         data_module.vocabulary,
         config.train.teacher_forcing
     )
 
-    trainer = Trainer(max_epochs=config.hyper_parameters.n_epochs)
+    # Define hyper parameters
+    trainer = Trainer(max_epochs=config.train.n_epochs)
+
+    # Train model
     trainer.fit(model, datamodule=data_module)
 
 
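The hunk above moves the epochs key from `config.hyper_parameters.n_epochs` to `config.train.n_epochs`, grouping it with `teacher_forcing`. As a hedged illustration of the nested shape the updated `train()` now expects, here is a minimal sketch: the key paths mirror the attributes accessed in the diff, while all inner fields and values are hypothetical placeholders.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the OmegaConf DictConfig loaded from YAML.
# Key paths match what train() reads after this commit:
#   config.data_folder, config.data, config.model,
#   config.optimizer, config.train.n_epochs, config.train.teacher_forcing
config = SimpleNamespace(
    data_folder="path/to/dataset",                    # placeholder path
    data=SimpleNamespace(batch_size=512),             # hypothetical field
    model=SimpleNamespace(embedding_size=128),        # hypothetical field
    optimizer=SimpleNamespace(lr=0.001),              # hypothetical field
    train=SimpleNamespace(n_epochs=10, teacher_forcing=1.0),
)

# The Trainer in the diff now reads epochs from the train group:
max_epochs = config.train.n_epochs
```

The corresponding YAML would nest `n_epochs` and `teacher_forcing` under a `train` key rather than `hyper_parameters`; see the example configs in the `config` directory referenced by the new README text for the authoritative layout.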

@@ -54,6 +77,3 @@ if __name__ == "__main__":
     __config = OmegaConf.load(__args.config)
     train(__config)
 ```
-
-Navigate to [config](config) directory to see examples of configs.
-If you have any questions, then feel free to open the issue.
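The new "Dataset mining" section points at tools that emit path-based representations. As background, here is a small hedged sketch of splitting one line of the space-and-comma-separated path-context format popularized by the original code2seq preprocessing (a label followed by `start_token,path,end_token` triples); the exact on-disk format this implementation expects should be checked against the repositories linked in the diff, and the example line below is purely illustrative.

```python
def parse_path_context_line(line: str):
    """Split '<label> <ctx> <ctx> ...' where each ctx is a
    comma-separated 'start_token,path,end_token' triple."""
    label, *raw_contexts = line.strip().split(" ")
    contexts = [tuple(ctx.split(",")) for ctx in raw_contexts]
    return label, contexts

# Hypothetical example: label subtokens are conventionally joined by '|'
label, contexts = parse_path_context_line(
    "get|name this,Cls0Nm1,name name,Nm1Ret0,string"
)
```

Here `label` is `"get|name"` and each context is a 3-tuple of start token, path, and end token, which is the unit a `PathContextDataModule` would batch for the model.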

0 commit comments
