
Fine-tuning bug fix #51

Merged: 8 commits merged into mozilla:main on Jan 24, 2022
Conversation

@eu9ene (Collaborator) commented Jan 14, 2022

@XapaJIaMnu (Contributor) left a comment

As far as I understand this wouldn't affect the behaviour when training crashed and was resumed, correct? That would still continue with loading the optimiser parameters. (TBH I haven't tested if this case even works).

Snakefile (outdated)
@@ -91,14 +92,16 @@ align_dir = f"{data_dir}/alignment"

# models
models_dir = f"{data_root_dir}/models/{src}-{trg}/{experiment}"
teacher_dir = f"{models_dir}/teacher"
teacher_all_dir = f"{models_dir}/teacher-all"
teacher_parallel_dir = f"{models_dir}/teacher-parallel"
@XapaJIaMnu (Contributor):

From reading the source, I don't understand what teacher_parallel_dir should contain. What is a parallel teacher model?

@eu9ene (Collaborator, Author):

Teacher all: the model trained on all available data.
Teacher parallel: an optional model that is fine-tuned on parallel data only, used when the data was augmented with back-translations.

Would it be easier to understand if I renamed them to teacher and teacher-finetuned?

@XapaJIaMnu (Contributor):

Yes, that would be easier to understand, but this is a very minor point.
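
For illustration, a hedged sketch of what the suggested rename could look like in the Snakefile (these exact assignments are hypothetical and not part of this PR):

# Hypothetical rename sketch, following the pattern of the existing assignments above.
teacher_base_dir = f"{models_dir}/teacher"                      # trained on all available data
teacher_finetuned_dir = f"{models_dir}/teacher-finetuned"       # optional fine-tuning on parallel data only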

@eu9ene (Collaborator, Author) commented Jan 18, 2022

> As far as I understand this wouldn't affect the behaviour when training crashed and was resumed, correct? That would still continue with loading the optimiser parameters. (TBH I haven't tested if this case even works.)

This will work because I removed protection from the output file model.npz.best-chrf.npz. It means Snakemake will delete it when the job is stopped or crashes, so training will rerun next time to regenerate this file. Since that happens in the same directory, where model.npz and the optimizer progress files still exist, training will continue from where it left off.

However, this is an irregular situation and not desirable. The pipeline is designed to run end to end without interruptions.
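
A minimal Snakemake sketch of the behaviour described above (rule name, inputs, and script path are hypothetical, not the pipeline's actual rule):

# Hypothetical sketch only; it illustrates the unprotected output described above.
rule finetune_teacher:
    input:
        base=f"{teacher_all_dir}/model.npz.best-chrf.npz",
        corpus=f"{data_dir}/parallel.tsv"                    # hypothetical input name
    output:
        # No protected(...) wrapper: if the job is stopped or crashes, Snakemake
        # removes this file and schedules the rule again on the next run. The
        # intermediate model.npz and optimizer files left in the same directory
        # let training continue instead of starting over.
        model=f"{teacher_parallel_dir}/model.npz.best-chrf.npz"
    shell:
        "bash pipeline/train/finetune.sh {input.base} {input.corpus} {output.model}"  # hypothetical script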

@eu9ene merged commit a4ada6c into mozilla:main on Jan 24, 2022