Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[examples] fixes arguments for summarization finetune scripts #5157

Conversation

ieBoytsov
Copy link
Contributor

@ieBoytsov ieBoytsov commented Jun 20, 2020

This PR fixes arguments in bash scripts for finetuning bart/bart_tiny/t5 models in summarization examples.

After changes that were merged in #4951 i noticed some small problems. That includes:

  • missing default arguments for finetune.py (--data_dir, output_dir)
  • invalid argument name for number of gpu to use (n_gpu instead of gpus)
  • argument model_type that is not present in list of arguments and crashes the script

@ieBoytsov ieBoytsov changed the title [examples] fix arguments for summarization finetune scripts [examples] fixes arguments for summarization finetune scripts#4951 Jun 20, 2020
@ieBoytsov ieBoytsov changed the title [examples] fixes arguments for summarization finetune scripts#4951 [examples] fixes arguments for summarization finetune scripts Jun 20, 2020
@codecov
Copy link

codecov bot commented Jun 20, 2020

Codecov Report

Merging #5157 into master will increase coverage by 0.00%.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #5157   +/-   ##
=======================================
  Coverage   77.93%   77.94%           
=======================================
  Files         137      137           
  Lines       23475    23475           
=======================================
+ Hits        18295    18297    +2     
+ Misses       5180     5178    -2     
Impacted Files Coverage Δ
src/transformers/file_utils.py 75.00% <0.00%> (-0.41%) ⬇️
src/transformers/modeling_tf_utils.py 85.96% <0.00%> (-0.15%) ⬇️
src/transformers/modeling_openai.py 80.90% <0.00%> (+1.38%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 68e19f1...49b1e42. Read the comment docs.

Copy link
Contributor

@sshleifer sshleifer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Would love to not hardcode data_dir and output_dir, otherwise LGTM!

@@ -11,6 +11,8 @@ export PYTHONPATH="../":"${PYTHONPATH}"

python finetune.py \
--model_name_or_path=facebook/bart-large \
--data_dir=./cnn-dailymail/cnn_dm \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we leave this in the README and add a comment in this file that the proper usage is documented in the README. I would rather not hardcode paths to data in this script, since people are using many different datasets stored in different places.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, please check my 2nd commit, i also dropped cnn prefix from default output dir in finetune.sh. Maybe it makes sense not to hardcode it as well? Will be glad to hear your feedback.

@sshleifer sshleifer self-assigned this Jun 20, 2020
@ieBoytsov ieBoytsov force-pushed the finetune-arguments-fixes-for-summarization branch from 590e32b to 7f0c46d Compare June 20, 2020 16:47
@ieBoytsov ieBoytsov requested a review from sshleifer June 20, 2020 20:13
Copy link
Contributor

@sshleifer sshleifer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need to mention OUTPUT_DIR in finetune.sh. Otherwise, LGTM!

examples/summarization/finetune.sh Outdated Show resolved Hide resolved
@sshleifer sshleifer merged commit bc3a0c0 into huggingface:master Jun 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants