ValueError while using --optimize_on_cpu #23
Comments
Thanks! I pushed a fix for that, you can try it again. You should be able to increase the batch size a bit. By the way, the real batch size that is used on the GPU is . The recommended batch_size to get good results (EM, F1) with BERT-large on SQuAD is
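The comment above relates the command-line batch size to the batch size actually placed on the GPU. A minimal sketch of that relationship, assuming the per-step GPU batch is the command-line `train_batch_size` divided by `gradient_accumulation_steps` (the function name here is illustrative, not taken from `run_squad.py`):

```python
def per_step_batch_size(train_batch_size, gradient_accumulation_steps):
    """Batch size actually sent to the GPU per forward/backward pass,
    under the assumption stated above."""
    if train_batch_size % gradient_accumulation_steps != 0:
        raise ValueError("train_batch_size must be divisible by "
                         "gradient_accumulation_steps")
    return train_batch_size // gradient_accumulation_steps

# With the flags reported in this issue: 4 examples per optimizer step,
# split into 2 accumulation steps of 2 examples each on the GPU.
print(per_step_batch_size(4, 2))  # -> 2
```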
If your GPU supports fp16, the last solution should be the fastest; otherwise the second should be the fastest. The first solution should work out of the box and give better results (EM, F1), but you won't get any speed-up.
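The `--optimize_on_cpu` option discussed here keeps the optimizer state and a master copy of the parameters on the CPU while forward/backward runs on the GPU. A framework-free sketch of that data flow, with plain lists standing in for tensors (the real code uses PyTorch tensors; `sgd_step_on_cpu` is an illustrative name, not an API from this repo):

```python
def sgd_step_on_cpu(cpu_params, device_grads, lr):
    """Update the CPU master copy of the parameters from gradients
    that were produced on the device."""
    for i, g in enumerate(device_grads):
        cpu_params[i] -= lr * g
    # In the real implementation the updated values would now be
    # copied back to the device parameters for the next forward pass.
    return list(cpu_params)

params = [1.0, -2.0]
grads = [0.5, -0.5]   # pretend these came from backward() on the GPU
print(sgd_step_on_cpu(params, grads, lr=0.1))  # -> [0.95, -1.95]
```

The memory saving comes from the optimizer state (e.g. Adam moments) living in CPU RAM instead of GPU memory, at the cost of a host/device copy per step.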
Should be fixed now. Don't hesitate to re-open an issue if needed. Thanks for the feedback!
Yes, it works now! With
I get {"exact_match": 83.78429517502366, "f1": 90.75733469379139}, which is pretty close. Thanks for this amazing work!
Command:
```shell
CUDA_VISIBLE_DEVICES=0 python ./run_squad.py \
  --vocab_file bert_large/uncased_L-24_H-1024_A-16/vocab.txt \
  --bert_config_file bert_large/uncased_L-24_H-1024_A-16/bert_config.json \
  --init_checkpoint bert_large/uncased_L-24_H-1024_A-16/pytorch_model.bin \
  --do_lower_case \
  --do_train \
  --do_predict \
  --train_file squad_dir/train-v1.1.json \
  --predict_file squad_dir/dev-v1.1.json \
  --learning_rate 3e-5 \
  --num_train_epochs 2 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --output_dir outputs \
  --train_batch_size 4 \
  --gradient_accumulation_steps 2 \
  --optimize_on_cpu
```
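The command passes `--gradient_accumulation_steps 2`, i.e. gradients from two micro-batches are accumulated before each optimizer step. A pure-Python sketch of that loop structure, assuming the usual pattern of scaling the loss and stepping every N micro-batches (`training_steps` is an illustrative name, not a function from `run_squad.py`):

```python
def training_steps(num_batches, accumulation_steps):
    """Count optimizer steps taken over num_batches micro-batches when
    gradients are accumulated for accumulation_steps micro-batches."""
    steps = 0
    for batch_idx in range(1, num_batches + 1):
        # loss = loss / accumulation_steps   # scale before backward()
        if batch_idx % accumulation_steps == 0:
            steps += 1                       # optimizer.step(); zero_grad()
    return steps

print(training_steps(10, 2))  # -> 5 optimizer steps for 10 micro-batches
```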
I get a ValueError while using --optimize_on_cpu only; it works fine without the argument.
GPU: a single Nvidia GTX 1080 Ti.
PS: I can only fit train_batch_size 4 in the memory of a single GPU.