Description
System Info
- `transformers` version: 4.30.1
- Platform: Linux-5.7.19-050719-generic-x86_64-with-glibc2.29
- Python version: 3.8.10
- Huggingface_hub version: 0.15.1
- Safetensors version: 0.3.1
- PyTorch version (GPU?): 2.0.1+cu117 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: yes
- Using distributed or parallel set-up in script?: no
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
I don't have a full example that I can share, but I think this is a simple enough problem that one may not be needed.
I am using `TrainingArguments(auto_find_batch_size=True, eval_steps=0.1, per_device_train_batch_size=1024)`. With a batch size of 1024, I have 657 total steps. The eval ratio appears to be resolved against this count, so evaluation happens every 66 steps.
However, `auto_find_batch_size` lowers the batch size to 16, giving a corresponding 83787 total steps. Evaluation is still performed every 66 steps.
Expected behavior
I expected `eval_steps` to be recomputed when the batch size was adjusted. In the example above, I expected evaluation to occur every ~8000 steps.
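The arithmetic behind the expected behavior can be sketched as follows. This is a hypothetical helper, not the Trainer's actual code; the `math.ceil` rounding is an assumption chosen to match the 66-step interval reported above. The point is that the fractional `eval_steps` should be re-resolved against the new total step count after `auto_find_batch_size` changes the batch size:

```python
import math

def resolve_eval_steps(eval_ratio: float, max_steps: int) -> int:
    """Resolve a fractional eval_steps (0 < ratio < 1) into a concrete
    evaluation interval, as a fraction of the total training steps.
    (Hypothetical sketch; rounding mode is an assumption.)"""
    return max(1, math.ceil(eval_ratio * max_steps))

# With the original batch size of 1024 (657 total steps):
resolve_eval_steps(0.1, 657)    # → 66, matching the observed interval

# After auto_find_batch_size drops the batch size to 16 (83787 total
# steps), the interval should be re-resolved, not left at 66:
resolve_eval_steps(0.1, 83787)  # → 8379, i.e. roughly every ~8000 steps
```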