Description
transformers/src/transformers/training_args.py, line 1339 at 82eb67e:
dataloader_persistent_workers: bool = field(
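For context, this flag is forwarded to PyTorch's `DataLoader` as its `persistent_workers` argument, which keeps the worker processes alive across epochs instead of tearing them down and re-forking them at every epoch boundary. A minimal sketch of the underlying behavior (the toy dataset and worker count are illustrative, not from this report):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset, purely for illustration.
dataset = TensorDataset(torch.randn(1024, 16))

loader = DataLoader(
    dataset,
    batch_size=32,
    num_workers=4,            # persistent_workers requires num_workers > 0
    persistent_workers=True,  # keep workers alive between epochs
)

# Each full pass over `loader` is one epoch; with persistent_workers=True
# the same worker processes serve every pass.
for _ in range(2):
    for batch in loader:
        pass
```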
As described in the documentation, setting this option to True speeds up training at the cost of higher RAM usage; the default is currently False. I believe this configuration should default to True for two reasons:
1- Recreating the dataloader workers at every epoch bottlenecks the GPU and causes a significant slowdown, potentially doubling training time (benchmarked on a single A100 while fine-tuning whisper-large-v3).
2- Practitioners who want to mitigate this slowdown, which shows up as a square-wave pattern in GPU utilization, currently have to dig through a lot of configuration and run many hours of time-consuming tests; the opt-in they eventually land on is sketched below.
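For comparison, this is the explicit opt-in practitioners have to discover today (a minimal sketch; the output directory and worker count are placeholders, not from this report):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",                    # placeholder
    dataloader_num_workers=4,            # must be > 0 for persistence to apply
    dataloader_persistent_workers=True,  # the value this issue proposes as the default
)
```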
I request changing the default value of dataloader_persistent_workers to True.
If the maintainers of this repo are OK with this decision, I will submit the PR promptly.