
Training: Less intense GPU usage, longer run time in v0.3.18 versus v0.2.5 #255

@i7878

Description


Hello:

We observed, on identical inputs and parameters, that GPU usage during training is less intense and the run time in v0.3.18 is more than twice as long as in v0.2.5.

topaz train --train-images /path/to/image_list_train.txt --train-targets /path/to/topaz_particles_processed_train.txt \
    -s 0 -p 0 --test-images /path/to/image_list_test.txt --test-targets /path/to/topaz_particles_processed_test.txt \
    --num-particles 500 --learning-rate 0.0002 --minibatch-size 128 --num-epochs 10 --method GE-binomial \
    --slack -1.0 --autoencoder 0.0 --l2 0.0 --minibatch-balance 0.0625 --epoch-size 5000 --model resnet8 \
    --units 32 --dropout 0.0 --bn on --unit-scaling 2 --ngf 32 --num-workers 1 \
    --cross-validation-seed 1039026690 --radius 3 --device 0 --no-pretrained \
    --save-prefix=/path/to/models/model -o /path/to/train_test_curve.txt

With this command, v0.3 printed the notification:

When using GPU to load data, we only load in this process. Setting num_workers = 0.

(in case this is related.)
In the netdata trace of GPU load (image above), the narrower, higher plateau corresponds to a training run with v0.2.5. The subsequent wider, shallower plateau corresponds to the equivalent v0.3.18 run. Is there a combination of parameters that would allow us to reproduce the speed and approximate results of a v0.2.5 run in v0.3.18?
