Closed
Description
Hi, thanks for sharing your code. I tried it on my own dataset. Initially everything goes well, but after several epochs the training suddenly breaks down (accuracy becomes 1 and the loss becomes 0). I use TensorFlow 1.12.0 with CUDA 9.0 and cuDNN 7.1.4:
# conda list | grep tensorflow
tensorflow-estimator 1.13.0 py_0 anaconda
tensorflow-gpu 1.12.0 pypi_0 pypi
tensorflow-tensorboard 0.4.0 pypi_0 pypi
Have you run into this kind of problem? Another potential problem is that the training sometimes takes 4400 MB of GPU memory (as reported by nvidia-smi), but sometimes more than 7000 MB, even though I do not change the batch size or the network architecture. I am pretty confused by these problems. Could you give me some advice?
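To narrow down when the collapse described above happens, a check along the following lines might help. This is only a minimal sketch in plain Python, not part of the repository's code: the function name `find_collapse_step`, the `eps` threshold, and the sample values are all hypothetical, and it assumes the per-step loss and accuracy have already been collected into lists.

```python
import math

def find_collapse_step(losses, accuracies, eps=1e-12):
    """Return the index of the first step that looks degenerate, or None.

    A step is flagged when the loss is non-finite (NaN or inf), or when
    the loss is numerically zero while the accuracy has reached 1.0.
    """
    for step, (loss, acc) in enumerate(zip(losses, accuracies)):
        if not math.isfinite(loss):
            return step
        if abs(loss) <= eps and acc >= 1.0:
            return step
    return None

# Hypothetical per-step values for illustration, not taken from the actual run.
losses = [0.92, 0.61, 0.48, 0.0, 0.0]
accuracies = [0.55, 0.71, 0.80, 1.0, 1.0]
print(find_collapse_step(losses, accuracies))  # prints 3
```

Knowing the first bad step makes it possible to inspect the batch and the weights right before it, rather than only seeing the collapsed state at the end of the epoch.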