Description
Training crashes with the TypeError shown below while logging metrics, so my TensorBoard shows nothing.
My computer spec:
CPU: AMD R7-7700
RAM: 16GB X 2
GPU: RTX 4090 24GB
My system spec:
OS: Windows 11
IDE: VS Code
Python: 3.9.13
PyTorch: 1.13.1+cu117
In:
#run('p2ch11.prepcache.LunaPrepCacheApp') # Already ran successfully; shown only for context.
run('p2ch11.training.LunaTrainingApp', '--epochs=1') # This line fails with the error below.
Out:
2023-03-14 21:40:08,556 INFO pid:10608 nb:004:run Running: p2ch11.training.LunaTrainingApp(['--epochs=1', '--num-workers=8']).main()
2023-03-14 21:40:08,560 INFO pid:10608 p2ch11.training:079:initModel Using CUDA; 1 devices.
2023-03-14 21:40:08,563 INFO pid:10608 p2ch11.training:138:main Starting LunaTrainingApp, Namespace(num_workers=8, batch_size=1024, epochs=1, tb_prefix='p2ch11', comment='dwlpt')
2023-03-14 21:40:08,724 INFO pid:10608 p2ch11.dsets:182:init <p2ch11.dsets.LunaDataset object at 0x000001EC667766A0>: 495958 training samples
2023-03-14 21:40:08,744 INFO pid:10608 p2ch11.dsets:182:init <p2ch11.dsets.LunaDataset object at 0x000001EC7679AFA0>: 55107 validation samples
2023-03-14 21:40:08,745 INFO pid:10608 p2ch11.training:145:main Epoch 1 of 1, 485/54 batches of size 1024*1
2023-03-14 21:40:08,746 WARNING pid:10608 util.util:144:enumerateWithEstimate E1 Training ----/485, starting
2023-03-14 21:41:21,363 INFO pid:10608 util.util:161:enumerateWithEstimate E1 Training 64/485, done at 2023-03-14 21:47:11, 0:06:37
2023-03-14 21:44:03,276 INFO pid:10608 util.util:161:enumerateWithEstimate E1 Training 256/485, done at 2023-03-14 21:47:15, 0:06:41
2023-03-14 21:47:16,901 WARNING pid:10608 util.util:174:enumerateWithEstimate E1 Training ----/485, done at 2023-03-14 21:47:16
2023-03-14 21:47:18,692 INFO pid:10608 p2ch11.training:259:logMetrics E1 LunaTrainingApp
2023-03-14 21:47:18,698 INFO pid:10608 p2ch11.training:289:logMetrics E1 trn 0.0235 loss, 99.7% correct,
2023-03-14 21:47:18,698 INFO pid:10608 p2ch11.training:298:logMetrics E1 trn_neg 0.0041 loss, 100.0% correct (494577 of 494743)
2023-03-14 21:47:18,698 INFO pid:10608 p2ch11.training:309:logMetrics E1 trn_pos 7.9111 loss, 0.2% correct (2 of 1215)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[7], line 1
----> 1 run('p2ch11.training.LunaTrainingApp', '--epochs=1')
Cell In[2], line 7, in run(app, *argv)
4 log.info("Running: {}({!r}).main()".format(app, argv))
6 app_cls = importstr(*app.rsplit('.', 1))
----> 7 app_cls(argv).main()
9 log.info("Finished: {}.{!r}).main()".format(app, argv))
File c:\DeepLearning_F1388\F1388_Code\p2ch11\training.py:155, in LunaTrainingApp.main(self)
145 log.info("Epoch {} of {}, {}/{} batches of size {}*{}".format(
146 epoch_ndx,
147 self.cli_args.epochs,
(...)
151 (torch.cuda.device_count() if self.use_cuda else 1),
152 ))
154 trnMetrics_t = self.doTraining(epoch_ndx, train_dl)
--> 155 self.logMetrics(epoch_ndx, 'trn', trnMetrics_t)
157 valMetrics_t = self.doValidation(epoch_ndx, val_dl)
158 self.logMetrics(epoch_ndx, 'val', valMetrics_t)
File c:\DeepLearning_F1388\F1388_Code\p2ch11\training.py:339, in LunaTrainingApp.logMetrics(self, epoch_ndx, mode_str, metrics_t, classificationThreshold)
336 posHist_mask = posLabel_mask & (metrics_t[METRICS_PRED_NDX] < 0.99)
...
--> 386 cum_counts = np.cumsum(np.greater(counts, 0, dtype=np.int32))
387 start, end = np.searchsorted(cum_counts, [0, cum_counts[-1] - 1], side="right")
388 start = int(start)
TypeError: No loop matching the specified signature and casting was found for ufunc greater
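For reference, the failing expression from line 386 of the traceback can be isolated from the training script. This is a minimal sketch, assuming a small hypothetical `counts` array in place of the real histogram bin counts. As far as I can tell, the error depends on the installed NumPy version: comparison ufuncs such as `np.greater` produce boolean output, and newer releases appear to reject a request for `dtype=np.int32`. The compare-then-cast fallback is an assumption about a possible workaround, not a confirmed fix.

```python
import numpy as np

# Hypothetical stand-in for the histogram bin counts (not the real data).
counts = np.array([0, 3, 0, 5, 2, 0])

try:
    # Same expression as in the traceback: a comparison ufunc asked for int32 output.
    cum_counts = np.cumsum(np.greater(counts, 0, dtype=np.int32))
except TypeError as exc:
    print("np.greater with dtype=np.int32 failed:", exc)
    # Possible workaround (assumption): compare first, then cast the boolean result.
    cum_counts = np.cumsum(np.greater(counts, 0).astype(np.int32))

# Running count of non-empty bins, whichever branch ran.
print(cum_counts.tolist())
```

If the `except` branch runs on your machine, the installed NumPy does not accept the `dtype` argument here, which would match the error in this report.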