p2ch11/training.py causes a TypeError #107

@Va6lue

Due to this error, my TensorBoard shows nothing.

My computer spec:
CPU: AMD R7-7700
RAM: 16GB X 2
GPU: RTX 4090 24GB

My system spec:
OS: Windows 11
IDE: VS Code
Python: 3.9.13
PyTorch: 1.13.1+cu117

In:

#run('p2ch11.prepcache.LunaPrepCacheApp')  # Already ran successfully; commented out here for reference.
run('p2ch11.training.LunaTrainingApp', '--epochs=1')  # This call fails with the TypeError shown in the output.

Out:

2023-03-14 21:40:08,556 INFO pid:10608 nb:004:run Running: p2ch11.training.LunaTrainingApp(['--epochs=1', '--num-workers=8']).main()
2023-03-14 21:40:08,560 INFO pid:10608 p2ch11.training:079:initModel Using CUDA; 1 devices.
2023-03-14 21:40:08,563 INFO pid:10608 p2ch11.training:138:main Starting LunaTrainingApp, Namespace(num_workers=8, batch_size=1024, epochs=1, tb_prefix='p2ch11', comment='dwlpt')
2023-03-14 21:40:08,724 INFO pid:10608 p2ch11.dsets:182:init <p2ch11.dsets.LunaDataset object at 0x000001EC667766A0>: 495958 training samples
2023-03-14 21:40:08,744 INFO pid:10608 p2ch11.dsets:182:init <p2ch11.dsets.LunaDataset object at 0x000001EC7679AFA0>: 55107 validation samples
2023-03-14 21:40:08,745 INFO pid:10608 p2ch11.training:145:main Epoch 1 of 1, 485/54 batches of size 1024*1
2023-03-14 21:40:08,746 WARNING pid:10608 util.util:144:enumerateWithEstimate E1 Training ----/485, starting
2023-03-14 21:41:21,363 INFO pid:10608 util.util:161:enumerateWithEstimate E1 Training 64/485, done at 2023-03-14 21:47:11, 0:06:37
2023-03-14 21:44:03,276 INFO pid:10608 util.util:161:enumerateWithEstimate E1 Training 256/485, done at 2023-03-14 21:47:15, 0:06:41
2023-03-14 21:47:16,901 WARNING pid:10608 util.util:174:enumerateWithEstimate E1 Training ----/485, done at 2023-03-14 21:47:16
2023-03-14 21:47:18,692 INFO pid:10608 p2ch11.training:259:logMetrics E1 LunaTrainingApp
2023-03-14 21:47:18,698 INFO pid:10608 p2ch11.training:289:logMetrics E1 trn 0.0235 loss, 99.7% correct,
2023-03-14 21:47:18,698 INFO pid:10608 p2ch11.training:298:logMetrics E1 trn_neg 0.0041 loss, 100.0% correct (494577 of 494743)
2023-03-14 21:47:18,698 INFO pid:10608 p2ch11.training:309:logMetrics E1 trn_pos 7.9111 loss, 0.2% correct (2 of 1215)

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[7], line 1
----> 1 run('p2ch11.training.LunaTrainingApp', '--epochs=1')

Cell In[2], line 7, in run(app, *argv)
4 log.info("Running: {}({!r}).main()".format(app, argv))
6 app_cls = importstr(*app.rsplit('.', 1))
----> 7 app_cls(argv).main()
9 log.info("Finished: {}.{!r}).main()".format(app, argv))

File c:\DeepLearning_F1388\F1388_Code\p2ch11\training.py:155, in LunaTrainingApp.main(self)
145 log.info("Epoch {} of {}, {}/{} batches of size {}*{}".format(
146 epoch_ndx,
147 self.cli_args.epochs,
(...)
151 (torch.cuda.device_count() if self.use_cuda else 1),
152 ))
154 trnMetrics_t = self.doTraining(epoch_ndx, train_dl)
--> 155 self.logMetrics(epoch_ndx, 'trn', trnMetrics_t)
157 valMetrics_t = self.doValidation(epoch_ndx, val_dl)
158 self.logMetrics(epoch_ndx, 'val', valMetrics_t)

File c:\DeepLearning_F1388\F1388_Code\p2ch11\training.py:339, in LunaTrainingApp.logMetrics(self, epoch_ndx, mode_str, metrics_t, classificationThreshold)
336 posHist_mask = posLabel_mask & (metrics_t[METRICS_PRED_NDX] < 0.99)
...
--> 386 cum_counts = np.cumsum(np.greater(counts, 0, dtype=np.int32))
387 start, end = np.searchsorted(cum_counts, [0, cum_counts[-1] - 1], side="right")
388 start = int(start)

TypeError: No loop matching the specified signature and casting was found for ufunc greater
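
The failing expression at line 386 can be reproduced in isolation. The sketch below assumes the error comes from the installed NumPy release rejecting a non-boolean `dtype=` argument on the comparison ufunc `np.greater`, which NumPy 1.24 and later appear to do. The NumPy version is not listed in the specs above, so this is an assumption rather than a confirmed diagnosis. The `counts` array is made-up sample data and `cumulative_nonzero` is a hypothetical helper name, not something from the book's code.

```python
import numpy as np


def cumulative_nonzero(counts):
    """Running count of non-empty bins, without passing dtype= to np.greater."""
    # Compare first (this yields a bool array), then cast explicitly.
    return np.cumsum(np.greater(counts, 0).astype(np.int32))


counts = np.array([0, 3, 0, 5, 1])  # made-up histogram bin counts

try:
    # Same form as the failing line in the traceback.
    np.cumsum(np.greater(counts, 0, dtype=np.int32))
    original_call_failed = False
except TypeError as exc:
    original_call_failed = True
    print("original form raised:", exc)

cum_counts = cumulative_nonzero(counts)
print("numpy", np.__version__, "| original form failed:", original_call_failed)
print("workaround result:", cum_counts.tolist())
```

If `original_call_failed` prints `True` in the same environment, the NumPy version is the likely cause. The traceback elides which file contains line 386, so if that line belongs to an installed package rather than to `p2ch11`, editing it in place is fragile. Installing a NumPy release older than 1.24, or moving to a newer PyTorch, may be the simpler route, though neither option was verified here.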
