Since torch 0.4, the async argument is deprecated in CUDA transfer calls because async is a reserved word in Python >= 3.7; non_blocking should be used instead.
Please see this GitHub issue for more details.
If the next operation depends on your data, you won't notice any speed advantage. However, if an asynchronous data transfer is possible, you might hide the transfer time behind other computation.
Please see this PyTorch forum thread for more details.
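A minimal sketch of the idiomatic replacement (the tensor shape and variable names here are illustrative, and the CUDA path only runs if a GPU is available):

```python
import torch

x = torch.randn(64, 3, 224, 224)

if torch.cuda.is_available():
    # Pinned (page-locked) host memory is required for a truly
    # asynchronous host-to-device copy.
    x = x.pin_memory()
    # non_blocking=True replaces the deprecated async=True keyword.
    x = x.to("cuda", non_blocking=True)
else:
    # On a CPU-only machine the flag is accepted but has no effect.
    x = x.to("cpu", non_blocking=True)

print(x.shape)
```

Because the copy returns immediately, the transfer can overlap with subsequent host-side work such as loading the next batch.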
The .view() operation requires the tensor's elements to be laid out contiguously in memory, in the same order as the array. Otherwise, we need to call .contiguous() before .view().
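A minimal illustration: the transpose below produces a non-contiguous view, so .view() fails until .contiguous() copies the data into row-major order (variable names are illustrative):

```python
import torch

t = torch.arange(6).reshape(2, 3)
tt = t.t()  # transpose returns a view; memory order no longer matches its shape

print(tt.is_contiguous())  # False

try:
    tt.view(-1)  # fails: elements are not contiguous in memory
except RuntimeError as e:
    print("view failed:", e)

flat = tt.contiguous().view(-1)  # copy into contiguous memory first, then view
print(flat.tolist())  # [0, 3, 1, 4, 2, 5]
```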
"invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number"
torch 0.4.0
losses.update(loss.data[0], inputs.size(0))
top1.update(prec1[0], inputs.size(0))
top5.update(prec5[0], inputs.size(0))
torch 1.13.0
losses.update(loss.item(), inputs.size(0))
top1.update(prec1.item(), inputs.size(0))
top5.update(prec5.item(), inputs.size(0))
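The error and its fix can be reproduced on any scalar (0-dim) tensor; the loss value here is illustrative:

```python
import torch

loss = torch.tensor(1.5)  # 0-dim tensor, as modern loss functions return
print(loss.dim())  # 0

try:
    loss.data[0]  # the old torch 0.4-era indexing pattern now raises
except IndexError as e:
    print("indexing failed:", e)

# .item() converts a 0-dim tensor to a plain Python number
print(loss.item())  # 1.5
```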