Open
Description
I'm running into a strange error. After several thousand training iterations, my training script crashes with the following error:
libprotobuf FATAL google/protobuf/wire_format.cc:830] CHECK failed: (output->ByteCount()) == (expected_endpoint): : Protocol message serialized to a size different from what was originally expected. Perhaps it was modified by another thread during serialization?
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): CHECK failed: (output->ByteCount()) == (expected_endpoint): : Protocol message serialized to a size different from what was originally expected. Perhaps it was modified by another thread during serialization?
Command terminated by signal 6
It doesn't look like I'm running out of memory on my GPU. Could this be a result of writing too much information to TensorBoard?
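One rough way to test the "too much data" idea: as far as I understand, protobuf cannot serialize a single message larger than 2 GiB - 1 bytes, so a very large summary payload could be one trigger for this kind of failure. The sketch below only estimates raw tensor sizes from their shapes and compares them to that limit. The helper names (`estimated_bytes`, `check_summary_sizes`) and the 4-bytes-per-element default are my own assumptions for illustration, not part of TensorBoard or protobuf, and the estimate ignores protobuf's encoding overhead.

```python
import math

# Protobuf sizes are signed 32-bit, so one serialized message
# cannot exceed 2 GiB - 1 bytes.
PROTOBUF_MAX_BYTES = 2**31 - 1


def estimated_bytes(shape, bytes_per_element=4):
    """Rough raw size of a dense tensor with this shape (assumes float32 by default)."""
    return math.prod(shape) * bytes_per_element


def check_summary_sizes(tensors, limit=PROTOBUF_MAX_BYTES):
    """Hypothetical helper: `tensors` maps a summary name to its shape.

    Returns the total estimated bytes and the names whose estimate
    alone already exceeds `limit`.
    """
    sizes = {name: estimated_bytes(shape) for name, shape in tensors.items()}
    too_big = [name for name, size in sizes.items() if size > limit]
    return sum(sizes.values()), too_big


if __name__ == "__main__":
    # Example shapes only; substitute whatever is actually being logged.
    total, too_big = check_summary_sizes({
        "weights_histogram": (4096, 4096),
        "activations_dump": (1024, 1024, 1024),
    })
    print(f"estimated total: {total} bytes, over the limit: {too_big}")
```

If nothing comes close to the limit, size is probably not the cause, and the "modified by another thread during serialization" hint in the message would be the next thing to look at.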