Ignoring PCI device with non-16bit domain.
Pass --enable-32bits-pci-domain to configure to support such devices
(warning: it would break the library ABI, don't enable unless really needed).
/usr1/lm/model/transformer/cv/imagenet/compaire_speed_with_torch.py:218: UserWarning: You have chosen a specific GPU. This will completely disable data parallelism.
warnings.warn('You have chosen a specific GPU. This will completely '
Use GPU: 0 for training
=> creating model
=> Dummy data is used!
OneFlow model total training time: 477.7218196541071 s
terminate called after throwing an instance of 'oneflow::RuntimeException'
what(): Error: CUDA out of memory. Tried to allocate 32.0 MB
You can set ONEFLOW_DEBUG or ONEFLOW_PYTHON_STACK_GETTER to 1 to get the Python stack of the error.
Stack trace (most recent call last) in thread 1363223:
File "virtual_machine.cpp", line 0, in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void ()(vm::ThreadCtx, std::function<void (vm::ThreadCtx*)> const&), vm::ThreadCtx*, VirtualMachine::CreateThreadCtx(Symbol, StreamType, unsigned long)::{lambda(vm::ThreadCtx*)#5}> > >::_M_run()
File "virtual_machine.cpp", line 0, in (anonymous namespace)::WorkerLoop(vm::ThreadCtx*, std::function<void (vm::ThreadCtx*)> const&)
Object "/usr1/lm/model/oneflow-test/oneflow-1.0.0/build/liboneflow.so", at 0x7f682a064b07, in vm::ThreadCtx::TryReceiveAndRun()
Object "/usr1/lm/model/oneflow-test/oneflow-1.0.0/build/liboneflow.so", at 0x7f6829ffbf57, in vm::EpStreamPolicyBase::Run(vm::Instruction*) const
Object "/usr1/lm/model/oneflow-test/oneflow-1.0.0/build/liboneflow.so", at 0x7f682a000335, in vm::Instruction::Compute()
Object "/usr1/lm/model/oneflow-test/oneflow-1.0.0/build/liboneflow.so", at 0x7f682a095ae6, in vm::FuseInstructionPolicy::Compute(vm::Instruction*)
Object "/usr1/lm/model/oneflow-test/oneflow-1.0.0/build/liboneflow.so", at 0x7f682a000335, in vm::Instruction::Compute()
Object "/usr1/lm/model/oneflow-test/oneflow-1.0.0/build/liboneflow.so", at 0x7f682a006479, in vm::OpCallInstructionPolicy::Compute(vm::Instruction*)
File "op_call_instruction_policy.cpp", line 0, in vm::OpCallInstructionPolicy::Compute(vm::Instruction*)::{lambda(char const*)#1}::operator()(char const*) const [clone .constprop.0]
File "op_call_instruction_policy.cpp", line 0, in details::Throw::operator=(Error&&) [clone .constprop.0]
File "error.cpp", line 0, in ThrowError(std::shared_ptr const&) [clone .cold]
Aborted (Signal sent by tkill() 1362859 0)
Aborted (core dumped)
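The error message above suggests setting ONEFLOW_DEBUG or ONEFLOW_PYTHON_STACK_GETTER to 1 to recover the Python-side stack behind the OOM. A minimal sketch of a re-run with those flags (the script path is taken from the log above; whether both flags are needed together is an assumption):

```shell
# Export the debug flags named in the error message so OneFlow records
# the Python stack at the point of the CUDA OOM.
export ONEFLOW_DEBUG=1
export ONEFLOW_PYTHON_STACK_GETTER=1

# Re-run the failing benchmark script from the log (path as reported there):
# python /usr1/lm/model/transformer/cv/imagenet/compaire_speed_with_torch.py

# Confirm the flags are set in the environment before launching.
echo "ONEFLOW_DEBUG=$ONEFLOW_DEBUG ONEFLOW_PYTHON_STACK_GETTER=$ONEFLOW_PYTHON_STACK_GETTER"
```

With the stack getter enabled, the next crash should include the Python frames that issued the allocation, which makes it easier to tell whether the 32.0 MB request fails due to fragmentation or genuine exhaustion.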