Python can't set the device / phase for net initialization #1700

Closed
@shelhamer

Description

cuDNN handles and related resources are acquired at net initialization, when the cuDNN layers that require them are constructed. Because the Python interface only exposes set_device() as a method of Net rather than as a module-level function, the net already exists by the time the device can be set, which is too late to affect cuDNN computation. As a result, all cuDNN computation from Python runs on GPU 0, and attempts to set another device fail with

status == CUDNN_STATUS_SUCCESS (8 vs. 0)  CUDNN_STATUS_EXECUTION_FAILED

due to the disagreement between initialization and execution.
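
The ordering problem can be illustrated with a toy model. This is only a sketch: `set_device`, `ToyNet`, and the module-level variable below are hypothetical stand-ins, not the Caffe or cuDNN API. It assumes a resource that is bound to whichever device is current when the net is constructed.

```python
# Toy model only: these names are hypothetical, not the Caffe or cuDNN API.
_current_device = 0  # stands in for the process-wide current CUDA device


def set_device(device_id):
    """Change the device that later work will execute on."""
    global _current_device
    _current_device = device_id


class ToyNet:
    def __init__(self):
        # Like a cuDNN handle, this resource is tied to the device that
        # is current at construction time.
        self.handle_device = _current_device

    def forward(self):
        # Execution fails if the current device no longer matches the
        # device the handle was created on.
        if self.handle_device != _current_device:
            raise RuntimeError(
                "handle created on device %d, executing on device %d"
                % (self.handle_device, _current_device)
            )
        return "ok"


# Setting the device after construction reproduces the mismatch.
set_device(0)
late_net = ToyNet()
set_device(1)
try:
    late_net.forward()
    late_ok = True
except RuntimeError:
    late_ok = False

# Setting the device before construction keeps the two in agreement.
set_device(1)
early_net = ToyNet()
early_result = early_net.forward()
```

In this model the device has to be chosen before the net is constructed, and a setter that is only reachable through an existing net instance cannot do that.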

@longjon has a workaround for now: set the environment variable CUDA_VISIBLE_DEVICES instead of calling set_device().
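
A minimal sketch of this workaround from inside Python, assuming the variable is set before anything in the process initializes CUDA (in practice, before caffe is imported and the net is built). The helper name `restrict_visible_gpus` is made up for illustration.

```python
import os


def restrict_visible_gpus(device_ids):
    """Expose only the given physical GPU(s) to CUDA in this process.

    Must run before CUDA is initialized; the first listed GPU then
    appears to the process as device 0.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(device_ids)


# Make physical GPU 1 the only visible device.
restrict_visible_gpus(1)

# Import caffe and construct the net only after this point, for example:
# import caffe
```

The same effect can be had from the shell by setting CUDA_VISIBLE_DEVICES in the environment of the Python process before launching it.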

The simplest fix is to expose set_device() and set_phase() as functions of the caffe module itself. This is only a band-aid, and it changes the interface.

Of course, the Right Idea is to make Net responsible for device and phase, set them at initialization, and never switch (#1500), but that involves a few details.
