remove torch.cuda.is_available() check when compiling ops #3085
@jinzhen-lin, thanks for your contribution. Can you please provide some more details on the issue fixed by this PR? In my experience, the commented code works fine on machines without a GPU, including this CI. Thanks!
@tjruwase Exactly. I mean compiling CUDA ops on a machine without a GPU, and the CI doesn't build ops. In the mentioned issue, we encountered an error since the So we need those nvcc arguments: DeepSpeed/op_builder/builder.py Lines 687 to 695 in 258d283
But those arguments are ignored because we do the CUDA check here: DeepSpeed/op_builder/builder.py Lines 622 to 631 in 258d283
The CUDA check doesn't pass since we cannot get the true CUDA version with I think
@jinzhen-lin, thanks for your helpful explanation. It seems the problem is that we assume the build and target environments are the same. We recently started enabling DeepSpeed for CPU-only target environments, and we distinguish these from GPU target environments by testing for GPU availability using Please share your thoughts on this. Thanks! @jeffra, @mrwyattii FYI
@tjruwase Sorry for not checking CPU builds before opening the PR. I notice that CPU-only target environments were introduced recently (after v0.8.0) and that DeepSpeed is mainly for GPUs for now. So we should always assume the user wants a CUDA build, and we should do a CPU build when:
@microsoft-github-policy-service agree
@jinzhen-lin, thanks for updating the PR. This is an improvement but not quite cross-compilation. Nevertheless, this will suffice for now.
torch.cuda.is_available() is not necessary here, and it would cause #2858 when compiling deepspeed >= 0.8.1 on a machine without a GPU (e.g. during a Docker image build).
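
To illustrate the build-versus-target distinction discussed in this thread, here is a minimal sketch of deciding the build type from the presence of the CUDA toolkit rather than from a visible GPU. This is not DeepSpeed's actual code: `cuda_toolkit_present` and `select_build_target` are hypothetical names, and the `CUDA_HOME` lookup is an assumption about how the toolkit is located on the build machine.

```python
import os
import shutil


def cuda_toolkit_present(env=None, which=shutil.which):
    """Return True if an nvcc compiler can be found, whether or not a GPU exists.

    Hypothetical helper: `env` and `which` are injectable so the check can be
    exercised without a real CUDA installation.
    """
    env = os.environ if env is None else env
    cuda_home = env.get("CUDA_HOME")
    # Assumption: if CUDA_HOME is set, nvcc lives in its bin/ directory.
    if cuda_home and which("nvcc", path=os.path.join(cuda_home, "bin")):
        return True
    # Otherwise fall back to searching the regular PATH.
    return which("nvcc") is not None


def select_build_target(toolkit_present, force_cpu=False):
    """Pick "cuda" or "cpu" without ever asking whether a GPU is visible."""
    if force_cpu or not toolkit_present:
        return "cpu"
    return "cuda"


if __name__ == "__main__":
    print(select_build_target(cuda_toolkit_present()))
```

In a Docker image build, the toolkit is typically installed while no GPU is attached, so a check of this shape would still select a CUDA build where a `torch.cuda.is_available()` gate would not.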