Support XPU DDP training and autocast for LowBitMatmul #9167
Conversation
if isinstance(value, torch.Tensor):
    is_eligible = (
        value.is_floating_point()
        and value.is_cuda
Why `is_cuda`?
fixed
LGTM
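To illustrate the point raised above: `value.is_cuda` is always `False` for XPU tensors, so an eligibility check written that way would silently skip Intel GPU tensors. A minimal sketch of a device-agnostic version (the helper name `autocast_eligible` is hypothetical, not from this PR):

```python
import torch

def autocast_eligible(value, device_type: str = "xpu") -> bool:
    # Device-agnostic variant of the eligibility check discussed above.
    # `value.is_cuda` would always be False on XPU, so compare the
    # tensor's device type string instead.
    return (
        isinstance(value, torch.Tensor)
        and value.is_floating_point()
        and value.device.type == device_type
    )
```

The same check then works for `"cuda"`, `"xpu"`, or `"cpu"` by passing the relevant device type.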
@@ -378,8 +385,7 @@ def forward(self, x: torch.Tensor):
        result = result.view(new_shape)
        if self.bias is not None:
            result += self.bias
-
-        return result.to(x.dtype)
+        return result
This change causes the following issue: https://github.com/analytics-zoo/nano/issues/639
@yangw1234 Please take a look.
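For context on the dtype concern: a minimal, hypothetical module (not BigDL's actual LowBitMatmul) showing the behavior of the removed `return result.to(x.dtype)` line. Casting the result back to the input dtype guarantees callers see an output dtype matching what they passed in, even when the internal matmul runs in a different precision:

```python
import torch

class ToyLowBitLinear(torch.nn.Module):
    # Sketch only: the weight stands in for a dequantized low-bit weight,
    # so the matmul runs in the weight's dtype rather than the input's.
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(out_features, in_features))
        self.bias = torch.nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Compute in the weight's dtype, then restore the caller's dtype
        # so downstream ops see a consistent dtype. Dropping this final
        # cast (as in the diff above) can surface dtype mismatches.
        result = torch.nn.functional.linear(x.to(self.weight.dtype), self.weight)
        if self.bias is not None:
            result += self.bias
        return result.to(x.dtype)
```

With the cast in place, a `float64` input yields a `float64` output even though the matmul ran in `float32`.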
* Support autocast in low bit matmul
* Support XPU DDP training
* Fix AMP
Description
1. Why the change?
Faster end-to-end training time by using multiple Intel GPUs.
2. User API changes
None.
3. Summary of the change
4. How to test?
Manually tested with 4 PVC 1100 GPUs on the Alpaca dataset.
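For readers unfamiliar with DDP on Intel GPUs, a hedged sketch of the usual setup. The names assume Intel Extension for PyTorch and the oneCCL bindings are installed (they provide the `"ccl"` backend); `xpu_device_for_rank` and `run_xpu_ddp` are illustrative helpers, not this PR's API:

```python
import torch

def xpu_device_for_rank(local_rank: int) -> str:
    # Hypothetical helper: map a DDP local rank to an XPU device string,
    # analogous to "cuda:{rank}" in the CUDA setup.
    return f"xpu:{local_rank}"

def run_xpu_ddp(model: torch.nn.Module, local_rank: int) -> torch.nn.Module:
    # Sketch of XPU DDP initialization; requires XPU hardware and the
    # packages below, so it is not runnable in a plain CPU environment.
    import intel_extension_for_pytorch  # noqa: F401  (assumed installed)
    import oneccl_bindings_for_pytorch  # noqa: F401  (registers the "ccl" backend)

    torch.distributed.init_process_group(backend="ccl")
    device = torch.device(xpu_device_for_rank(local_rank))
    model = model.to(device)
    # Wrap for gradient synchronization across the XPU ranks.
    return torch.nn.parallel.DistributedDataParallel(model)
```

Each rank would be launched as a separate process (e.g. via `mpirun` or a torch distributed launcher) with `local_rank` selecting its device.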