-
Notifications
You must be signed in to change notification settings - Fork 5.9k
[oneDNN] Further ops refactoring of oneDNN cache access #33515
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- compilation fix
Thanks for your contribution! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
||
bool isBwdCached() { | ||
const std::string key_pd = key_ + "@bwd_pd"; | ||
bwd_pd_ = std::static_pointer_cast<typename TBackward::primitive_desc>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi I saw you name key_fpd, Maube you could name here backward pd key as key_bpd ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Normally key_pd is used everywhere it just at that particular moment I needed two keys so I added key_fpd.
const std::string key_fpd = key_ + "@fwd_pd"; | ||
fwd_pd_ = std::static_pointer_cast<typename TForward::primitive_desc>( | ||
dev_ctx_.GetBlob(key_fpd)); | ||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fwd_pd_ get value by dev_ctx_.GetBlob, then could it be null_ptr? If it is null will it still return true or something like "return bwd_pd_ == nullptr ? false : true" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if bwd_pd_ is valid so then should be fwd_pd_ . But it is a valid point to add some PADDLE_ENFORCE making sure that fwd_pd_ is also valid .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lidanqing-intel I added PADDLE_ENFORCE to make sure fwd_pd_ is set when bwd_pd_ is cached. Please sontinue your review
template <typename Arg, typename... Args> | ||
void AcquireForwardPrimitiveDescriptorNonBlocking(Arg&& first_arg, | ||
Args&&... args) { | ||
// This is used when we can recreate FWD PD in BWD so |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two lines comments are describing void AcquireForwardPrimitiveDescriptor(Arg&& first_arg, Args&&... args)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is still a relevant comment to this function
void AcquireBackwardPrimitiveDescriptorNonBlocking(Args&&... args) { | ||
// fwd_pd_ is set during grad by calling | ||
// AcquireForwardPrimitiveDescriptorNonBlocking | ||
// AcquireForwardPrimitiveDescriptor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does these comments "// fwd_pd_ is set during grad by calling // AcquireForwardPrimitiveDescriptor" still suit here? Here it looks only bwd_pd_ is created and cached.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AcquireBackwardPrimitiveDescriptor is creating BWD_PD only that is correct , but when implementing Handler for grad(backward) you also call AcquireForwardPrimitiveDescriptor which will create FWD_PD. So in grad handler we call AcquireForwardPrimitiveDescriptor and AcquireBackwardPrimitiveDescriptor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks a lot for explanation !
@juncaipeng Hi, There is a problem PR-CI-Windows build. I had it rerunning three times and still it failes due to not related to PR changes errors: 2021-06-12 22:50:51 failed with: Please help |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@jczaja Wangzhen said it was Windows machine issue last week and now it's recovered. But me and wanghzen both can not rerun it because CI says: "This commit exceeds rerun limitation." Please change this PR a bit and submit a new commit so CI could restart. |
891032f
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@juncaipeng This CheckPRTemplate is being running for many hours . Could you please advice how to make it finished? |
In this case, it is necessary to modify the describe of the PR, so the CheckPRTemplate will rerun. |
PR types
Performance optimization
PR changes
OPs
Describe