-
Notifications
You must be signed in to change notification settings - Fork 149
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[#1355] fix(client): Netty client will leak when decoding responses #1455
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #1455 +/- ##
============================================
+ Coverage 54.19% 55.43% +1.23%
- Complexity 2775 2800 +25
============================================
Files 423 403 -20
Lines 24185 21849 -2336
Branches 2058 2062 +4
============================================
- Hits 13108 12112 -996
+ Misses 10266 8985 -1281
+ Partials 811 752 -59 ☔ View full report in Codecov by Sentry. |
@@ -293,6 +293,9 @@ public void checkProcessedBlockIds() { | |||
|
|||
@Override | |||
public void close() { | |||
if (sdr != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this modification? We have released readBuffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will not cause any side effects, and it will be more readable? I can revert this if you want.
After my stress test, I found that we need this to avoid potential memory leaks.
try { | ||
msg = Message.decode(curType, frame); | ||
} finally { | ||
boolean shouldRelease = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we release the frame if the msg is null
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we should. Although this can barely happen, we still need to fix this. Done.
msg = Message.decode(curType, frame); | ||
} finally { | ||
boolean shouldRelease = | ||
msg != null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you extract a method for this logic? It's easy to test, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean like this?
private static boolean shouldRelease(Message msg) {
boolean shouldRelease =
msg == null
|| (msg != null
&& (msg.body() == null
|| (msg.body().byteBuf() != null
&& msg.body().byteBuf().readableBytes() == 0)));
return shouldRelease;
}
For testing purposes, maybe private -> public?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For testing purposes, package-available is enough. We don't need public.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean like this?
private static boolean shouldRelease(Message msg) { boolean shouldRelease = msg == null || (msg != null && (msg.body() == null || (msg.body().byteBuf() != null && msg.body().byteBuf().readableBytes() == 0))); return shouldRelease; }
For testing purposes, maybe private -> public?
You can simplify the code like
private static boolean shouldRelease(Message msg) {
if (msg == null) {
return true;
}
if (mgs.body() == null) {
return true;
}
boolean shouldRelease = (msg.body().byteBuf() != null
&& msg.body().byteBuf().readableBytes() == 0)));
return shouldRelease;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Also, I remove the private
keyword to make it package available.
@jerqi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a UT for this pr?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UTs added. @jerqi
Also, I have conveniently fixed an issue where a method parameter was missing. The error
parameter was forgotten to be passed to the UnsupportedOperationException
.
https://github.com/apache/incubator-uniffle/pull/1455/files#diff-312deeac76db5d5c48ab0caf09429615f3719eadc5ef7fbd8936d31b0d8049d3R32
static boolean shouldRelease(Message msg) { | ||
if (msg == null || msg.body() == null) { | ||
return true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (msg.body().byteBuf() == null) {
return true;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. PTAL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @rickyma
What changes were proposed in this pull request?
The current code logic is that the
ByteBuf
is only released whenmsg.body() == null
. However, whenmsg.body != null
,msg.body().byteBuf()
returns aNettyManagedBuffer.EMPTY_BUFFER
, and theByteBuf
is not released in this case, resulting in a memory leak issue every time decoding an RPC response from ShuffleServer.Over time, if the Spark Job runs long enough and there are enough requests, it will eventually cause a significant memory leak on the client side (Spark Executor).
The modifications to the other code are mainly for readability and enhanced protection, and will not cause any side effects.
Why are the changes needed?
To fix the memory leak issue in the Netty client when decoding RPC responses.
For #1359
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Rerun successfully and no more Netty leak logs.