-
Notifications
You must be signed in to change notification settings - Fork 9.1k
HDFS-15202 Boost short circuit cache #1884
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
...oject/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java
Show resolved
Hide resolved
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patch. Quickly went through the patch and left a few non-critical comments.
@@ -141,6 +141,8 @@ | |||
"dfs.short.circuit.shared.memory.watcher.interrupt.check.ms"; | |||
int DFS_SHORT_CIRCUIT_SHARED_MEMORY_WATCHER_INTERRUPT_CHECK_MS_DEFAULT = | |||
60000; | |||
String DFS_SHORT_CIRCUIT_NUM = "dfs.short.circuit.num"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can i ask you to change this to dfs.client.short.circuit.num? all client configs starts with dfs.client
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you change, please also update hdfs-default.xml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jojochuang thank you for interesting! Done)
@@ -227,8 +237,8 @@ public String getConfString() { | |||
return confString; | |||
} | |||
|
|||
public ShortCircuitCache getShortCircuitCache() { | |||
return shortCircuitCache; | |||
public ShortCircuitCache getShortCircuitCache(long idx) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine. But what I realized is Hadoop applications don't really respect the annotation @InterfaceAudience.Private. You really want to gracefully deprecate a public method to avoid runtime issues. That is, keep the original getShortCircuitCache() method and let it call getShortCircuitCache(0);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is, keep the original getShortCircuitCache() method and let it call
Sure, done
Take a look please at this UT - is it ok? For example if clientShortCircuitNum is 3, when a lot of blockids of SSR are ***1, ***4, ***7, this situation will fall into a ShortCircuitCache. |
@pustota2009 |
rename dfs.short.circuit.num to dfs.client.short.circuit.num
return back getShortCircuitCache()
I understand what do you mean, thank you for explanation. We talked about this and agreed that we can go step by step. Could you merge this PR and I promise to do some research about other strategy like crc32 later? |
💔 -1 overall
This message was automatically generated. |
added comment
💔 -1 overall
This message was automatically generated. |
refact
💔 -1 overall
This message was automatically generated. |
@pustota2009 can you rebase the patch? |
@jojochuang of course, just created new PR |
actually, you just need to rebase, resolve the conflcits and push the new commits to your branch. No need to start a new PR. But let's move the discussion there. |
Improve HDFS-client massive reading performance.
The idea: create few instances ShortCircuit caches instead of one.
It helps avoid locks and lets CPU do job.