-
Notifications
You must be signed in to change notification settings - Fork 9.1k
HDFS-17042 Add rpcCallSuccesses and OverallRpcProcessingTime to RpcMetrics for Namenode #5730
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hi @goiri, Please kindly review this PR as well if you have bandwidth. thanks, |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Regarding the two checkstyle VisibilityModifier warnings, we don't make these Metrics private. Leave them as they are. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Hi @goiri, I don't expect to make further changes at this point. It would be great appreciated if you could review this PR. thanks, |
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
Outdated
Show resolved
Hide resolved
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
Outdated
Show resolved
Hide resolved
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
Show resolved
Hide resolved
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
Outdated
Show resolved
Hide resolved
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/test/MetricsAsserts.java
Show resolved
Hide resolved
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
Show resolved
Hide resolved
🎊 +1 overall
This message was automatically generated. |
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/metrics/RpcMetrics.java
Outdated
Show resolved
Hide resolved
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
Hi @goiri, The build infra seems to have some problem. logfile links for this build seem to point to empty pages. the compile logfile for the second to last build has a cmake program error.
I just removed an extra space since the last clean build. it shouldn't cause these errors. |
Still not sold on the readability aspect but I cannot think of a cleaner solution so let's go with this. |
🎊 +1 overall
This message was automatically generated. |
Thanks for reviewing and committing this PR! Appreciate your help! |
…trics for Namenode (apache#5730)
…trics for Namenode (apache#268) * HDFS-17042 Add rpcCallSuccesses and OverallRpcProcessingTime to RpcMetrics for Namenode (apache#5730) (apache#5804) --------- Co-authored-by: Xing Lin <linxingnku@gmail.com>
Description of PR
Add two new types of metrics to the existing NN RpcMetrics/RpcDetailedMetrics. These two metrics can then be used as part of SLA/SLO for the HDFS service.
RpcCallSuccesses: it measures the number of RPC requests where they are successfully processed by a NN (e.g., with a response with an RpcStatus RpcStatusProto.SUCCESS). Then, together with RpcQueueNumOps (which refers the total number of RPC requests), we can derive the RpcErrorRate for our NN, as (RpcQueueNumOps - RpcCallSuccesses) / RpcQueueNumOps.
OverallRpcProcessingTime for each RPC method: this metric measures the overall RPC processing time for each RPC method at the NN. It covers the time from when a request arrives at the NN to when a response is sent back. We are already emitting processingTime for each RPC method today in RpcDetailedMetrics. We want to extend it to emit overallRpcProcessingTime for each RPC method, which includes enqueueTime, queueTime, processingTime, responseTime, and handlerTime.
How was this patch tested?