-
Notifications
You must be signed in to change notification settings - Fork 9.1k
HDFS-16266. Add remote port information to HDFS audit log #3538
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
Outdated
Show resolved
Hide resolved
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Hi @tomscut Is it necessary to add port to audit log. I am afraid that this will have a big influence. Because many users analyze audit log as table and this maybe change format. |
Thanks @ferhui for your review. I also have similar concerns, so I published a discussion to HADOOP community, hoping to let the members of the community know about this matter and then give some suggestions. However, Adding port is to modify the internal content of the IP field, which has little impact on the overall layout. In our cluster, we parse the audit log through Vector and send the data to Kafka, which is unaffected. |
This comment has been minimized.
This comment has been minimized.
Hi @jojochuang @tasanuma @goiri @Hexiaoqiao, could you please also take a look at this? Do you think it is reasonable to add remote port into ip field like |
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
Outdated
Show resolved
Hide resolved
+1 to make it optional. |
Thanks @akira Ajisaka ***@***.***> for your comments and
suggestion. I will update it ASAP.
Akira Ajisaka ***@***.***> 于2021年10月15日周五 下午3:27写道:
… And we can make it optional as @iwasakims <https://github.com/iwasakims>
suggested. Thank you very much.
+1 to make it optional.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#3538 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ANEUPM7J6PDPOGD5UJXX74LUG7JW3ANCNFSM5FWQZO3A>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
|
It's good that port is optional. |
Thanks @ferhui for your comment. I made it configurable and committed just now. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This failed unit test is unrelated to the change. |
Thanks for updating the PR, @tomscut.
|
Thanks @tasanuma and your colleagues for your good advice. And sorry for the late reply.
|
I haven't gone through the entire discussion/code. Just that whether we should modify the existing field or add a new one. Technically both are correct and I don't see any serious issue with either(not thinking too deep). But I feel for the parsers to adapt, if there was a new field, it might be little bit more easy, Rather than trying to figure out whether the existing field has a port or not. Just my thoughts, I am Ok with whichever way most people tend to agree. |
The API is declared Public, Evolving. If it stays in Hadoop 3.4.0 I am fine with it. We used to have an audit logger (Cloudera Navigator) that extends the AuditLogger interface. But we've moved away from that. Performance: CallerContext: |
Thanks @ayushtkn for your comments and suggestions. |
Thanks @jojochuang for your careful consideration and advice. I think it's a very good idea to add remote port to the CallerContext, these will not affect the compatibility @tasanuma mentioned. After the user enable the CallerContext, we add clientPort to the CallerContext, similar to how the Router sets clientIp to the CallerContext. |
Previously, I didn't think it would be too much of an impact to append a port to the IP field if the feature was made configurable, but users might need to change the resolution rules. Considering compatibility, @tasanuma suggests adding fields, and @jojochuang suggests putting port in the CallerContext. If we put the port in the CallerContext, it will not affect field resolution. The content in the callerContext is also dynamic, which is more flexible. Thank you all for your advice and help. I will update the PR later. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Hi @tasanuma @jojochuang @ayushtkn @ferhui , could you please take a look at this PR again. Thank you very much. The code was submitted with multiple versions, resulting in some invalid changes (removing some white space). |
...s-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
Outdated
Show resolved
Hide resolved
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm +1
Thanks @jojochuang for your comment. |
Thanks for updating it, @tomscut.
In this case, |
How about using the word of "port" instead of "clientPort" here, and adding "clientPort" as the actual client server port for RBF in another JIRA? |
Thank you @tasanuma very much for your test. Because the router acts as a client of the Namenode, in this case the port of router is indeed regarded as a |
Thank you @tasanuma for your advice. Based on your suggestion, I have an idea like this: We leave the ClientPort field for now.
In both cases, only one |
@tomscut Thanks for your thoughts. That makes sense to me. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Hi @aajisaka , could you please review this again. Thanks a lot. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. Thank you @tomscut, @jojochuang, and @tasanuma for updating the patch and the discussion.
In this case, clientIp:1.1.1.1 is the IP of the client server, but clientPort:33070 is the port of the DFS Router (2.2.2.2), not the one of the client server. It would be confusing for the users.
I suppose the name clientIp
should be more specific, such as routerClientIp
. And then we can add the router client port as routerClientPort
.
Thanks @aajisaka for your review and comment. I will make some changes in the other JIRA if necessary, because this might involve the callContext from Router. What do you think of this? Thank you. |
Agreed. Let's discuss in a separate JIRA. |
Thanks @aajisaka for your quick reply. |
Hi @tasanuma @jojochuang @aajisaka , could you please help merge this PR. And I will open a new JIRA based on it. Thanks a lot. |
There are three +1s. So I'd like to merge it. |
Thanks for your contribution, @tomscut. Thanks all for your discussion and reviews. |
Thanks @tasanuma for the merge. Thanks all for your reviews and suggestions. |
Reviewed-by: Akira Ajisaka <aajisaka@apache.org> Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org> Signed-off-by: Takanobu Asanuma <tasanuma@apache.org> Cherry-picked from 359b03c by Owen O'Malley
Reviewed-by: Akira Ajisaka <aajisaka@apache.org> Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org> Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
Reviewed-by: Akira Ajisaka <aajisaka@apache.org> Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org> Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
Reviewed-by: Akira Ajisaka <aajisaka@apache.org> Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org> Signed-off-by: Takanobu Asanuma <tasanuma@apache.org> Cherry-picked from 359b03c by Owen O'Malley
JIRA: HDFS-16266
In our production environment, we occasionally encounter a problem where a user submits an abnormal computation task, causing a sudden flood of requests, which causes the queueTime and processingTime of the Namenode to rise very high, causing a large backlog of tasks.
We usually locate and kill specific Spark, Flink, or MapReduce tasks based on metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but there is no port information, so it is difficult to locate specific processes sometimes. Therefore, I propose that we add the port information to the audit log, so that we can easily track the upstream process.
Currently, some projects contain port information in audit logs, such as Hbase and Alluxio. I think it is also necessary to add port information for HDFS audit logs.
Before:

After:
