HADOOP-18631 (ADDENDUM) Use LogCapturer to match audit log pattern and remove hdfs async audit log configs #5451
Conversation
💔 -1 overall
This message was automatically generated.
For that, we don't need to delete the file as post-test cleanup, since other parallel tests might also be writing audit logs to the same location at the same time.
Today they might not collide, but tomorrow, with new tests or code changes, they might.
Have a different, unique location per test.
This is an option, and I thought of moving the entire test to the hdfs-client module. Do you think that would be good? Moving the test to a module other than hadoop-hdfs is the only way we can cleanly use a different file location for audit logs. Does the client module look good to you? Or maybe I can just move those 3 tests out of TestFsck into a new test, say TestFsckAuditLogs, in the hadoop-hdfs-client module?
Basically, the clean way is to have a different log4j properties file for this test; hence the need to move the test to a different module.
No, I don't think we can move the test; we have to find some other way out.
Btw, no other test is actually deleting the file; this was the only test doing so in the previous PR. Not deleting the file will not cause issues for other tests either, because they will read from the same file as this test.
The path was unique earlier
That's because the previous version was using dynamically added appenders and disregarding the actual appenders set in the log4j properties; this was a hack and it would not work with log4j2, hence we changed it in the previous PR. Now the tests share the same file location for the audit log, because it's the log4j properties that all tests share. We could make a new custom log4j properties file and reload it dynamically; that's an option, but again it would be another hack, because in log4j2, while this is allowed, the APIs are not public and can break compatibility.
Here is the most important distinction between the tests before and after PR #5418:
Before the patch: the tests were deciding which log file to use and also disabling the async appender as soon as the expected logs were in; hence, the tests could match the expected regex. This was a hack, because overriding log4j properties is not compatible across log4j versions.
After the patch: the tests share the same log file for the audit log and no longer apply the hack of changing the log file dynamically and stopping appenders when needed. Hence, the tests are now made to search for the regex only for "specific file operations", because the audit log will contain operations from other tests as well. We no longer have any hacks here that log4j2 does not support (or recommend doing).
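The "specific file operations" matching can be sketched with a plain Java regex. This is only an illustration, not the actual TestFsck code: the entry layout below merely approximates the HDFS audit log format, and `AuditLogGrep` is a made-up name.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class AuditLogGrep {

    // Match only entries for the given command and exact src path. The trailing
    // \s guard prevents /fsck/target from also matching /fsck/target2.
    public static Pattern patternFor(String cmd, String src) {
        return Pattern.compile(".*cmd=" + Pattern.quote(cmd)
                + "\\s+src=" + Pattern.quote(src) + "\\s.*");
    }

    // Filter a shared audit log down to the entries one test cares about.
    public static List<String> matching(List<String> auditLines, String cmd, String src) {
        Pattern p = patternFor(cmd, src);
        return auditLines.stream()
                .filter(l -> p.matcher(l).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // A shared log: entries from "other tests" interleaved with ours.
        List<String> shared = List.of(
            "allowed=true ugi=alice ip=/127.0.0.1 cmd=open src=/other/file dst=null perm=null",
            "allowed=true ugi=bob ip=/127.0.0.1 cmd=getfileinfo src=/fsck/target dst=null perm=null",
            "allowed=true ugi=alice ip=/127.0.0.1 cmd=delete src=/other/file dst=null perm=null");
        // Only the entry for our own operation survives.
        System.out.println(matching(shared, "getfileinfo", "/fsck/target").size()); // prints 1
    }
}
```

Matching per-operation like this is what lets the tests tolerate other tests appending to the same audit log concurrently.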
The options we have now:
I am against this. The first one is being too confident that tests won't collide; today we know they don't, but if tomorrow they suddenly do, other folks who don't know about it will struggle to figure it out. We don't want to leave the code in a worse state than we have today. For the second one, I have already given the reason above.
If they are already doing that, then we have a solution ready; let's do it, or pull that into a util and use it everywhere. I don't have hard feelings about the other options (as of now), but this one sounds best to me.
It's just another hack, that's why I was a bit reluctant to use it; also, log4j2 has no public API for it, so this can break anytime :(
Sorry Ayush, this would not work, as it would screw up the state of MiniDFSCluster and the Namenode audit logger for other tests too.
Please ignore this comment. We have better solution (the next comment section).
Ayush, we have a better way. We can avoid accessing the file directly and use the log capture utility instead. I have tested the change.
TestAuditLogs is also doing this file logic from the last PR; we should change it to use the same approach there as well.
Rest, in general, things look good to me now. To me the present stuff looks ok though...
Yeah, that's fine too. Thank you @ayushtkn for the review. And sure, let's wait for @jojochuang too.
Ok, I think I got misled then. Do you mean to say we are reading via different mechanisms but from the same place? And one more thing to try: if I delete the file while using the appender, will the test using the appender fail?
Basically they will read from their own WriterAppender so there is no sync issue while reading.
Multiple tests can read logs for the given single logger instance (slf4j/log4j).
So, let's say tomorrow we write a new test that also needs to read namenode audit logs: the test should just create a new LogCapturer object and extract logs from it, that's it. It doesn't interfere with any other tests if run simultaneously. TestFsck has its own writer appender.
TestAuditLogs, on the other hand, reads directly from the file used by the log4j properties as the primary RFA appender, which I think is good, so that at least we have one test that validates output from the primary appender directly and verifies the regex.
Now even if TestAuditLogs deletes the file such that every log produced by TestFsck is gone, it's still not a concern for TestFsck, because TestFsck has its own WriterAppender instance as a secondary appender, and when reading logs it now uses this secondary appender and no longer relies on the primary appender (the file). FWIW, I believe LogCapturer is a really nice utility.
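A rough analogue of what this per-test capture buys us, sketched with java.util.logging so it stays self-contained. Hadoop's real GenericTestUtils.LogCapturer attaches a WriterAppender to the shared slf4j/log4j logger instead; the `Capturer` class below is made up for illustration only.

```java
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class CaptureSketch {

    /** Per-test in-memory capture: each "test" attaches its own handler to the
     *  shared logger and reads back only what it captured, never a shared file. */
    public static class Capturer extends Handler {
        private final StringBuilder out = new StringBuilder();
        private final Logger logger;

        public Capturer(Logger logger) {
            this.logger = logger;
            logger.addHandler(this);           // start capturing
        }
        @Override public void publish(LogRecord r) { out.append(r.getMessage()).append('\n'); }
        @Override public void flush() { }
        @Override public void close() { logger.removeHandler(this); }  // stop capturing
        public String getOutput() { return out.toString(); }
    }

    public static void main(String[] args) {
        Logger audit = Logger.getLogger("audit");  // the one shared audit logger
        audit.setUseParentHandlers(false);         // keep the console quiet

        Capturer testA = new Capturer(audit);      // one test's own "appender"
        audit.info("cmd=getfileinfo src=/fsck/target");
        testA.close();

        Capturer testB = new Capturer(audit);      // a later test's own "appender"
        audit.info("cmd=open src=/other/file");
        testB.close();

        // Each capturer holds only the entries logged while it was attached.
        System.out.println(testA.getOutput().contains("/fsck/target"));  // prints true
        System.out.println(testB.getOutput().contains("/fsck/target"));  // prints false
    }
}
```

Because each capture object is private to its test, deleting or rotating the primary log file cannot disturb another test's assertions; the regex checks then run against `getOutput()` rather than against a shared file.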
🎊 +1 overall
This message was automatically generated.
Ok, in general this looks ok to me from a test-correctness standpoint, if you have verified that running in parallel doesn't affect each other.
I am not chasing this log4j upgrade thing, so I will leave it to @jojochuang to confirm everything is cool here. If I don't hear from anyone chasing this upgrade, I will revert the original PR by early next week, so that the original reviewers can come back; that is the only way I can get some attention from the original reviewers...
@ayushtkn I am requesting you to please not revert the original PR. This PR anyway restricts the log output to each individual test by using their own writer appenders.
If you run the tests in parallel, you will easily be able to verify that logs produced by one test are not written to another test's output, as both have their own appenders responsible only for their individual log outputs. Hence, no two tests using the LogCapturer utility will collide, i.e. they will never use each other's log output.
LGTM.
Still, I will wait one week before I merge this, to give enough time to others.
Thank you @ayushtkn!
@virajjasani I was going to commit this, considering @jojochuang isn't interested in it, but the original now looks incompatible to me: it took away some abilities which we have today, and the Deprecated marking is also misunderstood. AFAIK, deprecated means "we aren't going to maintain this and it won't work in the future", rather than "it is not working, use something else".
Ayush, I have added a temporary appender (which will later be removed when we finally move to log4j2) that provides us the same abilities. I posted on the parent Jira HADOOP-16206 as well:
"Incompatible change as three hdfs-site configs are no longer in use; they are to be replaced with log4j properties."
I have added release notes for HADOOP-18631 to explain the same. Also, both the parent Jira (HADOOP-16206) and the current Jira (HADOOP-18631) are marked Incompatible change. I marked the current one, but the parent Jira was marked Incompatible long ago. This is the log added with the current Jira:
I take my vote back.
Ok, this is incompatible; you must have enough agreement for this change, so I am not getting into that...
It is incompatible, ok, but the configs "deprecated"? They aren't deprecated; they are removed outright.
+1, merging this one. In any case, it is unfortunately impossible to upgrade to log4j2 without any friction. We do what we can at code review, but ultimately it's up to the downstream application developers to adopt and change. Most likely we'll find issues during integration tests, but I don't see any other way.
Still ok, I don't want to get into this activity; I will trust you on this. But fundamentally, why are you marking the config deprecated? Just remove it. It isn't working, so why do you want to keep it marked as deprecated? You aren't removing it in a "future" release, you "removed" it.
Hope they do that...
Ok sir, updated the PR. Let me update the title to reflect this and also update the release notes accordingly.
Wei-Chiu, go ahead committing; no objections from my side. The test gets fixed; I am happy with just that.
@ayushtkn @jojochuang does the current state of the PR look good?
Thanks Ayush, I'll merge it.
…nd remove hdfs async audit log configs (apache#5451)
We don't need to delete the audit log file as such. Even with other tests writing log entries, TestFsck should be able to find the specific pattern of audit log entries. Hence, we don't need to delete the file as post-test cleanup, even though other parallel tests might also be writing audit logs to the same location at the same time.
Remove configs: