HADOOP-18395. Performance improvement in hadoop-common Text#find #4714
🎊 +1 overall
This message was automatically generated.
ZanderXu
left a comment
@huxinqiu Thanks for your report. LGTM.
Can you add a unit test with a very long string and a timeout to check the performance of this method?
🎊 +1 overall
This message was automatically generated.
💔 -1 overall
This message was automatically generated.
🎊 +1 overall
This message was automatically generated.
We're closing this stale PR because it has been open for 100 days with no activity. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
Description of PR
JIRA: HADOOP-18395
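The current implementation resets src and tgt to the mark and continues searching when src is exhausted while tgt still has bytes remaining, which is probably unnecessary: once src runs out before tgt is fully matched, every later start position leaves even fewer bytes in src, so no match is possible.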
In some cases, this commit can reduce the complexity from O(n²) to O(n), which can significantly improve performance, as in the following example.
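The PR's own example is not reproduced on this page. As a stand-in, here is a minimal, hypothetical sketch of the idea. It is not the actual `org.apache.hadoop.io.Text` code: the class name `FindSketch`, the `earlyExit` flag, and the `comparisons` counter are assumptions made purely for illustration. It mimics a mark/reset search over `ByteBuffer`s and either keeps scanning after src is exhausted (the behaviour described above) or returns immediately.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical stand-alone sketch; names and structure are illustrative only.
class FindSketch {
    // Byte comparisons performed by the most recent find() call.
    static long comparisons;

    // Returns the index of the first occurrence of target in source at or
    // after start, or -1. Simplification: an empty target returns -1.
    static int find(byte[] source, byte[] target, int start, boolean earlyExit) {
        comparisons = 0;
        if (target.length == 0 || start < 0 || start > source.length) {
            return -1;
        }
        ByteBuffer src = ByteBuffer.wrap(source);
        ByteBuffer tgt = ByteBuffer.wrap(target);
        byte first = tgt.get();
        src.position(start);
        while (src.hasRemaining()) {
            comparisons++;
            if (first == src.get()) {      // first byte matches
                src.mark();                // remember where to resume scanning
                tgt.mark();
                int pos = src.position() - 1;
                boolean found = true;
                while (tgt.hasRemaining()) {
                    if (!src.hasRemaining()) {   // src exhausted before tgt
                        if (earlyExit) {
                            return -1;           // no later start can match
                        }
                        tgt.reset();
                        src.reset();
                        found = false;
                        break;
                    }
                    comparisons++;
                    if (tgt.get() != src.get()) { // mismatch: resume after pos
                        tgt.reset();
                        src.reset();
                        found = false;
                        break;
                    }
                }
                if (found) {
                    return pos;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int n = 10_000;
        byte[] source = new byte[n];
        byte[] target = new byte[n + 1];   // one byte longer than source
        Arrays.fill(source, (byte) 'a');
        Arrays.fill(target, (byte) 'a');

        find(source, target, 0, false);
        System.out.println("without early exit: " + comparisons + " comparisons");
        find(source, target, 0, true);
        System.out.println("with early exit:    " + comparisons + " comparisons");
    }
}
```

With a source of n identical bytes and a target of n + 1 of the same byte, every start position matches until src runs out. Without the early exit the sketch performs roughly n²/2 comparisons; with it, the search stops after the first pass of n comparisons.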
How was this patch tested?
Unit test in org.apache.hadoop.io.TestText#testFind