-
Notifications
You must be signed in to change notification settings - Fork 14.4k
[clang] Improve getFileIDLocal binary search. #146510
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-clang Author: Haojian Wu (hokein) ChangesAvoid reading the Full diff: https://github.com/llvm/llvm-project/pull/146510.diff 1 Files Affected:
diff --git a/clang/lib/Basic/SourceManager.cpp b/clang/lib/Basic/SourceManager.cpp
index a05d6c16caa32..0080d180a3776 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -812,6 +812,8 @@ FileID SourceManager::getFileIDSlow(SourceLocation::UIntTy SLocOffset) const {
/// loaded one.
FileID SourceManager::getFileIDLocal(SourceLocation::UIntTy SLocOffset) const {
assert(SLocOffset < NextLocalOffset && "Bad function choice");
+ assert(SLocOffset >= LocalSLocEntryTable[0].getOffset() &&
+ "Invalid SLocOffset");
// After the first and second level caches, I see two common sorts of
// behavior: 1) a lot of searched FileID's are "near" the cached file
@@ -855,35 +857,24 @@ FileID SourceManager::getFileIDLocal(SourceLocation::UIntTy SLocOffset) const {
break;
}
- NumProbes = 0;
- while (true) {
- unsigned MiddleIndex = (GreaterIndex-LessIndex)/2+LessIndex;
- SourceLocation::UIntTy MidOffset =
- getLocalSLocEntry(MiddleIndex).getOffset();
-
- ++NumProbes;
-
- // If the offset of the midpoint is too large, chop the high side of the
- // range to the midpoint.
- if (MidOffset > SLocOffset) {
- GreaterIndex = MiddleIndex;
- continue;
- }
+ while (LessIndex < GreaterIndex) {
+ ++NumBinaryProbes;
- // If the middle index contains the value, succeed and return.
- if (MiddleIndex + 1 == LocalSLocEntryTable.size() ||
- SLocOffset < getLocalSLocEntry(MiddleIndex + 1).getOffset()) {
- FileID Res = FileID::get(MiddleIndex);
+ unsigned MiddleIndex = LessIndex + (GreaterIndex - LessIndex) / 2;
- // Remember it. We have good locality across FileID lookups.
- LastFileIDLookup = Res;
- NumBinaryProbes += NumProbes;
- return Res;
- }
+ SourceLocation::UIntTy MidOffset =
+ LocalSLocEntryTable[MiddleIndex].getOffset();
- // Otherwise, move the low-side up to the middle index.
- LessIndex = MiddleIndex;
+ if (MidOffset <= SLocOffset)
+ LessIndex = MiddleIndex + 1;
+ else
+ GreaterIndex = MiddleIndex;
}
+
+ // At this point, LessIndex is the index of the *first element greater than*
+ // SLocOffset. The element we are actually looking for is the one immediately
+ // before it.
+ return LastFileIDLookup = FileID::get(LessIndex - 1);
}
/// Return the FileID for a SourceLocation with a high offset.
|
|
||
// Otherwise, move the low-side up to the middle index. | ||
LessIndex = MiddleIndex; | ||
if (MidOffset <= SLocOffset) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why equal? Shouldn't "we found this!" mean we exit the loop? And does the '+1' here cause us to miss it if the offset is exactly 'midoffset'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an upper_bound binary search, it finds the first element which is greater than SLocOffset
. We will -1
for the final result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, hrmph.... I guess I see here. I guess that the 'we just managed to be searching for the very beginning of this file' is rare enough to not be worth a branch.
Though, is this safe (the +1/-1 dance)with 'GreaterIndex'? Despite its name, GreaterIndex is initialized to the 'size' (so not really an index... sigh). Though I guess the linear search ensures we've already checked the end....
I think I've convinced myself this is right enough. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, initializing GreaterIndex to size can be a bit confusing, but that initial value is never actually used.
GreaterIndex always points to an entry whose offset is strictly greater than SLocOffse
t. The search space is defined as [LessIndex, GreaterIndex)
, so we know by design that GreaterIndex is never a valid result.
|
||
// Otherwise, move the low-side up to the middle index. | ||
LessIndex = MiddleIndex; | ||
if (MidOffset <= SLocOffset) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, hrmph.... I guess I see here. I guess that the 'we just managed to be searching for the very beginning of this file' is rare enough to not be worth a branch.
Though, is this safe (the +1/-1 dance)with 'GreaterIndex'? Despite its name, GreaterIndex is initialized to the 'size' (so not really an index... sigh). Though I guess the linear search ensures we've already checked the end....
I think I've convinced myself this is right enough. Thanks.
Avoid reading the
LocalSLocEntryTable
twice per loop iteration. NFC.https://llvm-compile-time-tracker.com/compare.php?from=0b6ddb02efdcbdac9426e8d857499ea0580303cd&to=1aa335ccfb07ba96177b89b1933aa6b980fa14f6&stat=instructions:u