Skip to content

[Support] Recognise the '+' char for positive integers #135856

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

da-viper
Copy link
Contributor

Fixes #45326

Fixes llvm#45326

Signed-off-by: Ebuka Ezike <yerimyah1@gmail.com>
@llvmbot
Copy link
Member

llvmbot commented Apr 15, 2025

@llvm/pr-subscribers-llvm-support

Author: Ebuka Ezike (da-viper)

Changes

Fixes #45326


Full diff: https://github.com/llvm/llvm-project/pull/135856.diff

2 Files Affected:

  • (modified) llvm/lib/Support/StringRef.cpp (+3)
  • (modified) llvm/unittests/ADT/StringRefTest.cpp (+15-13)
diff --git a/llvm/lib/Support/StringRef.cpp b/llvm/lib/Support/StringRef.cpp
index 4f5fcb4857e80..bdf7a9aa5c7e0 100644
--- a/llvm/lib/Support/StringRef.cpp
+++ b/llvm/lib/Support/StringRef.cpp
@@ -409,6 +409,9 @@ static unsigned GetAutoSenseRadix(StringRef &Str) {
 
 bool llvm::consumeUnsignedInteger(StringRef &Str, unsigned Radix,
                                   unsigned long long &Result) {
+  // Consume the + value
+  Str.consume_front("+");
+
   // Autosense radix if not specified.
   if (Radix == 0)
     Radix = GetAutoSenseRadix(Str);
diff --git a/llvm/unittests/ADT/StringRefTest.cpp b/llvm/unittests/ADT/StringRefTest.cpp
index ec9cdc197597d..55d222a915ab5 100644
--- a/llvm/unittests/ADT/StringRefTest.cpp
+++ b/llvm/unittests/ADT/StringRefTest.cpp
@@ -622,19 +622,21 @@ TEST(StringRefTest, Hashing) {
 struct UnsignedPair {
   const char *Str;
   uint64_t Expected;
-} Unsigned[] =
-  { {"0", 0}
-  , {"255", 255}
-  , {"256", 256}
-  , {"65535", 65535}
-  , {"65536", 65536}
-  , {"4294967295", 4294967295ULL}
-  , {"4294967296", 4294967296ULL}
-  , {"18446744073709551615", 18446744073709551615ULL}
-  , {"042", 34}
-  , {"0x42", 66}
-  , {"0b101010", 42}
-  };
+} Unsigned[] = {{"0", 0},
+                {"255", 255},
+                {"256", 256},
+                {"65535", 65535},
+                {"65536", 65536},
+                {"4294967295", 4294967295ULL},
+                {"4294967296", 4294967296ULL},
+                {"18446744073709551615", 18446744073709551615ULL},
+                {"042", 34},
+                {"0x42", 66},
+                {"0b101010", 42},
+                {"+42", 42},
+                {"+042", 34},
+                {"+0x42", 66},
+                {"+0b101010", 42}};
 
 struct SignedPair {
   const char *Str;

@llvmbot
Copy link
Member

llvmbot commented Apr 15, 2025

@llvm/pr-subscribers-llvm-adt

Author: Ebuka Ezike (da-viper)

Changes

Fixes #45326


Full diff: https://github.com/llvm/llvm-project/pull/135856.diff

2 Files Affected:

  • (modified) llvm/lib/Support/StringRef.cpp (+3)
  • (modified) llvm/unittests/ADT/StringRefTest.cpp (+15-13)
diff --git a/llvm/lib/Support/StringRef.cpp b/llvm/lib/Support/StringRef.cpp
index 4f5fcb4857e80..bdf7a9aa5c7e0 100644
--- a/llvm/lib/Support/StringRef.cpp
+++ b/llvm/lib/Support/StringRef.cpp
@@ -409,6 +409,9 @@ static unsigned GetAutoSenseRadix(StringRef &Str) {
 
 bool llvm::consumeUnsignedInteger(StringRef &Str, unsigned Radix,
                                   unsigned long long &Result) {
+  // Consume the + value
+  Str.consume_front("+");
+
   // Autosense radix if not specified.
   if (Radix == 0)
     Radix = GetAutoSenseRadix(Str);
diff --git a/llvm/unittests/ADT/StringRefTest.cpp b/llvm/unittests/ADT/StringRefTest.cpp
index ec9cdc197597d..55d222a915ab5 100644
--- a/llvm/unittests/ADT/StringRefTest.cpp
+++ b/llvm/unittests/ADT/StringRefTest.cpp
@@ -622,19 +622,21 @@ TEST(StringRefTest, Hashing) {
 struct UnsignedPair {
   const char *Str;
   uint64_t Expected;
-} Unsigned[] =
-  { {"0", 0}
-  , {"255", 255}
-  , {"256", 256}
-  , {"65535", 65535}
-  , {"65536", 65536}
-  , {"4294967295", 4294967295ULL}
-  , {"4294967296", 4294967296ULL}
-  , {"18446744073709551615", 18446744073709551615ULL}
-  , {"042", 34}
-  , {"0x42", 66}
-  , {"0b101010", 42}
-  };
+} Unsigned[] = {{"0", 0},
+                {"255", 255},
+                {"256", 256},
+                {"65535", 65535},
+                {"65536", 65536},
+                {"4294967295", 4294967295ULL},
+                {"4294967296", 4294967296ULL},
+                {"18446744073709551615", 18446744073709551615ULL},
+                {"042", 34},
+                {"0x42", 66},
+                {"0b101010", 42},
+                {"+42", 42},
+                {"+042", 34},
+                {"+0x42", 66},
+                {"+0b101010", 42}};
 
 struct SignedPair {
   const char *Str;

@dwblaikie
Copy link
Collaborator

could you check a few callers of this function to see if they'd generally be on-board with leading + support? (like, does this make any use cases more permissive than they intend to be/supporting things they don't intend to support)

@da-viper
Copy link
Contributor Author

It will only be more permissive if the + at the start of a stringRef signifies something else.

I did check the callers of the StringRef::getAsInteger function, this is not the case.

The only new change was in the llvm-symbolizer to handle when the address starts with a + to match addr2line

@dwblaikie
Copy link
Collaborator

It will only be more permissive if the + at the start of a stringRef signifies something else.

Even if it doesn't mean anything else - if the consumer isn't intending to accept leading + and this change causes it to accept leading + that might be a problem (tangential story: Recently clang accidentally started accepting trailing , in function calls (eg: f(a, b,)) - google's internal codebase rapidly grew uses of this now allowed syntax (doubt anyone did it especially intentionally - but enough programmers making enough mistakes without a compiler telling them not to, some will sneak in) that had to be cleaned up so the codebase could compile once the bug was fixed)

I did check the callers of the StringRef::getAsInteger function, this is not the case.

The only new change was in the llvm-symbolizer to handle when the address starts with a + to match addr2line

That change seems fine - but yeah, still unsure about making everyone else accept leading + when they might not intend to.

Wouldn't mind a second or third opinion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

StringRef::getAsInteger doesn't support "+1"
3 participants