[libc] Change ctype to be encoding independent #110574
The previous implementation of the ctype functions assumed ASCII. This patch switches to a switch/case implementation that looks odd but is actually easier for the compiler to understand and optimize.
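To illustrate why the old range arithmetic depends on ASCII, here is a minimal sketch (not code from this patch). It models the lowercase letters of EBCDIC, which sit in three separate runs (0x81-0x89, 0x91-0x99, 0xA2-0xA9 in common EBCDIC code pages) rather than one contiguous block. Both function names are hypothetical.

```cpp
// Hypothetical helpers for illustration only; not part of LLVM libc.

// The old `(ch - 'a') < 26` check as it would evaluate on an EBCDIC target,
// written with the explicit code point of 'a' (0x81).
constexpr bool old_style_islower(unsigned ch) {
  return (ch - 0x81u) < 26u;
}

// What an EBCDIC-aware check has to do: accept only the three letter runs.
constexpr bool ebcdic_islower(unsigned ch) {
  return (ch >= 0x81u && ch <= 0x89u) || // a..i
         (ch >= 0x91u && ch <= 0x99u) || // j..r
         (ch >= 0xA2u && ch <= 0xA9u);   // s..z
}

// 0x8A sits in the gap after 'i' and is not one of the 26 basic letters,
// yet the single-range check accepts it.
static_assert(old_style_islower(0x8Au), "gap code point is accepted");
static_assert(!ebcdic_islower(0x8Au), "run-aware check rejects it");

// 's' is 0xA2, which is more than 26 code points past 'a'.
static_assert(!old_style_islower(0xA2u), "'s' is wrongly rejected");
static_assert(ebcdic_islower(0xA2u), "run-aware check accepts 's'");
```

A switch over character literals avoids hard-coding either layout, because the compiler takes each literal's value from the target's character set.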
ebc7830 to 795ba04
@llvm/pr-subscribers-libc
Author: Michael Jones (michaelrj-google)
Changes
The previous implementation of the ctype functions assumed ASCII.
Patch is 30.69 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110574.diff
14 Files Affected:
diff --git a/libc/src/__support/ctype_utils.h b/libc/src/__support/ctype_utils.h
index 91f6ce8cabd8d0..8521857ce765de 100644
--- a/libc/src/__support/ctype_utils.h
+++ b/libc/src/__support/ctype_utils.h
@@ -15,44 +15,556 @@
namespace LIBC_NAMESPACE_DECL {
namespace internal {
-// ------------------------------------------------------
-// Rationale: Since these classification functions are
-// called in other functions, we will avoid the overhead
-// of a function call by inlining them.
-// ------------------------------------------------------
+// -----------------------------------------------------------------------------
+// ****************** WARNING ******************
+// ****************** DO NOT TRY TO OPTIMIZE THESE FUNCTIONS! ******************
+// -----------------------------------------------------------------------------
+// This switch/case form is easier for the compiler to understand, and is
+// optimized into a form that is almost always the same as or better than
+// versions written by hand (see https://godbolt.org/z/qvrebqvvr). Also this
+// form makes these functions encoding independent. If you want to rewrite these
+// functions, make sure you have benchmarks to show your new solution is faster,
+// as well as a way to support non-ASCII character encodings.
-LIBC_INLINE static constexpr bool isalpha(unsigned ch) {
- return (ch | 32) - 'a' < 26;
+LIBC_INLINE static constexpr bool islower(int ch) {
+ switch (ch) {
+ case 'a':
+ case 'b':
+ case 'c':
+ case 'd':
+ case 'e':
+ case 'f':
+ case 'g':
+ case 'h':
+ case 'i':
+ case 'j':
+ case 'k':
+ case 'l':
+ case 'm':
+ case 'n':
+ case 'o':
+ case 'p':
+ case 'q':
+ case 'r':
+ case 's':
+ case 't':
+ case 'u':
+ case 'v':
+ case 'w':
+ case 'x':
+ case 'y':
+ case 'z':
+ return true;
+ default:
+ return false;
+ }
}
-LIBC_INLINE static constexpr bool isdigit(unsigned ch) {
- return (ch - '0') < 10;
+LIBC_INLINE static constexpr bool isupper(int ch) {
+ switch (ch) {
+ case 'A':
+ case 'B':
+ case 'C':
+ case 'D':
+ case 'E':
+ case 'F':
+ case 'G':
+ case 'H':
+ case 'I':
+ case 'J':
+ case 'K':
+ case 'L':
+ case 'M':
+ case 'N':
+ case 'O':
+ case 'P':
+ case 'Q':
+ case 'R':
+ case 'S':
+ case 'T':
+ case 'U':
+ case 'V':
+ case 'W':
+ case 'X':
+ case 'Y':
+ case 'Z':
+ return true;
+ default:
+ return false;
+ }
}
-LIBC_INLINE static constexpr bool isalnum(unsigned ch) {
- return isalpha(ch) || isdigit(ch);
+LIBC_INLINE static constexpr bool isdigit(int ch) {
+ switch (ch) {
+ case '0':
+ case '1':
+ case '2':
+ case '3':
+ case '4':
+ case '5':
+ case '6':
+ case '7':
+ case '8':
+ case '9':
+ return true;
+ default:
+ return false;
+ }
}
-LIBC_INLINE static constexpr bool isgraph(unsigned ch) {
- return 0x20 < ch && ch < 0x7f;
+LIBC_INLINE static constexpr int tolower(int ch) {
+ switch (ch) {
+ case 'A':
+ return 'a';
+ case 'B':
+ return 'b';
+ case 'C':
+ return 'c';
+ case 'D':
+ return 'd';
+ case 'E':
+ return 'e';
+ case 'F':
+ return 'f';
+ case 'G':
+ return 'g';
+ case 'H':
+ return 'h';
+ case 'I':
+ return 'i';
+ case 'J':
+ return 'j';
+ case 'K':
+ return 'k';
+ case 'L':
+ return 'l';
+ case 'M':
+ return 'm';
+ case 'N':
+ return 'n';
+ case 'O':
+ return 'o';
+ case 'P':
+ return 'p';
+ case 'Q':
+ return 'q';
+ case 'R':
+ return 'r';
+ case 'S':
+ return 's';
+ case 'T':
+ return 't';
+ case 'U':
+ return 'u';
+ case 'V':
+ return 'v';
+ case 'W':
+ return 'w';
+ case 'X':
+ return 'x';
+ case 'Y':
+ return 'y';
+ case 'Z':
+ return 'z';
+ default:
+ return ch;
+ }
}
-LIBC_INLINE static constexpr bool islower(unsigned ch) {
- return (ch - 'a') < 26;
+LIBC_INLINE static constexpr int toupper(int ch) {
+ switch (ch) {
+ case 'a':
+ return 'A';
+ case 'b':
+ return 'B';
+ case 'c':
+ return 'C';
+ case 'd':
+ return 'D';
+ case 'e':
+ return 'E';
+ case 'f':
+ return 'F';
+ case 'g':
+ return 'G';
+ case 'h':
+ return 'H';
+ case 'i':
+ return 'I';
+ case 'j':
+ return 'J';
+ case 'k':
+ return 'K';
+ case 'l':
+ return 'L';
+ case 'm':
+ return 'M';
+ case 'n':
+ return 'N';
+ case 'o':
+ return 'O';
+ case 'p':
+ return 'P';
+ case 'q':
+ return 'Q';
+ case 'r':
+ return 'R';
+ case 's':
+ return 'S';
+ case 't':
+ return 'T';
+ case 'u':
+ return 'U';
+ case 'v':
+ return 'V';
+ case 'w':
+ return 'W';
+ case 'x':
+ return 'X';
+ case 'y':
+ return 'Y';
+ case 'z':
+ return 'Z';
+ default:
+ return ch;
+ }
}
-LIBC_INLINE static constexpr bool isupper(unsigned ch) {
- return (ch - 'A') < 26;
+LIBC_INLINE static constexpr bool isalpha(int ch) {
+ switch (ch) {
+ case 'a':
+ case 'b':
+ case 'c':
+ case 'd':
+ case 'e':
+ case 'f':
+ case 'g':
+ case 'h':
+ case 'i':
+ case 'j':
+ case 'k':
+ case 'l':
+ case 'm':
+ case 'n':
+ case 'o':
+ case 'p':
+ case 'q':
+ case 'r':
+ case 's':
+ case 't':
+ case 'u':
+ case 'v':
+ case 'w':
+ case 'x':
+ case 'y':
+ case 'z':
+ case 'A':
+ case 'B':
+ case 'C':
+ case 'D':
+ case 'E':
+ case 'F':
+ case 'G':
+ case 'H':
+ case 'I':
+ case 'J':
+ case 'K':
+ case 'L':
+ case 'M':
+ case 'N':
+ case 'O':
+ case 'P':
+ case 'Q':
+ case 'R':
+ case 'S':
+ case 'T':
+ case 'U':
+ case 'V':
+ case 'W':
+ case 'X':
+ case 'Y':
+ case 'Z':
+ return true;
+ default:
+ return false;
+ }
}
-LIBC_INLINE static constexpr bool isspace(unsigned ch) {
- return ch == ' ' || (ch - '\t') < 5;
+LIBC_INLINE static constexpr bool isalnum(int ch) {
+ switch (ch) {
+ case 'a':
+ case 'b':
+ case 'c':
+ case 'd':
+ case 'e':
+ case 'f':
+ case 'g':
+ case 'h':
+ case 'i':
+ case 'j':
+ case 'k':
+ case 'l':
+ case 'm':
+ case 'n':
+ case 'o':
+ case 'p':
+ case 'q':
+ case 'r':
+ case 's':
+ case 't':
+ case 'u':
+ case 'v':
+ case 'w':
+ case 'x':
+ case 'y':
+ case 'z':
+ case 'A':
+ case 'B':
+ case 'C':
+ case 'D':
+ case 'E':
+ case 'F':
+ case 'G':
+ case 'H':
+ case 'I':
+ case 'J':
+ case 'K':
+ case 'L':
+ case 'M':
+ case 'N':
+ case 'O':
+ case 'P':
+ case 'Q':
+ case 'R':
+ case 'S':
+ case 'T':
+ case 'U':
+ case 'V':
+ case 'W':
+ case 'X':
+ case 'Y':
+ case 'Z':
+ case '0':
+ case '1':
+ case '2':
+ case '3':
+ case '4':
+ case '5':
+ case '6':
+ case '7':
+ case '8':
+ case '9':
+ return true;
+ default:
+ return false;
+ }
}
-LIBC_INLINE static constexpr int tolower(int ch) {
- if (isupper(ch))
- return ch + ('a' - 'A');
- return ch;
+LIBC_INLINE static constexpr int b36_char_to_int(int ch) {
+ switch (ch) {
+ case '0':
+ return 0;
+ case '1':
+ return 1;
+ case '2':
+ return 2;
+ case '3':
+ return 3;
+ case '4':
+ return 4;
+ case '5':
+ return 5;
+ case '6':
+ return 6;
+ case '7':
+ return 7;
+ case '8':
+ return 8;
+ case '9':
+ return 9;
+ case 'a':
+ case 'A':
+ return 10;
+ case 'b':
+ case 'B':
+ return 11;
+ case 'c':
+ case 'C':
+ return 12;
+ case 'd':
+ case 'D':
+ return 13;
+ case 'e':
+ case 'E':
+ return 14;
+ case 'f':
+ case 'F':
+ return 15;
+ case 'g':
+ case 'G':
+ return 16;
+ case 'h':
+ case 'H':
+ return 17;
+ case 'i':
+ case 'I':
+ return 18;
+ case 'j':
+ case 'J':
+ return 19;
+ case 'k':
+ case 'K':
+ return 20;
+ case 'l':
+ case 'L':
+ return 21;
+ case 'm':
+ case 'M':
+ return 22;
+ case 'n':
+ case 'N':
+ return 23;
+ case 'o':
+ case 'O':
+ return 24;
+ case 'p':
+ case 'P':
+ return 25;
+ case 'q':
+ case 'Q':
+ return 26;
+ case 'r':
+ case 'R':
+ return 27;
+ case 's':
+ case 'S':
+ return 28;
+ case 't':
+ case 'T':
+ return 29;
+ case 'u':
+ case 'U':
+ return 30;
+ case 'v':
+ case 'V':
+ return 31;
+ case 'w':
+ case 'W':
+ return 32;
+ case 'x':
+ case 'X':
+ return 33;
+ case 'y':
+ case 'Y':
+ return 34;
+ case 'z':
+ case 'Z':
+ return 35;
+ default:
+ return 0;
+ }
+}
+
+LIBC_INLINE static constexpr int int_to_b36_char(int num) {
+ // Can't actually use LIBC_ASSERT here because it depends on integer_to_string
+ // which depends on this.
+
+ // LIBC_ASSERT(num < 36);
+ switch (num) {
+ case 0:
+ return '0';
+ case 1:
+ return '1';
+ case 2:
+ return '2';
+ case 3:
+ return '3';
+ case 4:
+ return '4';
+ case 5:
+ return '5';
+ case 6:
+ return '6';
+ case 7:
+ return '7';
+ case 8:
+ return '8';
+ case 9:
+ return '9';
+ case 10:
+ return 'a';
+ case 11:
+ return 'b';
+ case 12:
+ return 'c';
+ case 13:
+ return 'd';
+ case 14:
+ return 'e';
+ case 15:
+ return 'f';
+ case 16:
+ return 'g';
+ case 17:
+ return 'h';
+ case 18:
+ return 'i';
+ case 19:
+ return 'j';
+ case 20:
+ return 'k';
+ case 21:
+ return 'l';
+ case 22:
+ return 'm';
+ case 23:
+ return 'n';
+ case 24:
+ return 'o';
+ case 25:
+ return 'p';
+ case 26:
+ return 'q';
+ case 27:
+ return 'r';
+ case 28:
+ return 's';
+ case 29:
+ return 't';
+ case 30:
+ return 'u';
+ case 31:
+ return 'v';
+ case 32:
+ return 'w';
+ case 33:
+ return 'x';
+ case 34:
+ return 'y';
+ case 35:
+ return 'z';
+ default:
+ return '!';
+ }
+}
+
+LIBC_INLINE static constexpr bool isspace(int ch) {
+ switch (ch) {
+ case ' ':
+ case '\t':
+ case '\n':
+ case '\v':
+ case '\f':
+ case '\r':
+ return true;
+ default:
+ return false;
+ }
+}
+
+// not yet encoding independent.
+LIBC_INLINE static constexpr bool isgraph(int ch) {
+ return 0x20 < ch && ch < 0x7f;
}
} // namespace internal
diff --git a/libc/src/__support/integer_literals.h b/libc/src/__support/integer_literals.h
index 4c5c4c41666811..0298ec7d088d69 100644
--- a/libc/src/__support/integer_literals.h
+++ b/libc/src/__support/integer_literals.h
@@ -13,12 +13,13 @@
#ifndef LLVM_LIBC_SRC___SUPPORT_INTEGER_LITERALS_H
#define LLVM_LIBC_SRC___SUPPORT_INTEGER_LITERALS_H
-#include "src/__support/CPP/limits.h" // CHAR_BIT
+#include "src/__support/CPP/limits.h" // CHAR_BIT
+#include "src/__support/ctype_utils.h"
#include "src/__support/macros/attributes.h" // LIBC_INLINE
#include "src/__support/macros/config.h"
-#include "src/__support/uint128.h" // UInt128
-#include <stddef.h> // size_t
-#include <stdint.h> // uintxx_t
+#include "src/__support/uint128.h" // UInt128
+#include <stddef.h> // size_t
+#include <stdint.h> // uintxx_t
namespace LIBC_NAMESPACE_DECL {
@@ -75,26 +76,13 @@ template <typename T, int base> struct DigitBuffer {
push(*str);
}
- // Returns the digit for a particular character.
- // Returns INVALID_DIGIT if the character is invalid.
- LIBC_INLINE static constexpr uint8_t get_digit_value(const char c) {
- const auto to_lower = [](char c) { return c | 32; };
- const auto is_digit = [](char c) { return c >= '0' && c <= '9'; };
- const auto is_alpha = [](char c) {
- return ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z');
- };
- if (is_digit(c))
- return static_cast<uint8_t>(c - '0');
- if (base > 10 && is_alpha(c))
- return static_cast<uint8_t>(to_lower(c) - 'a' + 10);
- return INVALID_DIGIT;
- }
-
// Adds a single character to this buffer.
LIBC_INLINE constexpr void push(char c) {
if (c == '\'')
return; // ' is valid but not taken into account.
- const uint8_t value = get_digit_value(c);
+ const int b36_val = internal::b36_char_to_int(c);
+ const uint8_t value = static_cast<uint8_t>(
+ b36_val < base && (b36_val != 0 || c == '0') ? b36_val : INVALID_DIGIT);
if (value == INVALID_DIGIT || size >= MAX_DIGITS) {
// During constant evaluation `__builtin_unreachable` will halt the
// compiler as it is not executable. This is preferable over `assert` that
diff --git a/libc/src/ctype/isxdigit.cpp b/libc/src/ctype/isxdigit.cpp
index 6b730c354db083..81f645c6f49fc9 100644
--- a/libc/src/ctype/isxdigit.cpp
+++ b/libc/src/ctype/isxdigit.cpp
@@ -16,7 +16,8 @@ namespace LIBC_NAMESPACE_DECL {
LLVM_LIBC_FUNCTION(int, isxdigit, (int c)) {
const unsigned ch = static_cast<unsigned>(c);
- return static_cast<int>(internal::isdigit(ch) || (ch | 32) - 'a' < 6);
+ return static_cast<int>(internal::isalnum(ch) &&
+ internal::b36_char_to_int(ch) < 16);
}
} // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/isxdigit_l.cpp b/libc/src/ctype/isxdigit_l.cpp
index 8a5c7d4d28ab1c..eddfd20a2da3b4 100644
--- a/libc/src/ctype/isxdigit_l.cpp
+++ b/libc/src/ctype/isxdigit_l.cpp
@@ -16,7 +16,8 @@ namespace LIBC_NAMESPACE_DECL {
LLVM_LIBC_FUNCTION(int, isxdigit_l, (int c, locale_t)) {
const unsigned ch = static_cast<unsigned>(c);
- return static_cast<int>(internal::isdigit(ch) || (ch | 32) - 'a' < 6);
+ return static_cast<int>(internal::isalnum(ch) &&
+ internal::b36_char_to_int(ch) < 16);
}
} // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/stdio/printf_core/fixed_converter.h b/libc/src/stdio/printf_core/fixed_converter.h
index c8812d77b62e34..ba0a62d9fcb87f 100644
--- a/libc/src/stdio/printf_core/fixed_converter.h
+++ b/libc/src/stdio/printf_core/fixed_converter.h
@@ -11,6 +11,7 @@
#include "include/llvm-libc-macros/stdfix-macros.h"
#include "src/__support/CPP/string_view.h"
+#include "src/__support/ctype_utils.h"
#include "src/__support/fixed_point/fx_bits.h"
#include "src/__support/fixed_point/fx_rep.h"
#include "src/__support/integer_to_string.h"
@@ -68,10 +69,6 @@ LIBC_INLINE int convert_fixed(Writer *writer, const FormatSection &to_conv) {
using LARep = fixed_point::FXRep<unsigned long accum>;
using StorageType = LARep::StorageType;
- // All of the letters will be defined relative to variable a, which will be
- // the appropriate case based on the name of the conversion. This converts any
- // conversion name into the letter 'a' with the appropriate case.
- const char a = (to_conv.conv_name & 32) | 'A';
FormatFlags flags = to_conv.flags;
bool is_negative;
@@ -179,9 +176,9 @@ LIBC_INLINE int convert_fixed(Writer *writer, const FormatSection &to_conv) {
// unspecified.
RoundDirection round;
char first_digit_after = fraction_digits[precision];
- if (first_digit_after > '5') {
+ if (internal::b36_char_to_int(first_digit_after) > 5) {
round = RoundDirection::Up;
- } else if (first_digit_after < '5') {
+ } else if (internal::b36_char_to_int(first_digit_after) < 5) {
round = RoundDirection::Down;
} else {
// first_digit_after == '5'
@@ -204,7 +201,8 @@ LIBC_INLINE int convert_fixed(Writer *writer, const FormatSection &to_conv) {
keep_rounding = false;
char cur_digit = fraction_digits[digit_to_round];
// if the digit should not be rounded up
- if (round == RoundDirection::Even && ((cur_digit - '0') % 2) == 0) {
+ if (round == RoundDirection::Even &&
+ (internal::b36_char_to_int(cur_digit) % 2) == 0) {
// break out of the loop
break;
}
@@ -246,7 +244,7 @@ LIBC_INLINE int convert_fixed(Writer *writer, const FormatSection &to_conv) {
char sign_char = 0;
// Check if the conv name is uppercase
- if (a == 'A') {
+ if (internal::isupper(to_conv.conv_name)) {
// These flags are only for signed conversions, so this removes them if the
// conversion is unsigned.
flags = FormatFlags(flags &
diff --git a/libc/src/stdio/printf_core/float_dec_converter.h b/libc/src/stdio/printf_core/float_dec_converter.h
index e39ba6ecea8d48..d93457fcafd7f9 100644
--- a/libc/src/stdio/printf_core/float_dec_converter.h
+++ b/libc/src/stdio/printf_core/float_dec_converter.h
@@ -13,6 +13,7 @@
#include "src/__support/FPUtil/FPBits.h"
#include "src/__support/FPUtil/rounding_mode.h"
#include "src/__support/big_int.h" // is_big_int_v
+#include "src/__support/ctype_utils.h"
#include "src/__support/float_to_string.h"
#include "src/__support/integer_to_string.h"
#include "src/__support/libc_assert.h"
@@ -587,8 +588,6 @@ LIBC_INLINE int convert_float_dec_exp_typed(Writer *writer,
int exponent = float_bits.get_explicit_exponent();
StorageType mantissa = float_bits.get_explicit_mantissa();
- const char a = (to_conv.conv_name & 32) | 'A';
-
char sign_char = 0;
if (float_bits.is_neg())
@@ -734,7 +733,8 @@ LIBC_INLINE int convert_float_dec_exp_typed(Writer *writer,
round = get_round_direction(last_digit, truncated, float_bits.sign());
RET_IF_RESULT_NEGATIVE(float_writer.write_last_block(
- digits, maximum, round, final_exponent, a + 'E' - 'A'));
+ digits, maximum, round, final_exponent,
+ internal::islower(to_conv.conv_name) ? 'e' : 'E'));
RET_IF_RESULT_NEGATIVE(float_writer.right_pad());
return WRITE_OK;
diff --git a/libc/src/stdio/printf_core/float_hex_converter.h b/libc/src/stdio/printf_core/float_hex_converter.h
index 0b3ff3dd1cbfdc..5d9c42882a589b 100644
--- a/libc/src/stdio/printf_core/float_hex_converter.h
+++ b/libc/src/stdio/printf_core/float_hex_converter.h
@@ -12,6 +12,7 @@
#include "src/__support/CPP/string_view.h"
#include "src/__support/FPUtil/FPBits.h"
#include "src/__support/FPUtil/rounding_mode.h"
+#include "src/__support/ctype_utils.h"
#include "src/__support/macros/config.h"
#include "src/stdio/printf_core/converter_utils.h"
#include "src/stdio/printf_core/core_structs.h"
@@ -31,7 +32,6 @@ LIBC_INLINE int convert_float_hex_exp(Writer *writer,
// All of the letters will be defined relative to variable a, which will be
// the appropriate case based on the name of the conversion. This converts any
// conversion name into the letter 'a' with the appropriate case.
- const char a = (to_conv.conv_name & 32) | 'A';
bool is_negative;
int exponent;
@@ -138,9 +138,11 @@ LIBC_INLINE int convert_float_hex_exp(Writer *writer,
size_t mant_cur = mant_len;
size_t first_non_zero = 1;
for (; mant_cur > 0; --mant_cur, mantissa >>= 4) {
- char mant_mod_16 = static_cast<char>(mantissa) & 15;
- char new_digit = static_cast<char>(
- (mant_mod_16 > 9) ? (mant_mod_16 - 10 + a) : (mant_mod_16 + '0'));
+ char mant_mod_16 = static_cast<char>(mantissa % 16);
+ char new_digit = static_cast<char>(internal::int_to_b36_char(mant_mod_16));
+ if (internal::isupper(to_conv.conv_name)) {
+ new_digit = static_cast<char>(internal::toupper(new_digit));
+ }
mant_buffer[mant_cur - 1] = new_digit;
if (new_digit != '0' && first_non_zero < mant_cur)
first_non_zero = mant_cur;
@@ -168,7 +170,8 @@ LIBC_INLINE int convert_float_hex_exp(Writer *writer,
size_t exp_cur = EXP_LEN;
for (; exponent > 0; --exp_cur, exponent /= 10) {
- exp_buffer[exp_cur - 1] = static_cast<char>((exponent % 10) + '0');
+ exp_buffer[exp_cur - 1] =
+ static_cast<char>(internal::int_to_b36_char(exponent % 10));
}
if (exp_cur == EXP_LEN) { // if nothing else was written, write a 0.
exp_buffer[EXP_LEN - 1] = '0';
@@ -187,7 +190,7 @@ LIBC_INLINE int convert_float_hex_exp(Writer *writer,
constexpr size_t PREFIX_LEN = 2;
char prefix[PREFIX_LEN];
prefix[0] = '0';
- prefix[1] = a + ('x' - 'a');
+ prefix[1] = internal::islower(to_conv.conv_name) ? 'x' : 'X';
const cpp::string_view prefix_str(prefix, PREFIX_LEN);
// If the precision is greater than the actual result, pad with 0s
@@ -200,7 +203,7 @@ LIBC_INLINE int convert_float_hex_exp(Writer *writer,
constexpr cpp::string_view HEXADECIMAL_POINT(".");
// This is for the letter 'p' before the exponent.
- const char exp_separator = a + ('p' - 'a');
+ const char exp_separator = internal::islower(to_conv.conv_name) ? 'p' : 'P';
constexpr int EXP_SEPARATOR_LEN = 1;
padding = static_cast<int>(to_conv.min_width - (sign_char > 0 ? 1 : 0) -
diff --git a/libc/src/stdio/printf_core/float_inf_nan_converter.h b/libc/src/stdio/printf_core/float_inf_nan_converter.h
index a7da682b835bee..3e41612e21c9fc 100644
--- a/libc/src/stdio/printf_core/float_inf_nan_converter.h
+++ b/libc/src/stdio/printf_...
[truncated]
A bunch of string conversion/testing code needed to be updated. I'm gonna need to clean it up more later but that's gonna be a followup. Also I need to set up the ctype function in bazel. Blarg.
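As a rough illustration of the kind of update the conversion code needed: in the patch, `b36_char_to_int` returns 0 for characters that are not digits, so a caller such as `DigitBuffer::push` has to tell a real '0' apart from an invalid character. The sketch below reproduces that check with a reduced, hypothetical table (hex digits only). `hex_char_to_int`, `digit_value`, and `INVALID` are names invented for this example.

```cpp
// Hypothetical, reduced stand-in for b36_char_to_int (hex digits only).
// Like the function in the patch, it returns 0 for unrecognized characters.
constexpr int hex_char_to_int(int ch) {
  switch (ch) {
  case '0': return 0;
  case '1': return 1;
  case '2': return 2;
  case '3': return 3;
  case '4': return 4;
  case '5': return 5;
  case '6': return 6;
  case '7': return 7;
  case '8': return 8;
  case '9': return 9;
  case 'a': case 'A': return 10;
  case 'b': case 'B': return 11;
  case 'c': case 'C': return 12;
  case 'd': case 'D': return 13;
  case 'e': case 'E': return 14;
  case 'f': case 'F': return 15;
  default: return 0;
  }
}

constexpr int INVALID = -1; // hypothetical sentinel for this sketch

// A value of 0 is only trusted when the character really is '0'; anything
// at or above the base is rejected as well.
constexpr int digit_value(char c, int base) {
  const int val = hex_char_to_int(c);
  return (val < base && (val != 0 || c == '0')) ? val : INVALID;
}

static_assert(digit_value('0', 10) == 0, "a real zero is valid");
static_assert(digit_value('g', 16) == INVALID, "unknown char is not zero");
```

The extra `c == '0'` comparison is the price of using 0 as the fallback return value; a dedicated sentinel in the table itself would be another option, but that is not what the patch does.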
LGTM. The changes in this PR fixed a lot of failing tests on z/OS.
Minor nits, but LGTM
const char result = static_cast<char>(internal::int_to_b36_char(digit));
return static_cast<char>(Fmt::IS_UPPERCASE ? internal::toupper(result)
It looks like there's an extra cast in here, that might not be necessary?
Suggested change:
- const char result = static_cast<char>(internal::int_to_b36_char(digit));
- return static_cast<char>(Fmt::IS_UPPERCASE ? internal::toupper(result)
+ const int result = internal::int_to_b36_char(digit);
+ return static_cast<char>(Fmt::IS_UPPERCASE ? internal::toupper(result)
`int_to_b36_char` returns an int to match the other ctype functions. Without the cast you get a warning for loss of precision.
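A minimal sketch of the pattern under discussion, assuming conversion warnings are enabled (for example `-Wconversion` on GCC or Clang). The helpers `int_to_hex_char`, `to_upper_hex`, and `format_digit` are hypothetical stand-ins, not the functions from the patch. They return `int` in the same way, so storing the result in a `char` needs an explicit narrowing cast.

```cpp
// Hypothetical stand-in for int_to_b36_char: returns int, like the ctype
// functions, and '!' for out-of-range input.
constexpr int int_to_hex_char(int num) {
  constexpr char DIGITS[] = "0123456789abcdef";
  return (num >= 0 && num < 16) ? DIGITS[num] : '!';
}

// Hypothetical stand-in for toupper, restricted to hex letters.
constexpr int to_upper_hex(int ch) {
  switch (ch) {
  case 'a': return 'A';
  case 'b': return 'B';
  case 'c': return 'C';
  case 'd': return 'D';
  case 'e': return 'E';
  case 'f': return 'F';
  default: return ch;
  }
}

template <bool IsUppercase> constexpr char format_digit(int digit) {
  // int -> char: without this cast, conversion warnings may fire.
  const char result = static_cast<char>(int_to_hex_char(digit));
  // The conditional expression has type int (to_upper_hex returns int and
  // `result` is promoted), so the return value is narrowed explicitly too.
  return static_cast<char>(IsUppercase ? to_upper_hex(result) : result);
}

static_assert(format_digit<false>(11) == 'b', "lowercase hex digit");
static_assert(format_digit<true>(11) == 'B', "uppercase hex digit");
```

Declaring `result` as `int`, as the review suggestion does, removes the first cast but still leaves the one on the return value.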