C2x strtol binary constant handling
C2x adds binary integer constants starting with 0b or 0B, and supports
those constants in strtol-family functions when the base passed is 0
or 2.  Implement that strtol support for glibc.
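
As a rough illustration (not part of the patch), the two possible outcomes for an input such as "0b101" can be probed at run time. The helper name strtol_accepts_binary_prefix is made up for this sketch; which branch is taken depends on the C library version and on the feature test macros in effect.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: true if strtol, as seen through the headers
   and feature test macros of this build, accepts a 0b prefix in
   base 0.  */
static bool
strtol_accepts_binary_prefix (void)
{
  char *end;
  long value = strtol ("0b101", &end, 0);

  /* C2x semantics: value is 5 and the whole string is consumed.
     Earlier standards: value is 0 and END points at the 'b'.  */
  return value == 5 && *end == '\0';
}
```

A caller that needs one specific behaviour should not rely on such a probe alone, since the result reflects only how the calling translation unit was compiled.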

As discussed at
<https://sourceware.org/pipermail/libc-alpha/2020-December/120414.html>,
this is incompatible with previous C standard versions, in that such
an input string starting with 0b or 0B was previously required to be
parsed as 0 (with the rest of the string unprocessed).  Thus, as
proposed there, this patch adds 20 new __isoc23_* functions with
appropriate header redirection support.  This patch does *not* do
anything about scanf %i (which will need 12 new functions per long
double variant, so 12, 24 or 36 depending on the glibc configuration),
instead leaving that for a future patch.  The function names would
remain as __isoc23_* even if C2x ends up published in 2024 rather than
2023.

Making this change leads to the question of what should happen to
internal uses of these functions in glibc and its tests.  The header
redirection (which applies for _GNU_SOURCE or any other feature test
macros enabling C2x features) has the effect of redirecting internal
uses but without those uses then ending up at a hidden alias (see the
comment in include/stdio.h about interaction with libc_hidden_proto).
It seems desirable for the default for internal uses to be the same
versions used by normal code using _GNU_SOURCE, so rather than doing
anything to disable that redirection, similar macro definitions to
those in include/stdio.h are added to the include/ headers for the new
functions.

Given that the default for uses in glibc is for the redirections to
apply, the next question is whether the C2x semantics are correct for
all those uses.  Uses with the base fixed to 10, 16 or any value
other than 0 or 2 can be ignored.  I think this leaves the following
internal uses to consider (an important consideration for review of
this patch will be both whether this list is complete and whether my
conclusions on all entries in it are correct):

benchtests/bench-malloc-simple.c
benchtests/bench-string.h
elf/sotruss-lib.c
math/libm-test-support.c
nptl/perf.c
nscd/nscd_conf.c
nss/nss_files/files-parse.c
posix/tst-fnmatch.c
posix/wordexp.c
resolv/inet_addr.c
rt/tst-mqueue7.c
soft-fp/testit.c
stdlib/fmtmsg.c
support/support_test_main.c
support/test-container.c
sysdeps/pthread/tst-mutex10.c

I think all of these places are OK with the new semantics, except for
resolv/inet_addr.c, where the POSIX semantics of inet_addr do not
allow for binary constants; thus, I changed that file (to use
__strtoul_internal, whose semantics are unchanged) and added a test
for this case.  In the case of posix/wordexp.c I think accepting
binary constants is OK since POSIX explicitly allows additional forms
of shell arithmetic expressions, and in stdlib/fmtmsg.c SEV_LEVEL is
not in POSIX so again I think accepting binary constants is OK.
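
The inet_addr behaviour described above can be checked from user code with a small sketch like the following. The helper inet_addr_rejects is hypothetical; only inet_addr and the (in_addr_t) -1 error value come from POSIX.

```c
#include <arpa/inet.h>
#include <stdbool.h>

/* Hypothetical helper: true if inet_addr rejects TEXT, that is,
   returns the (in_addr_t) -1 error value.  */
static bool
inet_addr_rejects (const char *text)
{
  return inet_addr (text) == (in_addr_t) -1;
}
```

With this, inet_addr_rejects ("0b101") is expected to hold, while hexadecimal and octal components such as "0x7f.1" remain accepted.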

Functions such as __strtol_internal, which are only exported for
compatibility with old binaries from when those were used in inline
functions in headers, have unchanged semantics; the __*_l_internal
versions (purely internal to libc and not exported) have a new
argument to specify whether to accept binary constants.
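
The following is a simplified sketch of that arrangement, not the glibc code: one shared routine takes a flag, and thin entry points pass false or true. All demo_* names are invented here, and the routine handles only unsigned decimal and binary input in base 0, without whitespace, sign, locale, grouping or overflow handling.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a ____strtoul_l_internal-style routine.
   BIN_CST says whether a 0b or 0B prefix is accepted.  */
static unsigned long
demo_strtoul_internal (const char *nptr, char **endptr, bool bin_cst)
{
  const char *p = nptr;
  unsigned long base = 10;
  unsigned long value = 0;

  /* Accept the prefix only when the caller opted in and a binary
     digit follows; otherwise "0b..." parses as the lone "0".  */
  if (bin_cst && p[0] == '0' && (p[1] == 'b' || p[1] == 'B')
      && (p[2] == '0' || p[2] == '1'))
    {
      base = 2;
      p += 2;
    }

  while (*p >= '0' && (unsigned long) (*p - '0') < base)
    value = value * base + (unsigned long) (*p++ - '0');

  if (endptr != NULL)
    *endptr = (char *) p;
  return value;
}

/* Old entry point: semantics unchanged.  */
static unsigned long
demo_strtoul_old (const char *nptr, char **endptr)
{
  return demo_strtoul_internal (nptr, endptr, false);
}

/* New entry point, in the role played by __isoc23_strtoul.  */
static unsigned long
demo_strtoul_c2x (const char *nptr, char **endptr)
{
  return demo_strtoul_internal (nptr, endptr, true);
}
```

Keeping the flag on the shared routine means the old and new entry points differ only in one argument, which is the design the paragraph above describes.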

As well as for the standard functions, the header redirection also
applies to the *_l versions (GNU extensions), and to legacy functions
such as strtoq, to avoid confusing inconsistency (the *q functions
redirect to __isoc23_*ll rather than needing their own __isoc23_*
entry points).  For the functions that are only declared with
_GNU_SOURCE, this means the old versions are no longer available for
normal user programs at all.  An internal __GLIBC_USE_C2X_STRTOL macro
is used to control the redirections in the headers, and cases in glibc
that wish to avoid the redirections - the function implementations
themselves and the tests of the old versions of the GNU functions -
then undefine and redefine that macro to allow the old versions to be
accessed.  (There would of course be greater complexity should we wish
to make any of the old versions into compat symbols / avoid them being
defined at all for new glibc ABIs.)
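
A single-file sketch of the macro-controlled redirection might look like the following. Every name in it is hypothetical: DEMO_USE_C2X_PARSE stands in for __GLIBC_USE_C2X_STRTOL, and the #if block stands in for the redirection logic in the headers. In glibc itself the macro is redefined before the header is included, which this sketch imitates by repeating the #if block.

```c
/* Which entry point ran; all names here are hypothetical.  */
enum demo_version { DEMO_OLD, DEMO_C2X };

static enum demo_version
demo_parse_old (void)
{
  return DEMO_OLD;
}

static enum demo_version
demo_parse_c2x (void)
{
  return DEMO_C2X;
}

/* Stand-in for the macro normally set by <features.h>.  */
#define DEMO_USE_C2X_PARSE 1

/* Stand-in for the header: redirect the public name when enabled.  */
#if DEMO_USE_C2X_PARSE
# define demo_parse demo_parse_c2x
#else
# define demo_parse demo_parse_old
#endif

/* Ordinary code picks up the redirection without noticing.  */
static enum demo_version
default_caller (void)
{
  return demo_parse ();
}

/* Code that must keep the old behaviour turns the switch off and
   re-evaluates the redirection, as if reading the header afresh.  */
#undef DEMO_USE_C2X_PARSE
#define DEMO_USE_C2X_PARSE 0
#undef demo_parse
#if DEMO_USE_C2X_PARSE
# define demo_parse demo_parse_c2x
#else
# define demo_parse demo_parse_old
#endif

static enum demo_version
legacy_caller (void)
{
  return demo_parse ();
}
```

Here default_caller reaches the new entry point and legacy_caller the old one, although both are written against the same name.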

strtol_l.c has some similarity to strtol.c in gnulib, but has already
diverged some way (and isn't listed at all at
https://sourceware.org/glibc/wiki/SharedSourceFiles unlike strtoll.c
and strtoul.c); I haven't made any attempts at gnulib compatibility in
the changes to that file.

I note incidentally that inttypes.h and wchar.h are missing the
__nonnull present on declarations of this family of functions in
stdlib.h; I didn't make any changes in that regard for the new
declarations added.
jsm28 committed Feb 16, 2023
1 parent 4738bc2 commit 6492442
Showing 84 changed files with 1,683 additions and 24 deletions.
7 changes: 6 additions & 1 deletion NEWS
@@ -9,7 +9,12 @@ Version 2.38

Major new features:

-[Add new features here]
+* When C2X features are enabled and the base argument is 0 or 2, the
+  following functions support binary integers prefixed by 0b or 0B as
+  input: strtol, strtoll, strtoul, strtoull, strtol_l, strtoll_l,
+  strtoul_l, strtoull_l, strtoimax, strtoumax, strtoq, strtouq, wcstol,
+  wcstoll, wcstoul, wcstoull, wcstol_l, wcstoll_l, wcstoul_l,
+  wcstoull_l, wcstoimax, wcstoumax, wcstoq, wcstouq.

Deprecated and removed features, and other changes affecting compatibility:

12 changes: 12 additions & 0 deletions include/features.h
@@ -151,6 +151,7 @@
#undef __GLIBC_USE_ISOC2X
#undef __GLIBC_USE_DEPRECATED_GETS
#undef __GLIBC_USE_DEPRECATED_SCANF
#undef __GLIBC_USE_C2X_STRTOL

/* Suppress kernel-name space pollution unless user expressedly asks
for it. */
@@ -464,6 +465,17 @@
# define __GLIBC_USE_DEPRECATED_SCANF 0
#endif

/* ISO C2X added support for a 0b or 0B prefix on binary constants as
inputs to strtol-family functions (base 0 or 2). This macro is
used to condition redirection in headers to allow that redirection
to be disabled when building those functions, despite _GNU_SOURCE
being defined. */
#if __GLIBC_USE (ISOC2X)
# define __GLIBC_USE_C2X_STRTOL 1
#else
# define __GLIBC_USE_C2X_STRTOL 0
#endif

/* Get definitions of __STDC_* predefined macros, if the compiler has
not preincluded this header automatically. */
#include <stdc-predef.h>
46 changes: 44 additions & 2 deletions include/stdlib.h
@@ -1,6 +1,7 @@
#ifndef _STDLIB_H

#ifndef _ISOMAC
# include <stdbool.h>
# include <stddef.h>
#endif

@@ -35,6 +36,45 @@ libc_hidden_proto (__strtod_l)
libc_hidden_proto (__strtof_l)
libc_hidden_proto (__strtold_l)

extern __typeof (strtol) __isoc23_strtol __attribute_copy__ (strtol);
extern __typeof (strtoul) __isoc23_strtoul __attribute_copy__ (strtoul);
extern __typeof (strtoll) __isoc23_strtoll __attribute_copy__ (strtoll);
extern __typeof (strtoull) __isoc23_strtoull __attribute_copy__ (strtoull);
extern __typeof (strtol_l) __isoc23_strtol_l __attribute_copy__ (strtol_l);
extern __typeof (strtoul_l) __isoc23_strtoul_l __attribute_copy__ (strtoul_l);
extern __typeof (strtoll_l) __isoc23_strtoll_l __attribute_copy__ (strtoll_l);
extern __typeof (strtoull_l) __isoc23_strtoull_l __attribute_copy__ (strtoull_l);
libc_hidden_proto (__isoc23_strtol)
libc_hidden_proto (__isoc23_strtoul)
libc_hidden_proto (__isoc23_strtoll)
libc_hidden_proto (__isoc23_strtoull)
libc_hidden_proto (__isoc23_strtol_l)
libc_hidden_proto (__isoc23_strtoul_l)
libc_hidden_proto (__isoc23_strtoll_l)
libc_hidden_proto (__isoc23_strtoull_l)

#if __GLIBC_USE (C2X_STRTOL)
/* Redirect internal uses of these functions to the C2X versions; the
redirection in the installed header does not work with
libc_hidden_proto. */
# undef strtol
# define strtol __isoc23_strtol
# undef strtoul
# define strtoul __isoc23_strtoul
# undef strtoll
# define strtoll __isoc23_strtoll
# undef strtoull
# define strtoull __isoc23_strtoull
# undef strtol_l
# define strtol_l __isoc23_strtol_l
# undef strtoul_l
# define strtoul_l __isoc23_strtoul_l
# undef strtoll_l
# define strtoll_l __isoc23_strtoll_l
# undef strtoull_l
# define strtoull_l __isoc23_strtoull_l
#endif

libc_hidden_proto (exit)
libc_hidden_proto (abort)
libc_hidden_proto (getenv)
@@ -202,23 +242,25 @@ extern long double ____strtold_l_internal (const char *__restrict __nptr,
extern long int ____strtol_l_internal (const char *__restrict __nptr,
char **__restrict __endptr,
int __base, int __group,
-locale_t __loc);
+bool __bin_cst, locale_t __loc);
extern unsigned long int ____strtoul_l_internal (const char *
__restrict __nptr,
char **__restrict __endptr,
int __base, int __group,
+bool __bin_cst,
locale_t __loc);
__extension__
extern long long int ____strtoll_l_internal (const char *__restrict __nptr,
char **__restrict __endptr,
int __base, int __group,
-locale_t __loc);
+bool __bin_cst, locale_t __loc);
__extension__
extern unsigned long long int ____strtoull_l_internal (const char *
__restrict __nptr,
char **
__restrict __endptr,
int __base, int __group,
+bool __bin_cst,
locale_t __loc);

libc_hidden_proto (____strtof_l_internal)
50 changes: 45 additions & 5 deletions include/wchar.h
@@ -12,6 +12,7 @@
# ifndef _ISOMAC

#include <bits/floatn.h>
#include <stdbool.h>

extern __typeof (wcscasecmp_l) __wcscasecmp_l;
extern __typeof (wcsncasecmp_l) __wcsncasecmp_l;
@@ -34,6 +35,45 @@ libc_hidden_proto (__wcstof_l)
libc_hidden_proto (__wcstold_l)
libc_hidden_proto (__wcsftime_l)

extern __typeof (wcstol) __isoc23_wcstol __attribute_copy__ (wcstol);
extern __typeof (wcstoul) __isoc23_wcstoul __attribute_copy__ (wcstoul);
extern __typeof (wcstoll) __isoc23_wcstoll __attribute_copy__ (wcstoll);
extern __typeof (wcstoull) __isoc23_wcstoull __attribute_copy__ (wcstoull);
extern __typeof (wcstol_l) __isoc23_wcstol_l __attribute_copy__ (wcstol_l);
extern __typeof (wcstoul_l) __isoc23_wcstoul_l __attribute_copy__ (wcstoul_l);
extern __typeof (wcstoll_l) __isoc23_wcstoll_l __attribute_copy__ (wcstoll_l);
extern __typeof (wcstoull_l) __isoc23_wcstoull_l __attribute_copy__ (wcstoull_l);
libc_hidden_proto (__isoc23_wcstol)
libc_hidden_proto (__isoc23_wcstoul)
libc_hidden_proto (__isoc23_wcstoll)
libc_hidden_proto (__isoc23_wcstoull)
libc_hidden_proto (__isoc23_wcstol_l)
libc_hidden_proto (__isoc23_wcstoul_l)
libc_hidden_proto (__isoc23_wcstoll_l)
libc_hidden_proto (__isoc23_wcstoull_l)

#if __GLIBC_USE (C2X_STRTOL)
/* Redirect internal uses of these functions to the C2X versions; the
redirection in the installed header does not work with
libc_hidden_proto. */
# undef wcstol
# define wcstol __isoc23_wcstol
# undef wcstoul
# define wcstoul __isoc23_wcstoul
# undef wcstoll
# define wcstoll __isoc23_wcstoll
# undef wcstoull
# define wcstoull __isoc23_wcstoull
# undef wcstol_l
# define wcstol_l __isoc23_wcstol_l
# undef wcstoul_l
# define wcstoul_l __isoc23_wcstoul_l
# undef wcstoll_l
# define wcstoll_l __isoc23_wcstoll_l
# undef wcstoull_l
# define wcstoull_l __isoc23_wcstoull_l
#endif


extern double __wcstod_internal (const wchar_t *__restrict __nptr,
wchar_t **__restrict __endptr, int __group)
@@ -63,7 +103,7 @@ extern unsigned long long int __wcstoull_internal (const wchar_t *
int __group) __THROW;
extern unsigned long long int ____wcstoull_l_internal (const wchar_t *,
wchar_t **, int, int,
-locale_t);
+bool, locale_t);
libc_hidden_proto (__wcstof_internal)
libc_hidden_proto (__wcstod_internal)
libc_hidden_proto (__wcstold_internal)
@@ -86,17 +126,17 @@ extern double ____wcstod_l_internal (const wchar_t *, wchar_t **, int,
extern long double ____wcstold_l_internal (const wchar_t *, wchar_t **,
int, locale_t) attribute_hidden;
extern long int ____wcstol_l_internal (const wchar_t *, wchar_t **, int,
-int, locale_t) attribute_hidden;
+int, bool, locale_t) attribute_hidden;
extern unsigned long int ____wcstoul_l_internal (const wchar_t *,
wchar_t **,
-int, int, locale_t)
+int, int, bool, locale_t)
attribute_hidden;
extern long long int ____wcstoll_l_internal (const wchar_t *, wchar_t **,
-int, int, locale_t)
+int, int, bool, locale_t)
attribute_hidden;
extern unsigned long long int ____wcstoull_l_internal (const wchar_t *,
wchar_t **, int, int,
-locale_t)
+bool, locale_t)
attribute_hidden;

#if __HAVE_DISTINCT_FLOAT128
2 changes: 1 addition & 1 deletion inet/inet6_scopeid_pton.c
@@ -49,7 +49,7 @@ __inet6_scopeid_pton (const struct in6_addr *address, const char *scope,
char *end;
unsigned long long number
= ____strtoull_l_internal (scope, &end, /*base */ 10, /* group */ 0,
-_nl_C_locobj_ptr);
+/* bin_cst */ false, _nl_C_locobj_ptr);
if (*end == '\0' && number <= UINT32_MAX)
{
*result = number;
10 changes: 10 additions & 0 deletions locale/Versions
@@ -66,6 +66,16 @@ libc {
wcstoll_l; wcstoul_l; wcstoull_l; wcsxfrm_l; wctype_l;
wctrans_l; nl_langinfo_l;
}
GLIBC_2.38 {
__isoc23_strtol_l;
__isoc23_strtoll_l;
__isoc23_strtoul_l;
__isoc23_strtoull_l;
__isoc23_wcstol_l;
__isoc23_wcstoll_l;
__isoc23_wcstoul_l;
__isoc23_wcstoull_l;
}
GLIBC_PRIVATE {
# global variables
__collate_element_hash; __collate_element_strings;
9 changes: 6 additions & 3 deletions manual/arith.texi
@@ -2656,12 +2656,15 @@ A nonempty sequence of digits in the radix specified by @var{base}.

If @var{base} is zero, decimal radix is assumed unless the series of
digits begins with @samp{0} (specifying octal radix), or @samp{0x} or
-@samp{0X} (specifying hexadecimal radix); in other words, the same
-syntax used for integer constants in C.
+@samp{0X} (specifying hexadecimal radix), or @samp{0b} or @samp{0B}
+(specifying binary radix; only supported when C2X features are
+enabled); in other words, the same syntax used for integer constants in C.

Otherwise @var{base} must have a value between @code{2} and @code{36}.
If @var{base} is @code{16}, the digits may optionally be preceded by
-@samp{0x} or @samp{0X}. If base has no legal value the value returned
+@samp{0x} or @samp{0X}. If @var{base} is @code{2}, and C2X features
+are enabled, the digits may optionally be preceded by
+@samp{0b} or @samp{0B}. If base has no legal value the value returned
is @code{0l} and the global variable @code{errno} is set to @code{EINVAL}.

@item
1 change: 1 addition & 0 deletions resolv/Makefile
@@ -88,6 +88,7 @@ routines += gai_sigqueue
tests += \
tst-bug18665 \
tst-bug18665-tcp \
tst-inet_addr-binary \
tst-ns_name \
tst-ns_name_compress \
tst-ns_name_pton \
2 changes: 1 addition & 1 deletion resolv/inet_addr.c
@@ -130,7 +130,7 @@ inet_aton_end (const char *cp, struct in_addr *addr, const char **endp)
goto ret_0;
{
char *endp;
-unsigned long ul = strtoul (cp, &endp, 0);
+unsigned long ul = __strtoul_internal (cp, &endp, 0, 0);
if (ul == ULONG_MAX && errno == ERANGE)
goto ret_0;
if (ul > 0xfffffffful)
30 changes: 30 additions & 0 deletions resolv/tst-inet_addr-binary.c
@@ -0,0 +1,30 @@
/* Test inet_addr does not accept C2X binary constants.
Copyright (C) 2022-2023 Free Software Foundation, Inc.
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.
The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with the GNU C Library; if not, see
<https://www.gnu.org/licenses/>. */

#include <arpa/inet.h>

#include <support/check.h>

static int
do_test (void)
{
TEST_COMPARE (inet_addr ("0b101"), (in_addr_t) -1);
return 0;
}

#include <support/test-driver.c>
12 changes: 12 additions & 0 deletions stdlib/Makefile
@@ -234,6 +234,10 @@ tests := \
tst-strtod5 \
tst-strtod6 \
tst-strtol \
tst-strtol-binary-c11 \
tst-strtol-binary-c2x \
tst-strtol-binary-gnu11 \
tst-strtol-binary-gnu2x \
tst-strtol-locale \
tst-strtoll \
tst-swapcontext1 \
@@ -394,6 +398,14 @@ CFLAGS-tst-makecontext2.c += $(stack-align-test-flags)

CFLAGS-testmb.c += -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -Wall -Werror

# Some versions of GCC supported for building glibc do not support -std=c2x
# or -std=gnu2x, so the tests for those versions use -std=c11 and -std=gnu11
# and then _ISOC2X_SOURCE is defined in the test as needed.
CFLAGS-tst-strtol-binary-c11.c += -std=c11
CFLAGS-tst-strtol-binary-c2x.c += -std=c11
CFLAGS-tst-strtol-binary-gnu11.c += -std=gnu11
CFLAGS-tst-strtol-binary-gnu2x.c += -std=gnu11


# Run a test on the header files we use.
tests-special += $(objpfx)isomac.out
8 changes: 8 additions & 0 deletions stdlib/Versions
@@ -143,6 +143,14 @@ libc {
}
GLIBC_2.37 {
}
GLIBC_2.38 {
__isoc23_strtol;
__isoc23_strtoll;
__isoc23_strtoul;
__isoc23_strtoull;
__isoc23_strtoimax;
__isoc23_strtoumax;
}
GLIBC_PRIVATE {
# functions which have an additional interface since they are
# are cancelable.
40 changes: 40 additions & 0 deletions stdlib/inttypes.h
@@ -311,6 +311,46 @@ extern uintmax_t wcstoumax (const __gwchar_t *__restrict __nptr,
__gwchar_t ** __restrict __endptr, int __base)
__THROW;

/* Versions of the above functions that handle '0b' and '0B' prefixes
in base 0 or 2. */
#if __GLIBC_USE (C2X_STRTOL)
# ifdef __REDIRECT
extern intmax_t __REDIRECT_NTH (strtoimax, (const char *__restrict __nptr,
char **__restrict __endptr,
int __base), __isoc23_strtoimax);
extern uintmax_t __REDIRECT_NTH (strtoumax, (const char *__restrict __nptr,
char **__restrict __endptr,
int __base), __isoc23_strtoumax);
extern intmax_t __REDIRECT_NTH (wcstoimax,
(const __gwchar_t *__restrict __nptr,
__gwchar_t **__restrict __endptr, int __base),
__isoc23_wcstoimax);
extern uintmax_t __REDIRECT_NTH (wcstoumax,
(const __gwchar_t *__restrict __nptr,
__gwchar_t **__restrict __endptr, int __base),
__isoc23_wcstoumax);
# else
extern intmax_t __isoc23_strtoimax (const char *__restrict __nptr,
char **__restrict __endptr, int __base)
__THROW;
extern uintmax_t __isoc23_strtoumax (const char *__restrict __nptr,
char ** __restrict __endptr, int __base)
__THROW;
extern intmax_t __isoc23_wcstoimax (const __gwchar_t *__restrict __nptr,
__gwchar_t **__restrict __endptr,
int __base)
__THROW;
extern uintmax_t __isoc23_wcstoumax (const __gwchar_t *__restrict __nptr,
__gwchar_t ** __restrict __endptr,
int __base)
__THROW;
# define strtoimax __isoc23_strtoimax
# define strtoumax __isoc23_strtoumax
# define wcstoimax __isoc23_wcstoimax
# define wcstoumax __isoc23_wcstoumax
# endif
#endif

__END_DECLS

#endif /* inttypes.h */