-
-
Notifications
You must be signed in to change notification settings - Fork 31.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-108767: Convert Py_ISLOWER() macro to static inline function #108770
Conversation
@erlend-aasland: A bunch of PEP 670 candies :-) |
This generates |
It reminds me PyUnicode_Kind(obj) which gave me headaches (and then nightmakes!). It ended up by ... adding a comment to explain to me why I didn't convert it :-) // PyUnicode_KIND(): Return one of the PyUnicode_*_KIND values defined above.
//
// gh-89653: Converting this macro to a static inline function would introduce
// new compiler warnings on "kind < PyUnicode_KIND(str)" (compare signed and
// unsigned numbers) where kind type is an int or on
// "unsigned int kind = PyUnicode_KIND(str)" (cast signed to unsigned).
#define PyUnicode_KIND(op) _Py_RVALUE(_PyASCIIObject_CAST(op)->state.kind) |
I confirm that building Python with
Tested: GCC 13.2.1 on Fedora 38. |
/* Argument must be a char or an int in [-128, 127] or [0, 255]. */
#define Py_CHARMASK(c) ((unsigned char)((c) & 0xff)) The must word is scary! |
I do not think this should be done. The new code is larger and much less readable (to me). And it creates new problems to fight! |
Modules/_sre/sre.c
Outdated
@@ -106,7 +106,7 @@ static const char copyright[] = | |||
#define SRE_IS_WORD(ch)\ | |||
((ch) <= 'z' && (Py_ISALNUM(ch) || (ch) == '_')) | |||
|
|||
static unsigned int sre_lower_ascii(unsigned int ch) | |||
static unsigned int sre_lower_ascii(int ch) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This micro bugfix should get its own issue and PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the current code is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll let you and Victor come to agreement on this.
@erlend-aasland: Do you want to convert these macros? Or do you agree with @serhiy-storchaka that the change is bad since it introduces new compiler warnings, especially when the functions are "misused"? These changes give a better defined API for these functions. They have the same API that their original #include <ctype.h>
int isalnum(int c);
int isalpha(int c);
int iscntrl(int c);
int isdigit(int c);
int isgraph(int c);
int islower(int c);
int isprint(int c);
int ispunct(int c);
int isspace(int c);
int isupper(int c);
int isxdigit(int c);
int isascii(int c);
int isblank(int c);
int toupper(int c);
int tolower(int c);
And yes, if you passed an |
23eb55f
to
b6f958b
Compare
I rebased my PR, change |
Please review the updated PR. |
There's still compiler warnings ;) Since these macros are exposed in Python.h, this change may also introduce new compiler warnings in external projects, so I'm not sure it is worth doing. There is no macro pitfalls here, so it's not an imperative change IMO. |
Convert the following macros to static inline functions: * Py_ISLOWER() * Py_ISUPPER() * Py_ISALPHA() * Py_ISDIGIT() * Py_ISXDIGIT() * Py_ISALNUM() * Py_ISSPACE() * Py_TOLOWER() * Py_TOUPPER() * Py_CHARMASK() Changes: * sre_lower_ascii() now casts Py_TOLOWER() argument to "unsigned char" and cast the result to "unsigned int". * bytesobject.c and bytearrayobject.c now pass an "int" argument to Py_CHARMASK(), instead of a "Py_ssize_t" argument.
d54e1ab
to
9feff13
Compare
Oh! I missed that one. I fixed it and rebased my PR to the main branch (since I made other changes in main related to these functions, see commit 03c4080). |
I don't mind converting these functions (I prefer static inline functions to macros in most cases anyway). I also agree that it is better to have a well-defined API (typed inputs and outputs instead of whatever goes in probably comes out). However, this may introduce compiler warnings in third-party code. I'm +0.5 :) |
First, I have a problem with reading a sequence of several almost identical fragments of code. So the new code is unreadable to me. But this is my personal and minor issue. I do not expect to read this code often. It should not stop you. Second, I do not see anything wrong with the current working code and do not understand why it should be changed, with the risk of introducing new bugs or fight new battles with compiler warnings. Macros are not an evil that should be avoided at all costs. In some cases, it is a suitable tool, and sometimes it is the only working tool. I do not actively object to these changes. I only explain my point of view. You may change your mind if you look at it from a different perspective if you too had doubts. |
My concern is not avoid "Macro Pitfalls" but the API. I have an different point of view: the current API is not well defined. It's unclear what is the argument type: is it signed? signed? is int accepted, or also long? Same question for the result. Is it signed or unsigned? Is it a char or an int? You see the introduction of new compiler warnings as a regression. I see it as an intentional change, since the API was weakly defined (weakly "typed") :-) If argument types and return type are well defined, there are less "undefined behaviors" depending on the types chosen by the C compiler depending how and where the macros/functions are used. For me, it's like |
Also, the new code may be less efficient, due to forcing conversion to But I don't want to argue about it, so far I'm just -0. |
Thanks for the constructive discussion. I'm not going to push this change further. Ok, it's not worth it ;-) |
Convert the following macros to static inline functions: