bpo-46541: Replace core use of _Py_IDENTIFIER() with statically initialized global objects.#30928
Merged
ericsnowcurrently merged 117 commits intopython:mainfrom Feb 8, 2022
Merged
Conversation
This was referenced Jan 26, 2022
ericsnowcurrently
added a commit
that referenced
this pull request
Jan 27, 2022
This change is a prerequisite for generating code for other global objects (like strings in gh-30928). (We borrowed some code from Tools/scripts/deepfreeze.py.) https://bugs.python.org/issue46541
3673c6c to
3ab48c8
Compare
ae35362 to
1612a4a
Compare
|
When you're done making the requested changes, leave the comment: |
markshannon
reviewed
Feb 4, 2022
sweeneyde
reviewed
Feb 5, 2022
| PyObject * | ||
| _PyDict_GetItemWithError(PyObject *dp, PyObject *kv) | ||
| { | ||
| assert(PyUnicode_CheckExact(kv)); |
Member
There was a problem hiding this comment.
Just a half-baked idea: would it be possible to initialize all hashes of global strings at interpreter startup, then eliminate some branching with a special
int
_PyDict_GetItemStrWithHashInitialized(PyDictObject *mp, PyUnicodeObject *key)
{
assert(PyDict_CheckExact(mp));
assert(PyUnicode_CheckExact(key));
Py_hash_t hash = ((PyASCIIObject *)key)->hash;
assert(hash != -1);
// Inline _PyDict_GetItem_KnownHash --> _Py_dict_lookup
PyDictKeysObject *dk = mp->ma_keys;
Py_ssize_t ix = dictkeys_stringlookup(dk, key, hash);
if (ix == DKIX_EMPTY) {
return NULL;
}
else if (kind == DICT_KEYS_SPLIT) {
return mp->ma_values->values[ix];
}
else {
return DK_ENTRIES(dk)[ix].me_value;
}
}
Member
There was a problem hiding this comment.
Maybe add this to https://github.com/faster-cpython/ideas/discussions?
Member
Author
|
@markshannon approved this offline. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
In this PR we're no longer using
_Py_IDENTIFIER()(or_Py_static_string()) in any core CPython code. It is still used in a number of non-builtin stdlib modules.The replacement is:
PyUnicodeObject(not pointer) fields under_PyRuntimeState, statically initialized as part of_PyRuntime. A new_Py_GET_GLOBAL_IDENTIFIER()macro facilitates lookup of the fields (along with_Py_GET_GLOBAL_STRING()for non-identifier strings).https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.
The core of the change is in:
_PyRuntimeStateI've also added a
--checkflag to generate_global_objects.py (along withmake check-global-objects) to check for unused global strings. That check is added to the PR CI config.The remainder of the PR updates the core code to use
_Py_GET_GLOBAL_IDENTIFIER()instead of_Py_IDENTIFIER()and the related_Py*Idfunctions (likewise for_Py_GET_GLOBAL_STRING()instead of_Py_static_string()). This includes adding a few functions where there wasn't already an alternative to_Py*Id(), replacing the_Py_Identifier *parameter withPyObject *.I'm planning on addressing the following separately:
_Py_IDENTIFIER()in the stdlib modules_Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) APIhttps://bugs.python.org/issue46541