Skip to content

bpo-44914: Maintain invariants of type version tags. #27773

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
Class version tags are no longer recycled.

This means that a version tag serves as a unique identifier for the state of
a class. We rely on this for effective specialization of the LOAD_ATTR and
other instructions.
40 changes: 13 additions & 27 deletions Objects/typeobject.c
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@ class object "PyObject *" "&PyBaseObject_Type"

// bpo-42745: next_version_tag remains shared by all interpreters because of static types
// Used to set PyTypeObject.tp_version_tag
static unsigned int next_version_tag = 0;
static unsigned int next_version_tag = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be worth it to change PyTypeObject.tp_version_tag type to uint64_t to reduce the risk of overflow?

2^64 version changes is less likely than 2^32 changes.

Copy link
Member

@Fidget-Spinner Fidget-Spinner Aug 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyType_ClearCache is documented to return unsigned int, not uint64_t, so I don't know if we can change that without deprecation. https://docs.python.org/3/c-api/type.html#c.PyType_ClearCache.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

64 bit versions would make the type cache twice the size and might well make performance worse.

Classes don't change that much. If it happens that long running programs use 2**32 versions, then it is likely that a small set of classes change often. In that case we could limit the number of version that an individual class can use up.


typedef struct PySlot_Offset {
short subslot_offset;
Expand Down Expand Up @@ -233,24 +233,14 @@ get_type_cache(void)


static void
type_cache_clear(struct type_cache *cache, int use_none)
type_cache_clear(struct type_cache *cache, PyObject *value)
{
for (Py_ssize_t i = 0; i < (1 << MCACHE_SIZE_EXP); i++) {
struct type_cache_entry *entry = &cache->hashtable[i];
entry->version = 0;
if (use_none) {
// Set to None so _PyType_Lookup() can use Py_SETREF(),
// rather than using slower Py_XSETREF().
Py_XSETREF(entry->name, Py_NewRef(Py_None));
}
else {
Py_CLEAR(entry->name);
}
Py_XSETREF(entry->name, _Py_XNewRef(value));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remark: _Py_XNewRef() should not be used, you should use the public Py_XNewRef() instead.

entry->value = NULL;
}

// Mark all version tags as invalid
PyType_Modified(&PyBaseObject_Type);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'm confused as to why we don't need this anymore?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we don't reuse the version numbers anymore, so the old ones remain valid.

}


Expand Down Expand Up @@ -287,14 +277,11 @@ _PyType_ClearCache(PyInterpreterState *interp)
sizeof(cache->hashtable) / 1024);
#endif

unsigned int cur_version_tag = next_version_tag - 1;
if (_Py_IsMainInterpreter(interp)) {
next_version_tag = 0;
}
// Set to None, rather than NULL, so _PyType_Lookup() can
// use Py_SETREF() rather than using slower Py_XSETREF().
type_cache_clear(cache, Py_None);

type_cache_clear(cache, 0);

return cur_version_tag;
return next_version_tag - 1;
}


Expand All @@ -309,7 +296,8 @@ PyType_ClearCache(void)
void
_PyType_Fini(PyInterpreterState *interp)
{
_PyType_ClearCache(interp);
struct type_cache *cache = &interp->type_cache;
type_cache_clear(cache, NULL);
if (_Py_IsMainInterpreter(interp)) {
clear_slotdefs();
}
Expand Down Expand Up @@ -426,14 +414,12 @@ assign_version_tag(struct type_cache *cache, PyTypeObject *type)
if (!_PyType_HasFeature(type, Py_TPFLAGS_READY))
return 0;

type->tp_version_tag = next_version_tag++;
/* for stress-testing: next_version_tag &= 0xFF; */

if (type->tp_version_tag == 0) {
// Wrap-around or just starting Python - clear the whole cache
type_cache_clear(cache, 1);
if (next_version_tag == 0) {
/* We have run out of version numbers */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please elaborate the comment to explain what happen on next_version_tag overflow? I understand that old types keep their caches, but types which are are changed are no longer cached.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random thought which I haven't verified:
Why not call PyType_Modified on base object type and clear cache on overflow, then continue incrementing version tag counter again?

That way even after overflow, new version tags are assigned. Just that every UINT32_TMAX type cache miss event will cause temporarily bad performance. But that seems worth it vs bad performance all the way after we hit UINT32_TMAX.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please elaborate the comment to explain what happen on next_version_tag overflow? I understand that old types keep their caches, but types which are are changed are no longer cached.

It doesn't overflow. Once next_version_tag reaches 0, we stop issuing new versions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random thought which I haven't verified:
Why not call PyType_Modified on base object type and clear cache on overflow, then continue incrementing version tag counter again?

That way even after overflow, new version tags are assigned. Just that every UINT32_TMAX type cache miss event will cause temporarily bad performance. But that seems worth it vs bad performance all the way after we hit UINT32_TMAX.

We would also need to find and clear every specialized instruction that uses the version tag. It is possible, but I don't think it is worth it. 2**32 is a large number.

return 0;
}
type->tp_version_tag = next_version_tag++;
assert (type->tp_version_tag != 0);

bases = type->tp_bases;
n = PyTuple_GET_SIZE(bases);
Expand Down