Pointer tagging aka tagged numbers #7

Closed
@markshannon

https://en.wikipedia.org/wiki/Tagged_pointer

Currently, a reference to a Python object is a PyObject *. This means that we need to do a lot of boxing, unboxing and pointer chasing when handling ints and floats.
Because ints and floats are pure value types, we can freely replace references to them with their values, and vice-versa.

For 32 bit machines, using a single bit to tag small (up to 31 bit) integers is the most obvious scheme.

| Tag | Meaning | Encoding |
| --- | --- | --- |
| 0 | int (31 bit) | `val<<1` |
| 1 | PyObject * (including large ints and all floats) | `((intptr_t)val)-1` |
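The one-bit scheme above can be sketched in C roughly as follows. This is a minimal illustration, not CPython code: the names `tagged_t`, `is_small_int`, `tag_int`, `untag_int`, `tag_ptr` and `untag_ptr` are hypothetical, and it assumes object pointers are at least 2-byte aligned, so that subtracting 1 always leaves the low bit set. It uses `intptr_t`, so the inline integer range is 31 bits only on a 32-bit build.

```c
#include <stdint.h>

/* Hypothetical tagged word: either an inline int or an encoded pointer. */
typedef intptr_t tagged_t;

/* Tag 0 (low bit clear): small int stored as val << 1. */
static int is_small_int(tagged_t t) { return (t & 1) == 0; }

static tagged_t tag_int(intptr_t v) {
    /* Shift as unsigned to avoid undefined behaviour for negative v.
       The caller must ensure v fits in (word size - 1) bits. */
    return (tagged_t)((uintptr_t)v << 1);
}

static intptr_t untag_int(tagged_t t) {
    /* t is even, so the division is exact, including for negatives. */
    return t / 2;
}

/* Tag 1 (low bit set): pointer stored as ((intptr_t)p) - 1.
   Assumes p is at least 2-byte aligned. */
static tagged_t tag_ptr(void *p) { return (intptr_t)p - 1; }
static void *untag_ptr(tagged_t t) { return (void *)(t + 1); }
```

Because the pointer encoding is a fixed offset of -1, a compiler can typically fold the correction into the address displacement of a field load, so untagging a pointer need not cost an extra instruction.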

For 64 bit machines, there are at least two schemes that we could use.

  1. NaN-tagging works for machines with up to about 52 bits of VM space, and allows tagging of all floats.
  2. A tagging scheme that allows 64 bit (aligned) pointers and stores almost all floats unboxed, works as follows:

| Tag | Meaning | Encoding |
| --- | --- | --- |
| 0 | int (61 bit) | `val<<3` |
| 1-6 | float, `abs(val) < 2**512` | `rotate_bits(val, 4)` |
| 7 | PyObject * (including large ints and large floats) | `((intptr_t)val)-1` |
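As a rough sketch of how the three-bit scheme might dispatch on the tag, the C fragment below classifies a 64-bit word and encodes the int and pointer rows. All names (`tagged64_t`, `kind_of`, `tag_int61`, `fits_int61`, `tag_obj`, and so on) are hypothetical. It assumes a 64-bit build with 8-byte-aligned object pointers, so that subtracting 1 sets the low three bits to 7. The float row is only classified, not encoded, because the exact rotation and range check are not spelled out here.

```c
#include <stdint.h>

/* Hypothetical 64-bit tagged word; the low 3 bits are the tag. */
typedef int64_t tagged64_t;

enum kind { KIND_INT, KIND_FLOAT, KIND_OBJECT };

static enum kind kind_of(tagged64_t t) {
    unsigned tag = (unsigned)(t & 7);
    if (tag == 0) return KIND_INT;     /* val << 3 */
    if (tag == 7) return KIND_OBJECT;  /* ((intptr_t)ptr) - 1 */
    return KIND_FLOAT;                 /* tags 1-6: inline float */
}

/* A value can be stored inline only if it fits in 61 signed bits. */
static int fits_int61(int64_t v) {
    return v >= -(INT64_C(1) << 60) && v < (INT64_C(1) << 60);
}

static tagged64_t tag_int61(int64_t v) {
    /* Shift as unsigned to avoid undefined behaviour for negative v. */
    return (tagged64_t)((uint64_t)v << 3);
}

static int64_t untag_int61(tagged64_t t) {
    /* t is a multiple of 8, so the division is exact. */
    return t / 8;
}

/* Assumes p is 8-byte aligned, so p - 1 ends in binary 111. */
static tagged64_t tag_obj(void *p) { return (tagged64_t)(intptr_t)p - 1; }
static void *untag_obj(tagged64_t t) { return (void *)(intptr_t)(t + 1); }
```

Integers that fail `fits_int61` would fall back to a boxed `PyObject *`, matching the "large ints" case in the table.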

Any C extensions that use macros to access the internals of tuples, lists, etc. would need recompiling, but they need recompiling for every release anyway.
