ctypes: Switch field accessors to fixed-width integers #127295

Closed
@encukou

Feature or enhancement

I believe the next step toward untangling ctypes should be switching cfield.c to be based on fixed-width integer types.
This should be a pure refactoring, without user-visible behaviour changes.

Currently, we use traditional native C types, usually identified by struct format characters when a short (and identifier-friendly) name is needed:

  • signed char (b) / unsigned char (B)
  • short (h) / unsigned short (H)
  • int (i) / unsigned int (I)
  • long (l) / unsigned long (L)
  • long long (q) / unsigned long long (Q)

These map to C99 fixed-width types, which I propose switching to:

  • int8_t/uint8_t
  • int16_t/uint16_t
  • int32_t/uint32_t
  • int64_t/uint64_t

The C standard doesn't guarantee that the “traditional” types map to the fixints.
However, ctypes currently requires it, so the assumption won't break anything.
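The correspondence can be inspected from Python. The following is a minimal sketch, not the cfield.c logic: the `TRADITIONAL` table and the `fixint_name` helper are names made up for this illustration, and the sketch assumes every traditional type is 1, 2, 4 or 8 bytes wide, as ctypes already does.

```python
import ctypes

# Traditional C types, keyed by their struct format characters.
TRADITIONAL = {
    "b": ctypes.c_byte,      "B": ctypes.c_ubyte,
    "h": ctypes.c_short,     "H": ctypes.c_ushort,
    "i": ctypes.c_int,       "I": ctypes.c_uint,
    "l": ctypes.c_long,      "L": ctypes.c_ulong,
    "q": ctypes.c_longlong,  "Q": ctypes.c_ulonglong,
}

def fixint_name(code):
    """Name the fixed-width type whose size matches a traditional type (hypothetical helper)."""
    bits = ctypes.sizeof(TRADITIONAL[code]) * 8
    prefix = "int" if code.islower() else "uint"
    return f"{prefix}{bits}_t"

mapping = {code: fixint_name(code) for code in TRADITIONAL}
for code, name in mapping.items():
    print(code, "->", name)
```

The output is platform-specific; only the size is compared here, nothing else about the types.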

By “map” I mean that the size of the types matches. The alignment requirements might not.
This needs to be kept in mind but is not an issue in ctypes accessors, which explicitly handle unaligned memory for the integer types.
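The unaligned handling can be observed from Python. This sketch places a `c_int32` at byte offset 1 of a buffer, which is misaligned for a 4-byte integer on typical platforms, and cross-checks the stored bytes with `struct`. The buffer layout and the value are arbitrary choices for the illustration.

```python
import ctypes
import struct

buf = bytearray(8)

# Offset 1 is not a multiple of 4, so this view is misaligned for int32.
view = ctypes.c_int32.from_buffer(buf, 1)
view.value = -123456

# "=" selects native byte order with standard sizes and no alignment padding,
# so this reads the same four bytes that ctypes just wrote.
(readback,) = struct.unpack_from("=i", buf, 1)
print(view.value, readback)
```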

Note that there are 5 “traditional” C type sizes, but 4 fixed-width ones. Two of the former are functionally identical to one another; which ones they are is platform-specific (e.g. int==long==int32_t.)
This means that one of the current implementations is redundant on any given platform.
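Which types collide can be checked per platform. The sketch below groups the five signed traditional types by size. The `SIGNED` table is just an illustration, and the result depends on the platform's data model.

```python
import ctypes
from collections import defaultdict

# The five signed "traditional" types, by C name.
SIGNED = {
    "signed char": ctypes.c_byte,
    "short": ctypes.c_short,
    "int": ctypes.c_int,
    "long": ctypes.c_long,
    "long long": ctypes.c_longlong,
}

# Group type names by their size in bytes.
by_size = defaultdict(list)
for name, ctype in SIGNED.items():
    by_size[ctypes.sizeof(ctype)].append(name)

# Any size shared by more than one type marks a redundant implementation.
redundant = {size: names for size, names in by_size.items() if len(names) > 1}
print(redundant)
```

With five types and only four possible sizes, at least one group must contain two names.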

The fixint types are parametrized by the number of bytes/bits, and one bit for signedness. This makes it easier to autogenerate code for them or to write generic macros (though a generic API like PyLong_AsNativeBytes is problematic for performance reasons, especially compared to a memcpy with a compile-time-constant size).
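To illustrate the parametrization, here is a Python sketch that generates all eight accessor pairs from just (size, signedness). It is only an analogy for what C macros or generated code might do; `make_accessors` and `ACCESSORS` are hypothetical names, and nothing here reflects the actual cfield.c code.

```python
import sys

def make_accessors(nbytes, signed):
    """Build a (getter, setter) pair for an integer of the given width and signedness."""
    mask = (1 << (8 * nbytes)) - 1

    def getter(buf, offset=0):
        raw = buf[offset:offset + nbytes]
        return int.from_bytes(raw, sys.byteorder, signed=signed)

    def setter(buf, value, offset=0):
        # Truncate to the field width, then store the raw bytes.
        raw = (value & mask).to_bytes(nbytes, sys.byteorder, signed=False)
        buf[offset:offset + nbytes] = raw

    return getter, setter

# One entry per fixed-width type: i8, u8, i16, u16, i32, u32, i64, u64.
ACCESSORS = {
    f"{'i' if signed else 'u'}{8 * nbytes}": make_accessors(nbytes, signed)
    for nbytes in (1, 2, 4, 8)
    for signed in (True, False)
}

get_i16, set_i16 = ACCESSORS["i16"]
buf = bytearray(4)
set_i16(buf, -2, 1)
print(get_i16(buf, 1))
```

The same two parameters drive every variant, so adding a width would mean adding one number to the loop rather than writing another pair of functions by hand.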

When one has a different integer type, determining the corresponding fixint means a sizeof and signedness check. This is easier and more robust than the current implementations (see wchar_t or _Bool).
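The sizeof-plus-signedness check can be sketched in Python against ctypes types. `classify` and `fixint_for` are hypothetical helpers for this illustration. The signedness probe relies on ctypes integer types accepting -1 without range checking, and it only covers types that can be constructed from an integer.

```python
import ctypes

def classify(ctype):
    """Return (nbytes, signed) for an integer-like ctypes type."""
    nbytes = ctypes.sizeof(ctype)
    # Storing -1 and reading it back stays negative only for signed types.
    signed = ctype(-1).value < 0
    return nbytes, signed

def fixint_for(ctype):
    """Name the fixed-width type with the same size and signedness."""
    nbytes, signed = classify(ctype)
    return f"{'int' if signed else 'uint'}{8 * nbytes}_t"

for t in (ctypes.c_size_t, ctypes.c_ssize_t, ctypes.c_bool):
    print(t.__name__, "->", fixint_for(t))
```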

The refactoring can pave the way for:

  • Separating bitfield accessors, so bitfield logic doesn't slow down normal access. (This can be done today, but would mean another set of nearly-identical hand-written functions, which is hard to maintain, let alone experiment with. We need more metaprogramming.)
  • Integer types with arbitrary size & alignment (useful for getting the __int128 type, or matching other platforms). This would be a new future feature.
