Description
This is a plan to hammer out the semantics for bit-packed types in the Zig language.
Let's start with a tiny syntactical change:
(1) Rename packed to bitpacked
This change reflects the true meaning of both packed struct and packed union better than packed does, and reduces confusion about the kind of packing that is performed (other languages use packed for byte packing).
This should be trivial to implement, and existing code can be updated via zig fmt.
(2) Improve bitpacked union semantics
Right now, the semantics of packed unions are kinda … underdefined:
A packed union has well-defined in-memory layout and is eligible to be in a packed struct.
Let's change that definition to:
A bitpacked union has a well-defined in-memory layout and, similar to a bitpacked struct, is backed by a single integer of the size of the largest union member. bitpacked union(T) can be used to explicitly specify the size of the union.
All union members are aligned such that the LSBs of all types align. This means that
const T = packed union(u32) {
    a: u32,
    b: u8,
};
var t: T = .{ .a = 1 };
std.debug.assert(t.b == 1);
holds true for both big- and little-endian systems.
This also means that not all union members share the same address, unlike in an extern union, where they do.
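The LSB-alignment rule can be emulated in current Zig to see why it is endian-independent. This is only a sketch, assuming Zig 0.11 or later (single-argument @truncate); the wrapper struct T and its accessor functions are hypothetical stand-ins for the proposed bitpacked union(u32), not part of the proposal.

```zig
const std = @import("std");

// Hypothetical stand-in for the proposed
// `bitpacked union(u32) { a: u32, b: u8 }`, written in current Zig.
// The wrapper and its accessor names are illustrative only.
const T = struct {
    bits: u32, // the single backing integer

    fn initA(value: u32) T {
        return .{ .bits = value };
    }

    // `a` occupies all 32 bits of the backing integer.
    fn a(self: T) u32 {
        return self.bits;
    }

    // `b` occupies the 8 least significant bits. Reading it truncates the
    // integer value; it does not read "the first byte in memory".
    fn b(self: T) u8 {
        return @truncate(self.bits);
    }
};

test "LSB-aligned members read the same value on any endianness" {
    const t = T.initA(1);
    try std.testing.expectEqual(@as(u8, 1), t.b());

    const wide = T.initA(0xAABBCCDD);
    try std.testing.expectEqual(@as(u8, 0xDD), wide.b());
}
```

Because the access is defined on the integer value rather than on memory bytes, the result does not depend on byte order, which is also why b cannot share an address with a on a big-endian target.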
(3) No changes to bitpacked struct
Except for the rename of packed to bitpacked, no changes are made here.
(4) Introduce a new type opaquebits(N)
This new type opaquebits(N) is an opaque bit bag that has no semantics defined except for "it takes N bits of space".
This type can be used for storing fully opaque data, and the only way to interpret its content is by using @bitCast from and to it.
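The intended usage pattern can be approximated today with a plain integer. In this sketch (assuming Zig 0.11 or later), OpaqueBits7 is a hypothetical alias for u7 standing in for opaquebits(7), and Flags is an invented typed view. Unlike the proposed type, a u7 still permits arithmetic, so this only illustrates the @bitCast round trip.

```zig
const std = @import("std");

// Stand-in for the proposed `opaquebits(7)`: a plain u7 that the code
// never inspects directly. (Assumption: current Zig has no opaque bit type.)
const OpaqueBits7 = u7;

// Hypothetical typed view of those 7 bits.
const Flags = packed struct(u7) {
    low: u3, // bits 0..2
    high: u4, // bits 3..6
};

// Interpreting the opaque bits requires an explicit @bitCast.
fn interpret(raw: OpaqueBits7) Flags {
    return @bitCast(raw);
}

// Storing typed data back into the bit bag is the reverse @bitCast.
fn store(flags: Flags) OpaqueBits7 {
    return @bitCast(flags);
}

test "round trip through the opaque bit bag" {
    const raw: OpaqueBits7 = 0b1010_101;
    const flags = interpret(raw);
    try std.testing.expectEqual(@as(u3, 0b101), flags.low);
    try std.testing.expectEqual(@as(u4, 0b1010), flags.high);
    try std.testing.expectEqual(raw, store(flags));
}
```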
(5) Introduce a new type class std.builtin.Type.BitPacked
(This is definitely a stretch goal, and might require further work)
This type can be the backing type for both bitpacked union and bitpacked struct. It is similar to a regular struct/union type in that it has typed fields, but lacks most other properties.
The BitPacked type is a mix of structure and union, inspired by C#'s StructLayout(LayoutKind.Explicit) and FieldOffset attributes, which allow construction of both structures and unions of an arbitrary layout.
This means that it's also legal to have overlapping fields. For a union type, all bit_offset values are 0; for a structure, the bit_offset values are laid out such that all fields are back-to-back.
Each field in a BitPacked type has an explicit bit offset and backing type:
pub const BitPackedField = struct {
    name: [:0]const u8,
    type: type,
    bit_offset: u16,
    default_value: ?*const anyopaque,
};
pub const BitPacked = struct {
    backing_integer: type,
    fields: []const BitPackedField,
    decls: []const Declaration,
};
It is not legal to create a type with unnamed bits.
Each BitPacked type behaves in codegen and on ABI boundaries as if it were BitPacked.backing_integer.
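For the struct case, the back-to-back bit_offset values described above correspond to what the existing @bitOffsetOf builtin already reports for a packed struct. A small sketch in current Zig, where Header is a hypothetical example type:

```zig
const std = @import("std");

// Hypothetical example type; under the proposal this would be spelled
// as a bitpacked struct backed by u16.
const Header = packed struct(u16) {
    version: u4,
    flags: u5,
    length: u7,
};

test "packed struct fields sit back-to-back" {
    // Each field starts where the previous one ends.
    try std.testing.expect(@bitOffsetOf(Header, "version") == 0);
    try std.testing.expect(@bitOffsetOf(Header, "flags") == 4);
    try std.testing.expect(@bitOffsetOf(Header, "length") == 9);
    // Every bit of the backing integer is named: 4 + 5 + 7 == 16.
    try std.testing.expect(@bitSizeOf(Header) == 16);
}
```

A BitPackedField list for this type would presumably carry the same offsets (0, 4, 9), while the union case would carry 0 for every field.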
(6) Incorporate #19395
Use bitpacked(T) struct/bitpacked(T) union instead of bitpacked struct(T)/bitpacked union(T), as those are more uniform in what the (T) actually modifies.
Use Case
One example where I would have liked to have some of these features is the EDID Video Input Definition:
const VideoInputDefinition = bitpacked(u8) struct {
    config: bitpacked(u7) union {
        analog: bitpacked(u7) struct {
            serrations: Support,
            composite_sync_on_green: Support,
            composite_sync_on_horiz: Support,
            separate_sync: Support,
            setup: enum(u1) {
                blank_is_black = 0,
                pedestal = 1,
            },
            signal_level: enum(u2) {
                model0 = 0b000, // 0.700 : 0.300 : 1.000 V p-p
                model1 = 0b001, // 0.714 : 0.286 : 1.000 V p-p
                model2 = 0b010, // 1.000 : 0.400 : 1.400 V p-p
                model3 = 0b011, // 0.700 : 0.000 : 0.700 V p-p
            },
        },
        digital: bitpacked(u7) struct {
            standard: enum(u4) {
                none = 0,
                dvi = 1,
                hdmi_a = 2,
                hdmi_b = 3,
                mddi = 4,
                display_port = 5,
            },
            color_depth: enum(u3) {
                undefined = 0b000, // Color Bit Depth is undefined
                color6 = 0b001, // 6 Bits per Primary Color
                color8 = 0b010, // 8 Bits per Primary Color
                color10 = 0b011, // 10 Bits per Primary Color
                color12 = 0b100, // 12 Bits per Primary Color
                color14 = 0b101, // 14 Bits per Primary Color
                color16 = 0b110, // 16 Bits per Primary Color
            },
        },
    },
    type: enum(u1) { analog = 0, digital = 1 },
};
For additional type safety, config could have been declared as opaquebits(7), with a fn unpack() union(enum){…} function that returns whichever of the two contained structure definitions applies.
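A rough sketch of that unpack() idea in current Zig (0.11 or later assumed), using u7 as a stand-in for opaquebits(7). The Analog and Digital layouts are deliberately simplified and hypothetical; only the field name type and the position of the selector bit follow the definition above.

```zig
const std = @import("std");

// Simplified, hypothetical 7-bit payload layouts.
const Analog = packed struct(u7) {
    sync: u4,
    setup: u1,
    signal_level: u2,
};

const Digital = packed struct(u7) {
    standard: u4,
    color_depth: u3,
};

const VideoInputDefinition = packed struct(u8) {
    config: u7, // stand-in for the proposed opaquebits(7)
    type: enum(u1) { analog = 0, digital = 1 },

    const Unpacked = union(enum) {
        analog: Analog,
        digital: Digital,
    };

    // The selector bit decides how the opaque payload is interpreted,
    // so callers can never read the wrong variant.
    fn unpack(self: VideoInputDefinition) Unpacked {
        return switch (self.type) {
            .analog => .{ .analog = @as(Analog, @bitCast(self.config)) },
            .digital => .{ .digital = @as(Digital, @bitCast(self.config)) },
        };
    }
};

test "unpack selects the payload from the type bit" {
    // Bit 7 set selects digital; the low 7 bits hold the payload.
    const vid: VideoInputDefinition = @bitCast(@as(u8, 0b1_000_0101));
    switch (vid.unpack()) {
        .digital => |d| {
            try std.testing.expectEqual(@as(u4, 5), d.standard);
            try std.testing.expectEqual(@as(u3, 0), d.color_depth);
        },
        .analog => return error.TestUnexpectedResult,
    }
}
```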
Some other structures in EDID contain even more deeply nested bit field definitions within a single byte or integer, so improved support for bitpacked data would certainly help.
Summary
For me, (1) to (4) are definitely useful and would improve language semantics. (5) is definitely more of a stretch goal.