Improve packed semantics #19660

Open
@ikskuh

This is a plan to hammer out the semantics for bit-packed types in the Zig language.

Let's start with a tiny syntactical change:

(1) Rename packed to bitpacked

This name reflects the true meaning of both packed struct and packed union better than packed does, and it reduces confusion about the kind of packing that is performed (other languages use packed for byte packing).

This should be trivial to implement and update via zig fmt.

(2) Improve bitpacked union semantics

Right now, the semantics of packed unions are kinda … underdefined:

A packed union has well-defined in-memory layout and is eligible to be in a packed struct.

Let's change that definition to:

A bitpacked union has a well-defined in-memory layout and, similar to a bitpacked struct, is backed by a single integer of the size of the largest union member.
bitpacked union(T) can be used to explicitly specify the size of the union.

All union members are aligned such that their LSBs coincide. This means that

const T = bitpacked union(u32) {
    a: u32,
    b: u8,
};
const t: T = .{ .a = 1 };
std.debug.assert(t.b == 1);

holds true for both big and little endian systems.

This also means that not all union members share the same address, unlike in an extern union, where they do.
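The LSB rule above can be modelled in today's Zig as a truncation of the backing integer. This is only a sketch of the intended behaviour, not how the compiler would implement it: `readB` is a hypothetical helper, and the single-argument `@truncate` form assumes Zig 0.11 or newer.

```zig
const std = @import("std");

// Hypothetical helper, not part of the proposal: models reading the `b: u8`
// member of the union above as the low 8 bits of its u32 backing integer.
fn readB(backing: u32) u8 {
    return @truncate(backing);
}

test "member b sees the low byte regardless of endianness" {
    const backing: u32 = 1; // what `.{ .a = 1 }` would store
    try std.testing.expectEqual(@as(u8, 1), readB(backing));
}
```

Because truncation works on the integer value rather than on bytes in memory, the result does not depend on the target's byte order.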

(3) No changes to bitpacked struct

Except for the rename of packed to bitpacked, nothing changes here.

(4) Introduce a new type opaquebits(N)

This new type opaquebits(N) is an opaque bit bag that has no semantics defined except for "it takes N bits of space".

This type can be used for storing fully opaque data and the only way to interpret its content is by using @bitCast from and to it.
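As a rough approximation in current Zig, an unsigned integer of the same width can stand in for the bit bag. In the sketch below, `OpaqueBits4`, `Flags`, `interpret` and `erase` are all hypothetical names chosen for illustration. Unlike the proposed opaquebits(N), a plain `u4` still permits arithmetic, so this only illustrates the `@bitCast` round trip.

```zig
const std = @import("std");

// Assumption: `u4` stands in for the proposed `opaquebits(4)`.
const OpaqueBits4 = u4;

// Hypothetical interpretation of those 4 bits.
const Flags = packed struct(u4) {
    enabled: bool, // bit 0
    mode: u3, // bits 1..3
};

// Give the opaque bits a meaning.
fn interpret(raw: OpaqueBits4) Flags {
    return @bitCast(raw);
}

// Drop the meaning again.
fn erase(flags: Flags) OpaqueBits4 {
    return @bitCast(flags);
}

test "round trip through the opaque bit bag" {
    const raw: OpaqueBits4 = 0b1011;
    const flags = interpret(raw);
    try std.testing.expect(flags.enabled);
    try std.testing.expectEqual(@as(u3, 0b101), flags.mode);
    try std.testing.expectEqual(raw, erase(flags));
}
```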

(5) Introduce a new type class std.builtin.Type.BitPacked

(This is definitely a stretch goal, and might require further work)

This type can be the backing type for both bitpacked union and bitpacked struct, which is similar to a regular struct/union type in that it has typed fields, but lacks most other properties.

The BitPacked type is a mix of structure and union and is inspired by C# StructLayout.Explicit and FieldOffset, which allow construction of both structures and unions of an arbitrary layout.

This means that overlapping fields are legal. For a union type, all bit_offset values are 0; for a structure, the bit_offset values are chosen such that all fields are laid out back-to-back.

Each field in a BitPacked type has an explicit bit offset and backing type:

pub const BitPackedField = struct {
    name: [:0]const u8,
    type: type,
    bit_offset: u16,
    default_value: ?*const anyopaque,
};

pub const BitPacked = struct {
    backing_integer: type,
    fields: []const BitPackedField,
    decls: []const Declaration,
};

It is not legal to create a type with unnamed bits.

Each BitPacked type behaves in codegen and on ABI boundaries as if it were BitPacked.backing_integer.
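To make the union and struct cases concrete, here is a sketch that mirrors the field description locally, since std.builtin.Type.BitPacked does not exist today. The local `BitPackedField` omits `default_value` for brevity, and `coversAllBits` is a hypothetical check. It assumes that "no unnamed bits" means every bit of the backing integer is covered by at least one field, which is one possible reading of the rule.

```zig
const std = @import("std");

// Local stand-in for the proposed std.builtin.Type.BitPackedField.
const BitPackedField = struct {
    name: [:0]const u8,
    type: type,
    bit_offset: u16,
};

// Union-like layout: every field starts at bit 0.
const union_fields = [_]BitPackedField{
    .{ .name = "a", .type = u32, .bit_offset = 0 },
    .{ .name = "b", .type = u8, .bit_offset = 0 },
};

// Struct-like layout: fields are laid out back-to-back.
const struct_fields = [_]BitPackedField{
    .{ .name = "standard", .type = u4, .bit_offset = 0 },
    .{ .name = "color_depth", .type = u3, .bit_offset = 4 },
};

// Hypothetical check: true if no field exceeds the backing integer and
// every bit of it belongs to at least one field.
fn coversAllBits(comptime Backing: type, comptime fields: []const BitPackedField) bool {
    const n = @bitSizeOf(Backing);
    var covered = [_]bool{false} ** n;
    inline for (fields) |f| {
        var i: usize = f.bit_offset;
        while (i < f.bit_offset + @bitSizeOf(f.type)) : (i += 1) {
            if (i >= n) return false; // field reaches past the backing integer
            covered[i] = true;
        }
    }
    for (covered) |c| {
        if (!c) return false;
    }
    return true;
}

test "union-like and struct-like layouts both cover their backing integer" {
    try std.testing.expect(comptime coversAllBits(u32, &union_fields));
    try std.testing.expect(comptime coversAllBits(u7, &struct_fields));
}
```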

(6) Incorporate #19395

Use bitpacked(T) struct/bitpacked(T) union instead of bitpacked struct(T)/bitpacked union(T), as those forms are more uniform in what the (T) actually modifies.

Use Case

One example where I'd have liked to have some of these features is the EDID Video Input Definition:

Video Input Definition Layout

const VideoInputDefinition = bitpacked(u8) struct
{
    config: bitpacked(u7) union {
        analog: bitpacked(u7) struct {
            serrations: Support,
            composite_sync_on_green: Support,
            composite_sync_on_horiz: Support,
            separate_sync: Support,
            setup: enum(u1) {
                blank_is_black = 0,
                pedestal = 1,
            },
            signal_level: enum(u2) {
                model0 = 0b000, //  0.700 : 0.300 : 1.000 V p-p
                model1 = 0b001, //  0.714 : 0.286 : 1.000 V p-p
                model2 = 0b010, //  1.000 : 0.400 : 1.400 V p-p
                model3 = 0b011, //  0.700 : 0.000 : 0.700 V p-p
            },
        },
        digital: bitpacked(u7) struct  {
            standard: enum(u4) {
                none = 0,
                dvi = 1,
                hdmi_a = 2,
                hdmi_b = 3,
                mddi = 4,
                display_port = 5,
            },
            color_depth: enum(u3) {
                undefined = 0b000, // Color Bit Depth is undefined
                color6 = 0b001, // 6 Bits per Primary Color
                color8 = 0b010, // 8 Bits per Primary Color
                color10 = 0b011, // 10 Bits per Primary Color
                color12 = 0b100, // 12 Bits per Primary Color
                color14 = 0b101, // 14 Bits per Primary Color
                color16 = 0b110, // 16 Bits per Primary Color
            },
        },
    },
    type: enum(u1) { analog = 0, digital = 1 },
};

For additional type safety, config could have been declared as opaquebits(7), with a fn unpack() union(enum){…} function that returns whichever of the two contained structure definitions applies.
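The unpack() idea can be approximated in current Zig along the following lines. This is a simplified sketch, not the proposed syntax: `u7` stands in for opaquebits(7), the `Analog` layout is condensed with its four `Support` bits folded into a hypothetical `sync: u4` field, the enum-typed fields are replaced by plain integers, and the `Config` type name is an assumption.

```zig
const std = @import("std");

// Condensed stand-ins for the two 7-bit layouts from the example above.
const Analog = packed struct(u7) {
    sync: u4, // hypothetical: the four Support bits folded together
    setup: u1,
    signal_level: u2,
};

const Digital = packed struct(u7) {
    standard: u4,
    color_depth: u3,
};

const VideoInputDefinition = packed struct(u8) {
    config: u7, // assumption: u7 stands in for opaquebits(7)
    type: enum(u1) { analog = 0, digital = 1 },

    const Config = union(enum) {
        analog: Analog,
        digital: Digital,
    };

    // Interprets the opaque 7 bits according to the `type` bit.
    fn unpack(self: VideoInputDefinition) Config {
        return switch (self.type) {
            .analog => Config{ .analog = @as(Analog, @bitCast(self.config)) },
            .digital => Config{ .digital = @as(Digital, @bitCast(self.config)) },
        };
    }
};

test "unpack a digital input byte" {
    // Top bit set selects digital; the low 7 bits are the config payload.
    const v: VideoInputDefinition = @bitCast(@as(u8, 0b1_010_0101));
    switch (v.unpack()) {
        .digital => |d| {
            try std.testing.expectEqual(@as(u4, 5), d.standard);
            try std.testing.expectEqual(@as(u3, 2), d.color_depth);
        },
        .analog => return error.UnexpectedAnalog,
    }
}
```

The tagged union returned by `unpack` forces callers to switch on the variant, so the payload cannot be read with the wrong layout by accident.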

Some other structures in EDID contain even more deeply nested bit field definitions within a single byte or integer, so better support for bitpacked data would certainly help.

Summary

For me, (1) to (4) are definitely useful and would improve the language semantics. (5) is more of a stretch goal.

Labels: frontend, proposal
