Skip to content

Conversation

@spencer-lunarg
Copy link
Contributor

Actually closes #4058

cc @arcady-lunarg

280: 6(ptr) CompositeExtract 277 0
282: 281(ptr) AccessChain 279 35
Store 282 280 Aligned 16
Store 282 280 Aligned 8
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is

layout(buffer_reference) buffer MyBuffer;

struct S6 { MyBuffer x; };

layout(set = 0, binding = 1) restrict readonly buffer SSBO {
	S6 s6;
} InBuffer;

layout(buffer_reference) buffer MyBuffer {
	S6 s6;
};

void main() {
	MyBuffer payload;
	payload.s6 = InBuffer.s6;
}

which is storing a BDA pointer, but it was defaulting back to 16 before, so this is correct

Copy link

@maxime-modulopi maxime-modulopi Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should still be 16 I believe. The default value for buffer_reference_align when unspecified is 16 and the first member of the implicit buffer struct can be proved to have the same alignment as the struct itself.
Other members may have smaller alignments depending on their offset in the struct (specified by the selected layout rule) though.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the correct formula to compute the highest guaranteed alignment for each member of a struct is gcd(struct_alignment, member_offset_in_the_struct) where gcd is the greatest common divisor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alignments have to be powers of two so you don't need gcd, you can just use max.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean max(struct_alignment, member_offset_in_the_struct)?
This does not work:

layout(buffer_reference, scalar) buffer Data
{
	uint64_t data0; // offset 0
	uint64_t data1; // offset 8
	uint64_t data2; // offset 16
	uint64_t data3; // offset 24
};

Correct:
gcd(16, 0) = 16
gcd(16, 8) = 8
gcd(16, 16) = 16
gcd(16, 24) = 8

Incorrect:
max(16, 0) = 16
max(16, 8) = 16
max(16, 16) = 16
max(16, 24) = 24

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, looked back into this, the issue is actually not the logic I added, it is doing things correctly IMO

The "real" issue here is this GLSL is generating code to copy the pointer, not the struct, but only because this test is --target-env vulkan1.1

If you set it to --target-env vulkan1.2 or higher, it will generate the correct SPIR-V such that the alignment is 8 as you would think

https://godbolt.org/z/d7n86soYd

So I agree there is an issue, but only for 1.1 or lower GLSL code path and its because it produces code that is copying a pointer, which should be aligned 8... trying to say "this PR is doing the logic correctly and there is nothing in the scope here to fix, but can raise a new issue to track the issue"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pointer being 8 bytes, it indeed requires an alignment of at least 8 bytes.
However, OpLoad instructions should always use the largest alignment possible that can be compile-time-proved, for performance reasons (see Nvidia's Advanced API Performance: Descriptors blog).
From my understanding, this is the whole point of the existence of buffer_reference_align instead of auto-computing the alignment from the largest member.

In this example, MyBuffer does not set buffer_reference_align, which means the default value of 16 is used, as explained by GL_EXT_buffer_reference:

Each buffer reference type has an alignment that can be specified via
the "buffer_reference_align" layout qualifier. This must be a power of
two and be greater than or equal to the largest scalar/component type
in the block. If the layout qualifier is not specified, it defaults to
16 bytes.

Since the first member, s6, is at offset 0, we can prove that it has the same guaranteed alignment as the implicit struct (16) and thus should have loads/stores aligned to 16 bytes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thus should have loads/stores aligned to 16 bytes.

I agree 100% with you that it should be 16

I disagree there is anything to change in this PR, this PR is exposing a bug that is there already, which is --target-env vulkan1.1 is not producing correct BDA spirv, regardless of the Aligned value

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will open another issue about buffer_reference_align being ignored for loads/stores.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maxime-modulopi thanks, clearly there are still more things to get right, but happy to get it fixed little by little and issues with small code to reproduce is always helpful!

Tracks if OpStore is storing a buffer_reference pointer and if
so, sets the alignment to be 8 like it does currently for loads
@spencer-lunarg spencer-lunarg force-pushed the spencer-lunarg-one-day-we-will-have-bda-working branch from 2e8228d to 7cd236f Compare December 11, 2025 21:00
@maxime-modulopi
Copy link

maxime-modulopi commented Dec 18, 2025

I've compiled the branch and while it now works for standard variables, assigning entire arrays of pointers still align the load and store to 4.

Reproduction code
#version 460 core

#extension GL_EXT_buffer_reference: require
#extension GL_EXT_scalar_block_layout: require

layout (buffer_reference, buffer_reference_align = 4, scalar) readonly buffer DataBuffer
{
	vec4 data;
};

layout (buffer_reference, buffer_reference_align = 8, scalar) buffer ParametersBuffer
{
	DataBuffer src[2];
	DataBuffer dst[2];
};

layout(push_constant) uniform constants
{
	ParametersBuffer u_param;
};

layout(local_size_x = 1, local_size_y = 1, local_size_z = 1) in;
void main()
{
	u_param.dst = u_param.src;
}

@maxime-modulopi
Copy link

maxime-modulopi commented Dec 18, 2025

Also, there is still this weird behavior where changing DataBuffer's buffer_reference_align from 4 to 8 fixes the load/store alignments of pointer variables in the ParametersBuffer block.
Looks like we are taking the wrong alignment when traversing the buffer reference chain?

This may be the root of the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Generated OpLoad does not respect the minimum type alignment

3 participants