Can "item" and "nd_item" be shared across work-items? #532

gmlueck · 2024-01-18T16:06:07Z

This question recently came up from our implementation team. What is the precise semantics of the member functions on item and nd_item? Do they return the index of the calling work-item or do they return the index of the work-item that created the object?

The question is relevant only in the weird case where an item or nd_item object from one work-item is copied into SLM and then referenced by another work-item. This could be done, for example, by using placement-new to copy the item or nd_item object.

Note that id may have a different semantic because it is user-constructible. The semantics of id seem clear. The member functions return the information that was used to construct the id object. Therefore, if work-item A copies its id object to SLM and then another work-item B calls its member functions, I presume work-item B would get the index of work-item A.

The text was updated successfully, but these errors were encountered:

AlexeySachkov · 2024-01-18T16:12:53Z

Linking #451 here. sub_group is probably different enough to be handled differently compared to item and nd_item, but we had the same question about it as well.

illuhad · 2024-01-18T16:30:10Z

I don't think these classes are meaningful outside of their own work item, because they generally carry the semantics of "give me the position of this item in the parallel iteration space". Using objects from a different work item should be illegal and/or UB.

If one did this, even just within one implementation the behavior might not be consistent. For example, whether AdaptiveCpp stores the position in the iteration space inside nd_item/item or whether it queries this on demand using builtins changes depending on compilation flow and configuration.

gmlueck · 2024-01-18T23:21:07Z

I think this makes sense also. I was wondering why we didn't define all the member functions of nd_item and item as static. That would make it clear that they only return information about the calling work-item, since static member functions don't have access to any state in the object. Did we decide not to define them as static to give implementations the freedom to store information about the calling work-item in the object?

In any case, I think we could add text to the introduction for each of these objects something like:

The implementation constructs an item object for each work-item when a kernel is submitted with a range iteration space. That object (and copies of that object) may be used only by the work-item that "owns" it. Although it is possible to copy the object into memory that is accessible to other work-items or to the host, calls to member functions of the object from other work-items or from the host produce undefined behavior.

I think we could use similar language for sub_group and group also.

Pennycook · 2024-01-19T15:04:59Z

I think this makes sense also. I was wondering why we didn't define all the member functions of nd_item and item as static. That would make it clear that they only return information about the calling work-item, since static member functions don't have access to any state in the object. Did we decide not to define them as static to give implementations the freedom to store information about the calling work-item in the object?

I agree that freedom of implementation is important. But I think it may also be important to note that if the functions were static, a developer wouldn't need an instance of an nd_item to call them. That would mean that developers could call nd_item member functions outside of kernels, call an nd_item member function using a class with the wrong number of dimensions, etc.

tomdeakin · 2024-01-25T17:09:30Z

Discussed in WG meeting 01/25/24.
Next steps: introduce a new section similar to https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:reference-semantics to group these restrictions together. Or clarify these directly. Clarification to SYCL 2020.

This change allows stateless nd_item implementation as discussed in KhronosGroup/SYCL-Docs#532. The list of changes includes removal of checks which compare nd_teams shared between work-items or on the host. Now test checks equality only on device within single work-item.

The problem with removed code is that it compares nd_item "states" obtained by calling methods like get_local_id() on the host. The discussion in KhronosGroup/SYCL-Docs#532 clarfies that these methods are supposed to return meaningful values from the kernel code only. nd_item constructors are tested in device code, so this change do not reduce feature coverage.

bader · 2024-02-17T01:58:45Z

FYI. I committed stateless implementation of nd_item to DPC++ compiler. This commit exposed SYCL-CTS tests relying on stateful implementation or wrong uses of nd_item methods on the host.

nd_item_equality test validates one of the rules for "by-value semantic" - https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:byval-semantics. The test allocates a buffer to store 2 nd_item objects, run a kernel which stores two nd_items from different work-items and finally uses comparison operators to check nd_items to check following rule:

T must be equality comparable on the host application (in the case where T is available on the host) and within SYCL kernel functions. Equality between two instances of T (i.e. a == b) must be true if the value of all members are equal and non-equality between two instances of T (i.e. a != b) must be true if the value of any members are not equal, unless either instance has become invalidated by a move operation. By extension of the requirements above, equality on T must guarantee to be reflexive (i.e. a == a), symmetric (i.e. a == b implies b == a and a != b implies b != a) and transitive (i.e. a == b && b == c implies c == a).

As it was agreed above sharing and comparing stateless nd_item objects doesn't make sense and I'd like to drop this test.

nd_item_constructors compares different nd_item objects using more reliable technique. It saves the "state" of nd_item object by obtaining and saving ranges/ids via calling methods like get_local_id(). Unfortunately, it does it on the host too!

KhronosGroup/SYCL-CTS#869 updates both tests.

gmlueck mentioned this issue Jan 19, 2024

[SYCL] Make nd_item stateless intel/llvm#12236

Merged

bader mentioned this issue Feb 17, 2024

Update nd_item constructors and equality tests. KhronosGroup/SYCL-CTS#869

Merged

Pennycook mentioned this issue Sep 30, 2024

Unify group code snippet with other non-user-constructible classes #627

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Can "item" and "nd_item" be shared across work-items? #532

Can "item" and "nd_item" be shared across work-items? #532

gmlueck commented Jan 18, 2024

AlexeySachkov commented Jan 18, 2024

illuhad commented Jan 18, 2024 •

edited

Loading

gmlueck commented Jan 18, 2024

Pennycook commented Jan 19, 2024

tomdeakin commented Jan 25, 2024

bader commented Feb 17, 2024

Can "item" and "nd_item" be shared across work-items? #532

Can "item" and "nd_item" be shared across work-items? #532

Comments

gmlueck commented Jan 18, 2024

AlexeySachkov commented Jan 18, 2024

illuhad commented Jan 18, 2024 • edited Loading

gmlueck commented Jan 18, 2024

Pennycook commented Jan 19, 2024

tomdeakin commented Jan 25, 2024

bader commented Feb 17, 2024

illuhad commented Jan 18, 2024 •

edited

Loading