Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify sections previously using "synchronize" #484

Merged
merged 12 commits into from
Nov 10, 2023

Conversation

Pennycook
Copy link
Contributor

ISO C++ defines a "synchronizes-with" relationship between library calls, and defines functions in terms of their "synchronization" effects. Additionally, it refers to the new std::barrier and std::latch types as "coordination mechanisms" rather than "synchronization mechanisms".

Before this PR, several usages of "synchronize" (or "synchronization") were incompatible with these definitions, or at least potentially ambiguous. For example, several sections talked about "synchronizing" host and device data, but really meant that data would be copied from one place to another. Other sections talked about work-items or different devices "synchronizing with" one another, but without reference to specific library calls.

This PR rewords several sections, as follows:

  • Where "synchronize" was used to mean "wait", the specification now says that the caller "blocks" until a certain condition holds.

  • Where "synchronize" was used to mean "two copies of data are made consistent", the specification now says that data is "copied".

  • Where "synchronize" was used to mean "call a barrier", the specification now talks about "coordination" of work-items and/or devices.

Any remaining occurrences of "synchronize" are related to mutexes (which ISO C++ describes as a "synchronization primitive") or to atomics and fences (which ISO C++ describes as "synchronizing with" other atomics and fences).

Closes internal issue 629.

ISO C++ defines a "synchronizes-with" relationship between library calls, and
defines functions in terms of their "synchronization" effects. Additionally,
it refers to the new std::barrier and std::latch types as "coordination
mechanisms" rather than "synchronization mechanisms".

Before this PR, several usages of "synchronize" (or "synchronization") were
incompatible with these definitions, or at least potentially ambiguous. For
example, several sections talked about "synchronizing" host and device data,
but really meant that data would be copied from one place to another. Other
sections talked about work-items or different devices "synchronizing with" one
another, but without reference to specific library calls.

This PR rewords several sections, as follows:

- Where "synchronize" was used to mean "wait", the specification now says
  that the caller "blocks" until a certain condition holds.

- Where "synchronize" was used to mean "two copies of data are made
  consistent", the specification now says that data is "copied".

- Where "synchronize" was used to mean "call a barrier", the specification
  now talks about "coordination" of work-items and/or devices.

Any remaining occurrences of "synchronize" are related to mutexes (which ISO
C++ describes as a "synchronization primitive") or to atomics and fences
(which ISO C++ describes as "synchronizing with" other atomics and fences).

Closes internal issue 629.
@Pennycook
Copy link
Contributor Author

A quick note to reviewers: there were also a few places where "synchronize" was used in the glossary. I was surprised that there were such lengthy descriptions for these terms in the glossary, as they could very easily get out of sync with the main text. Rather than rewrite these glossary descriptions to match the updated text, I simplified them and pointed back to where the relevant behaviors are specified.

Copy link
Member

@keryell keryell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice coordination with ISO C++. ;-)

@TApplencourt
Copy link
Contributor

For c++ spec readers like me, can he have a little definition of "coordination" somewhere in the spec (or at least a link to the C++ spec)?

Or maybe something like:

Coordination ("synchronization") between the host and any devices can be expressed in the host SYCL

@Pennycook
Copy link
Contributor Author

For c++ spec readers like me, can he have a little definition of "coordination" somewhere in the spec (or at least a link to the C++ spec)?

Or maybe something like:

Coordination ("synchronization") between the host and any devices can be expressed in the host SYCL

I don't think we should redefine coordination to mean synchronization. I'm worried that will confuse people more, because they're not exactly the same.

I think this specific usage of coordination really follows the English meaning, and just means something like "getting the host and device to work together effectively". I'm not opposed to swapping it out for a different word. The only usage of "coordinate" that I feel strongly about keeping is the description of barriers as "a coordination mechanism".

@TApplencourt
Copy link
Contributor

Oh sorry, I was not clear. I'm highly supportive of following the C++ terminology!

It's maybe just me, but reading the new Coordination section, it was still not clear what coordination meant concretely. At least synchronization, even if it's defined loosely, I understand what that means.
coordination is a CS term I was not familiar with; where synchronization, I kind of understand it may meansbarrier and co.

My suggestion was just to add a sentence to try to "ease" the reading to non-CS / C++ experts. But maybe it's not the goal of the specification to do so. I will let other people chime in.

Regardless of this, It's a nice change and improves the spec clarity which is always good!

@Pennycook
Copy link
Contributor Author

Oh sorry, I was not clear. I'm highly supportive of following the C++ terminology!

No need to apologize -- you were perfectly clear! I'm not opposed to explaining what things mean, I was just trying to point out that 'coordination ("synchronization")' might leave a reader with the impression that the terms were equivalent.

It's maybe just me, but reading the new Coordination section, it was still not clear what coordination meant concretely. At least synchronization, even if it's defined loosely, I understand what that means. coordination is a CS term I was not familiar with; where synchronization, I kind of understand it may meansbarrier and co.

You're not alone. I hadn't seen "coordination" used to describe barriers until I saw it in C++20.

My understanding of how things are defined in C++20 is that there's a sort of hierarchy of concepts. A barrier provides a way to manage/organize/coordinate a set of threads. The implementation of the barrier guarantees that all calls to a function like barrier::arrive happen before any call unblocks from barrier::wait, by employing synchronization operations. (You can see an example of this here).

I deliberately went through and tried to change every usage of "synchronization" where I wasn't 100% sure it was matching the ISO C++ definition. If we can convince ourselves that I've been too heavy-handed here, and that some functions in SYCL do provide synchronization, then I'm more than happy to revisit the text. I expected we'd have to discuss this PR for quite some time, so please keep the questions coming!

My suggestion was just to add a sentence to try to "ease" the reading to non-CS / C++ experts. But maybe it's not the goal of the specification to do so. I will let other people chime in.

Where I'd like us to end up eventually is even closer to ISO C++, so each function has its own Synchronization: section that says what synchronization operations (if any) are involved. So, the description of group_barrier would have a "Synchronization" section saying that all the work-items must arrive at the barrier before any barrier is unblocked. Only functions with synchronization effects would have such a section.

Would that help? I meant to write something about this in the PR description, but forgot.

@TApplencourt
Copy link
Contributor

Would that help? I meant to write something about this in the PR description, but forgot.

Yes! Then, perfect, we can merge this as-is and eagerly wait for the next PR :).

I guess my comment boiled down to more of a philosophical point. I would like to keep the spec readable for mere mortals.

Of course, clarify / precision / unambiguous terms are the critical things in a spec! No debate about it. But if we can avoid becoming like some other specs that are totally unreadable by non-implementers, I will appreciate it. I guess I want a "SYCL Spec for Dummies" sentences from time to time :)

adoc/chapters/architecture.adoc Outdated Show resolved Hide resolved
adoc/chapters/architecture.adoc Outdated Show resolved Hide resolved
adoc/chapters/programming_interface.adoc Outdated Show resolved Hide resolved
Pennycook and others added 3 commits November 9, 2023 07:27
Consistent with the rest of the sentence.

Co-authored-by: Greg Lueck <gregory.m.lueck@intel.com>
Descriptions of accessors elsewhere in the specification say that functions
block "until data is available", as not all implementations of accessors are
guaranteed or required to copy data.
Copy link
Collaborator

@AerialMantis AerialMantis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks! Just a few minor comments.

adoc/chapters/architecture.adoc Outdated Show resolved Hide resolved
adoc/chapters/architecture.adoc Outdated Show resolved Hide resolved
adoc/chapters/architecture.adoc Show resolved Hide resolved
@@ -5961,7 +5962,7 @@ property::image::use_mutex
application via a [code]#std::mutex# provided to the
property. The [code]#std::mutex# is locked by
the runtime whenever the data is in use and unlocked
otherwise. Data is synchronized with [code]#hostData#, when
otherwise. Data is copied back to [code]#hostData# when
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this is worded suggests that data will always be copied back once the mutex is unlocked, but if the buffer would ordinarily wait on commands accessing that data on destruction it will also coordinate, and if the buffer would ordinarily not wait, then there will be no copy back.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I understand how to fix this, primarily because I still don't think I understand how all the buffer destruction rules work. Could you suggest alternative wording to fix this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could rewrite that last sentence to read:

The contents of hostData are guaranteed to reflect contents of the image when the std::mutex is unlocked by the runtime.

Note that this paragraph describes the use_mutex property for images. I think there is a similar paragraph earlier in the spec that describes the use_mutex property for buffers. That should also be changed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made both changes in d341f85.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this wording makes sense, though I think we should also clarify that this only occurs if there is a host pointer to copy to.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason to use use_mutex if there isn't a host pointer? I've not used it, but my understanding from reading the spec was that it's only useful to synchronize access to hostData.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, though technically it is possible to create a buffer with use_mutex and without a host pointer and mutex would be on any host allocation for the memory, you could also use set_final_data to set a host pointer.

Saying that I think it's fair to assume that it's implicit that the contents of hostData is only reflected if a host pointer is provided, so I think the wording is fine as it is.

Also add a brief paragraph to explain the connection between coordination and
synchronization.
@Pennycook
Copy link
Contributor Author

Pennycook commented Nov 9, 2023

Of course, clarify / precision / unambiguous terms are the critical things in a spec! No debate about it. But if we can avoid becoming like some other specs that are totally unreadable by non-implementers, I will appreciate it. I guess I want a "SYCL Spec for Dummies" sentences from time to time :)

After thinking about this some more, I've renamed the section to "Coordination and Synchronization", and added a paragraph that attempts to connect the two. See 76978a5.

Pennycook and others added 3 commits November 9, 2023 08:30
Co-authored-by: Gordon Brown <gordon@codeplay.com>
Co-authored-by: Gordon Brown <gordon@codeplay.com>
@gmlueck
Copy link
Contributor

gmlueck commented Nov 9, 2023

@Pennycook: we decided in the WG that we can merge this PR once @AerialMantis approves it (indicating he is happy with your resolution of his concerns). I will hold off merging #481 until this happens.

It looks like there are merged conflicts that need to be resolved, so please do that also @Pennycook.

Pennycook and others added 2 commits November 9, 2023 10:22
@gmlueck gmlueck merged commit 463a26c into KhronosGroup:SYCL-2020/master Nov 10, 2023
2 checks passed
keryell pushed a commit that referenced this pull request Sep 10, 2024
Clarify sections previously using "synchronize"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants