Fix over synchronization of epoll #3946

FrankReh · 2024-10-06T14:23:10Z

The clock used by epoll is now tracked per generated event, rather than once for the epoll's ready_list.
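To illustrate why a single clock on the ready_list over-synchronizes, here is a minimal sketch using a toy vector clock. The `VClock`, `Event`, and `demo` names are hypothetical and are not Miri's actual types. The sketch only assumes that a shared clock accumulates every writer's clock, while a per-event clock carries only its producer's clock.

```rust
// Hypothetical toy vector clock; NOT Miri's actual clock type.
#[derive(Clone, Debug, Default, PartialEq)]
struct VClock(Vec<u64>);

impl VClock {
    // Advance the given thread's own component.
    fn tick(&mut self, thread: usize) {
        if self.0.len() <= thread {
            self.0.resize(thread + 1, 0);
        }
        self.0[thread] += 1;
    }

    // Pointwise maximum: acquire everything `other` has observed.
    fn join(&mut self, other: &VClock) {
        if self.0.len() < other.0.len() {
            self.0.resize(other.0.len(), 0);
        }
        for (mine, theirs) in self.0.iter_mut().zip(&other.0) {
            *mine = (*mine).max(*theirs);
        }
    }

    // True if everything recorded in `self` is also known to `other`.
    fn happens_before(&self, other: &VClock) -> bool {
        self.0
            .iter()
            .enumerate()
            .all(|(i, &t)| t <= other.0.get(i).copied().unwrap_or(0))
    }
}

// Hypothetical event that carries the clock of the thread that produced it.
struct Event {
    clock: VClock,
}

// Returns whether a reader that only saw writer 0's event appears
// synchronized with writer 1, for (shared clock, per-event clock).
fn demo() -> (bool, bool) {
    // Two writer threads (ids 0 and 1), each with its own clock.
    let mut w0 = VClock::default();
    let mut w1 = VClock::default();
    w0.tick(0);
    w1.tick(1);

    // One clock for the whole ready list: it accumulates every writer.
    let mut shared = VClock::default();
    shared.join(&w0);
    shared.join(&w1);

    // One clock per event: only the producer's clock is stored.
    let ev0 = Event { clock: w0.clone() };

    // The reader receives only the event from writer 0.
    let mut reader_shared = VClock::default();
    reader_shared.join(&shared);
    let mut reader_per_event = VClock::default();
    reader_per_event.join(&ev0.clock);

    (
        w1.happens_before(&reader_shared),
        w1.happens_before(&reader_per_event),
    )
}
```

With the shared clock the reader looks synchronized with writer 1 even though it never received that writer's event, which could hide a data race. With the per-event clock it does not.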

The same epoll tests that existed before are unchanged and still pass. Also the tokio test case we had worked on last week still passes with this change.

This change does raise the question of how the epoll event states should change. Perhaps, rather than exposing crate-public bool fields, setters should be provided that take a clock parameter or an optional clock parameter. Also, should all the epoll event possibilities have their clock sync tested the way these commits lay out testing? In this first go-around, only the pipe's EPOLLIN is tested. EPOLLOUT might deserve testing too, as would the eventfd. Any future source of epoll events would also fit into that category.

FrankReh · 2024-10-06T14:26:02Z

This is left as a draft for at least two reasons. There is an open question about why the fix doesn't work for the test case as it was outlined in the issue. That open question is explained in the test cases.

Then there are some field names that can be simplified, to conform to existing names, but while working on this file, I prefer having field names that are unique so it is easy to move around the file based on where those fields are used.

I guess there is a third reason: before final review, the open test question will have been resolved and the tests will likely be reduced.

src/shims/unix/unnamed_socket.rs

FrankReh · 2024-10-06T17:12:20Z

I wonder if the problem I'm seeing is caused by my assumption that the source thread should have its clock incremented just once. That one increment is part of the happens-before relationship between this thread and any thread receiving the event. But to create a happens-after clock, the clock could simply be incremented one more time as part of the cleanup after creating one or more epoll events. Then my tests would pass as expected and the question I had would be resolved.

FrankReh · 2024-10-06T22:51:26Z

I have found a way to fix the problem I was seeing. It involves incrementing the clock twice when the pipe's buffer is being written to. Once to prepare the clock for the happens-before relationship. And once again to represent a happens-after relationship for writes by the same thread that happen after this write. If the write to the global static also updated the clock on its own, this probably wouldn't be necessary.
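The effect of the second increment can be sketched with a toy model that tracks only the writer's component of a vector clock. The function names and the `tick_after_release` flag are hypothetical and purely illustrative; this is not Miri's data-race detector.

```rust
// Hypothetical toy model: one writer thread, one reader. Timestamps are the
// writer's own component of a vector clock.

// Returns (timestamp published with the event, timestamp of a later write).
fn publish_then_write(tick_after_release: bool) -> (u64, u64) {
    let mut writer_time: u64 = 0;

    // First increment: give the release its own timestamp.
    writer_time += 1;
    let released = writer_time; // snapshot stored with the epoll event

    // Second increment: separate later writes from the released snapshot.
    if tick_after_release {
        writer_time += 1;
    }

    // A write to a global static performed after the pipe write.
    let later_write = writer_time;
    (released, later_write)
}

// The reader acquired `released`. A write is ordered before the reader's
// access only if its timestamp is covered by what the reader acquired.
fn race_detected(released: u64, later_write: u64) -> bool {
    later_write > released
}
```

With a single increment the later write shares the released timestamp, so the reader wrongly treats it as synchronized. With the second increment the later write is newer than anything the reader acquired, and the race becomes visible.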

This raises new questions.

Should the pipe's read side also be epoll-able? Right now there is no such event created.

And does the eventfd, which also supports epoll, suffer from the same problem: syncing clocks through epoll but then not advancing the clock, so that UB from later writes to global statics goes undetected?

RalfJung · 2024-10-07T06:21:14Z

> Should the pipe's read side also be epoll-able? Right now there is no such event created.

Both ends are already epoll-able, I think.

src/shims/unix/unnamed_socket.rs

RalfJung · 2024-10-07T21:10:33Z

Let's fix #3947 (in a different PR) before attempting this.

bors · 2024-10-08T13:47:18Z

☔ The latest upstream changes (presumably #3951) made this pull request unmergeable. Please resolve the merge conflicts.

FrankReh · 2024-10-09T02:43:10Z

The clock field could also be an Option. Maybe less work when clocks aren't enabled.

RalfJung · 2024-10-09T05:44:42Z

Default clocks are trivial, so no, that shouldn't be necessary, and I don't think it helps the code either.

RalfJung · 2024-10-09T05:45:07Z

Not sure what the status here is, so
@rustbot author

FrankReh · 2024-10-09T10:42:48Z

> Not sure what the status here is, so
> @rustbot author

I'm not familiar with this step. Am I the author being asked for the status or is this for a maintainer?
It's marked ready for review. It has a test for before and a test for after showing the improvement. Is there something else for me to do?

RalfJung · 2024-10-09T11:15:11Z

Sorry, I should have explained. If it is ready for review, you are expected to comment @rustbot ready.

src/shims/unix/linux/epoll.rs

tests/fail-dep/libc/libc-epoll-clock-sync.rs

RalfJung · 2024-10-09T12:13:10Z

@rustbot author

tests/fail-dep/libc/libc-epoll-data-race.rs

src/shims/unix/linux/epoll.rs

FrankReh · 2024-10-09T16:31:03Z

@rustbot ready

RalfJung · 2024-10-09T16:41:57Z

Looks great, thanks!

@bors r+

bors · 2024-10-09T16:42:00Z

📌 Commit 4e4af8d has been approved by RalfJung

It is now in the queue for this repository.

bors · 2024-10-09T16:43:08Z

⌛ Testing commit 4e4af8d with merge e0bd116...

bors · 2024-10-09T17:08:28Z

☀️ Test successful - checks-actions
Approved by: RalfJung
Pushing e0bd116 to master...

tiif reviewed Oct 6, 2024

src/shims/unix/unnamed_socket.rs (outdated)

RalfJung reviewed Oct 7, 2024

src/shims/unix/unnamed_socket.rs (outdated)

RalfJung reviewed Oct 7, 2024

src/shims/unix/unnamed_socket.rs (outdated)

RalfJung added the S-blocked Status: blocked on something happening somewhere else label Oct 7, 2024

epoll: test case showing too much clock sync

b97232b

FrankReh force-pushed the fix-over-synchronization-of-epoll branch from 5c64408 to 4a942d7 Compare October 9, 2024 01:50

FrankReh marked this pull request as ready for review October 9, 2024 02:41

RalfJung removed the S-blocked Status: blocked on something happening somewhere else label Oct 9, 2024

rustbot added the S-waiting-on-author Status: Waiting for the PR author to address review comments label Oct 9, 2024

rustbot added S-waiting-on-review Status: Waiting for a review to complete and removed S-waiting-on-author Status: Waiting for the PR author to address review comments labels Oct 9, 2024

RalfJung reviewed Oct 9, 2024

rustbot added S-waiting-on-author Status: Waiting for the PR author to address review comments and removed S-waiting-on-review Status: Waiting for a review to complete labels Oct 9, 2024

FrankReh force-pushed the fix-over-synchronization-of-epoll branch from 0d69a12 to 927a2ae Compare October 9, 2024 14:02

RalfJung reviewed Oct 9, 2024

tests/fail-dep/libc/libc-epoll-data-race.rs (outdated)

RalfJung reviewed Oct 9, 2024

src/shims/unix/linux/epoll.rs (outdated)

epoll: change clock to be per event

4e4af8d

FrankReh force-pushed the fix-over-synchronization-of-epoll branch from b825e81 to 4e4af8d Compare October 9, 2024 16:26

rustbot added S-waiting-on-review Status: Waiting for a review to complete and removed S-waiting-on-author Status: Waiting for the PR author to address review comments labels Oct 9, 2024

bors merged commit e0bd116 into rust-lang:master Oct 9, 2024
8 checks passed

FrankReh deleted the fix-over-synchronization-of-epoll branch October 9, 2024 19:28
