fix: data race on client/server shutdown #1228
Conversation
Hey @tyler92. Thanks for the PR. Question for you. Isn't |
Hi. Yes, you are right. So in my example, I launch the server with |
Might the solution then be to change the unit tests to use the blocking version instead? |
As a workaround - yes, it's possible. But we still have a problem with the current solution: how to stop the server after |
But isn't that the expected behaviour though? If the user requested to start the server asynchronously? |
If a user called … But even in this scenario, the user should be able to stop and destroy the server correctly, without data races or other memory issues. Right now that's not possible, as far as I could see |
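The distinction under discussion, a blocking serve() versus an asynchronous serveThreaded(), and why a safe stop() must join the loop thread, can be sketched with a minimal mock. All names here are hypothetical illustrations, not Pistache's actual API:

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Minimal mock of the two start modes discussed above: serve() blocks the
// caller, serveThreaded() returns immediately and runs the event loop on a
// background thread. For the threaded mode to be safe, stop() must both
// signal the loop AND join the thread before the object's state can be
// destroyed. Hypothetical names, not Pistache's real API.
class MockServer {
public:
    void serve() { runLoop(); }  // blocking mode: returns only after stop()

    void serveThreaded() {       // async mode: returns immediately
        loopThread_ = std::thread([this] { runLoop(); });
    }

    void stop() {
        stopFlag_.store(true);
        if (loopThread_.joinable())
            loopThread_.join();  // without this join, ~MockServer could free
                                 // state the loop thread is still reading
    }

    ~MockServer() { stop(); }

private:
    void runLoop() {
        // Stand-in for the reactor loop.
        while (!stopFlag_.load())
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }

    std::atomic<bool> stopFlag_{false};
    std::thread loopThread_;
};
```

Without the join inside stop(), destroying the object after serveThreaded() could free state the loop thread is still reading, which is exactly the kind of use-after-free the sanitizer reports.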
Ok, that makes sense. I'll wait to hear @Tachi107 and @dgreatwood's thoughts. |
OK. Just for clarification: I don't insist on my solution; there is probably a more correct way to achieve the same result |
I’ll try and take a look in the next few days.
|
Hi @tyler92,
Thanks for raising this.
I had a look at the code in reactor.cc. I was wondering if there is a
somewhat more general issue, namely that "handlers" (which is a
HandlerList) in SyncImpl is not protected in the event of multithreaded
access.
We do already have a mutex poller.reg_unreg_mutex_ which protects against a
handler getting deregistered while in use. But this is more like protection
of individual handlers, rather than of the HandlerList as a whole.
Accordingly, as an experiment I have created a branch, HandlerListMultiThread, and opened PR #1229 for it so you can see it easily. In the HandlerListMultiThread branch there is a mutex, handlers_arr_mutex_, private within HandlerList, which is used to guard (aka "lock") the std::array of handlers whenever the array is accessed within HandlerList's methods.
Looking at your asan.txt:
- In thread T148321, which does the access-after-free,
the HandlerList::forEachHandler function is in use, called out
of SyncImpl::run()
- In thread T0, the accessed address was previously freed by a call
to HandlerList::removeAll
In this new HandlerListMultiThread branch code, HandlerList::forEachHandler
and removeAll both have to acquire the mutex
HandlerList::handlers_arr_mutex_. I believe then that the new mutex should
stop the "removeAll" from stamping on the list before SyncImpl::run has
finished with it.
Could you please try out this new branch with your test scenario? (Without your "join" code, of course.)
https://github.com/dgreatwood/pistachefork/tree/HandlerListMultiThread
(and see #1229)
Thanks once more!
Duncan
P.S. The above is concerned just with changes to reactor.cc. I haven't
looked at the other proposed changes in tests/tcp_client.h, though they
look logical at first glance.
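The locking scheme described in this comment can be sketched roughly as follows. This is a simplified illustration built from the names used in the description (Handler, handlers_arr_mutex_, forEachHandler, removeAll), not the actual Pistache code:

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>

struct Handler { int id; };  // stand-in for a real reactor handler

// Sketch of the idea in the HandlerListMultiThread branch: a mutex owned by
// the list guards the underlying std::array, so that forEachHandler cannot
// race with removeAll. Names are illustrative.
class HandlerList {
public:
    static constexpr std::size_t MaxHandlers = 255;

    void add(std::shared_ptr<Handler> h) {
        std::lock_guard<std::mutex> guard(handlers_arr_mutex_);
        for (auto& slot : handlers_) {
            if (!slot) { slot = std::move(h); return; }
        }
    }

    // Called from the reactor's run loop; holds the lock while iterating.
    void forEachHandler(
        const std::function<void(const std::shared_ptr<Handler>&)>& fn) {
        std::lock_guard<std::mutex> guard(handlers_arr_mutex_);
        for (const auto& slot : handlers_)
            if (slot) fn(slot);
    }

    // Called on shutdown; cannot stamp on the array while forEachHandler is
    // mid-iteration, because both must acquire the same mutex.
    void removeAll() {
        std::lock_guard<std::mutex> guard(handlers_arr_mutex_);
        for (auto& slot : handlers_) slot.reset();
    }

private:
    std::mutex handlers_arr_mutex_;
    std::array<std::shared_ptr<Handler>, MaxHandlers> handlers_{};
};
```

Because both the iteration and the teardown serialize on the same lock, removeAll can only run between complete passes of forEachHandler, never in the middle of one.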
|
Yes, it's a "bonus" change that fixes an FD leak, but I can PR it separately |
There is a data race in a quite simple scenario:
Address sanitizer report: asan.txt
As far as I understand, it's not legal to modify SyncImpl::handlers_ before SyncImpl::run is finished. So the fix joins the worker threads before SyncImpl::handlers_ is invalidated. I'm not sure the fix is correct from the original design point of view, but on my local machine ASAN is happy with the provided example.
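The join-before-invalidation ordering of the fix can be illustrated generically. This is a sketch with made-up names standing in for SyncImpl::run and handlers_, not the actual patch:

```cpp
#include <atomic>
#include <memory>
#include <thread>
#include <vector>

// Generic illustration of the fix described above: a worker thread (standing
// in for SyncImpl::run) reads a shared handler structure, so that structure
// may only be invalidated AFTER the worker has been joined.
struct SharedHandlers {
    std::vector<int> entries{1, 2, 3};
};

long runAndShutdown() {
    auto handlers = std::make_unique<SharedHandlers>();
    std::atomic<bool> stop{false};
    std::atomic<long> iterations{0};

    std::thread worker([&] {
        // Stand-in for the reactor loop: repeatedly walks the handler list.
        while (!stop.load()) {
            for (int e : handlers->entries)
                (void)e;  // pretend to dispatch to each handler
            ++iterations;
        }
    });

    stop.store(true);
    worker.join();     // 1) join the worker first (this is the fix) ...
    handlers.reset();  // 2) ... only then invalidate the shared state
    return iterations.load();
}
```

Reversing steps 1 and 2 (freeing handlers while the worker may still be looping) is the heap-use-after-free pattern shown in the attached asan.txt.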
I think the following issues are related: #842 #1018 #539