Skip to content

Conversation

@ashgti
Copy link
Contributor

@ashgti ashgti commented May 12, 2025

A number of unit tests are unstable at the moment and I believe this is due to event handling between the event thread and the DAP::Loop.

One way this manifests is the 'disconnect' request terminating the SBTarget while the event handler is still handling module events. This is causing a SIGPIPE between the debugserver and the lldb-dap process.

I have some additional follow up patches to address test event synchronization, since many tests seem to be under specified in terms of their expected state that I think is contributing to these kinds of races.

A number of unit tests are unstable at the moment and I believe this is due to event handling between the event thread and the DAP::Loop.

One way this manifests is the 'disconnect' request terminating the SBTarget while the event handler is still handling module events. This is causing a SIGPIPE between the debugserver and the lldb-dap process.

I have some additional follow up patches to address test event synchronization, since many tests seem to be under specified in terms of their expected state that I think is contributing to these kinds of races.
@llvmbot
Copy link
Member

llvmbot commented May 12, 2025

@llvm/pr-subscribers-lldb

Author: John Harrison (ashgti)

Changes

A number of unit tests are unstable at the moment and I believe this is due to event handling between the event thread and the DAP::Loop.

One way this manifests is the 'disconnect' request terminating the SBTarget while the event handler is still handling module events. This is causing a SIGPIPE between the debugserver and the lldb-dap process.

I have some additional follow up patches to address test event synchronization, since many tests seem to be under specified in terms of their expected state that I think is contributing to these kinds of races.


Full diff: https://github.com/llvm/llvm-project/pull/139596.diff

1 Files Affected:

  • (modified) lldb/tools/lldb-dap/DAP.cpp (+5)
diff --git a/lldb/tools/lldb-dap/DAP.cpp b/lldb/tools/lldb-dap/DAP.cpp
index 4feca1253be20..e84d3d9e7eed8 100644
--- a/lldb/tools/lldb-dap/DAP.cpp
+++ b/lldb/tools/lldb-dap/DAP.cpp
@@ -1504,6 +1504,11 @@ void DAP::EventThread() {
   bool done = false;
   while (!done) {
     if (listener.WaitForEvent(1, event)) {
+      // Once we get an event, make sure we finish handling it before the main
+      // thread handles the next DAP request.
+      lldb::SBMutex lock = GetAPIMutex();
+      std::lock_guard<lldb::SBMutex> guard(lock);
+
       const auto event_mask = event.GetType();
       if (lldb::SBProcess::EventIsProcessEvent(event)) {
         lldb::SBProcess process = lldb::SBProcess::GetProcessFromEvent(event);

@JDevlieghere
Copy link
Member

JDevlieghere commented May 12, 2025

I too have been contemplating this for a while...

Given that the request handlers currently lock the API mutex and now the event thread does the same, this patch means that the two are essentially fully synchronized. That begs the question: Should these be separate threads at all? We could sidestep all the synchronization issues by handling request and events on the same thread using the MainLoop. In this scenario, we'd still have the event thread, but all it would do is wait for events and enqueue handling them on the main/request thread.

@ashgti
Copy link
Contributor Author

ashgti commented May 12, 2025

I've been thinking about that as well. I wasn't sure if there was an easy way to integrate the event system into a MainLoop helper but that would definitely help.

@ashgti
Copy link
Contributor Author

ashgti commented May 12, 2025

I'll take a look at refactoring the DAP::Loop and EventThread.

@ashgti
Copy link
Contributor Author

ashgti commented May 13, 2025

Okay, I got this working with a RunLoop instead in PR #139669. All the macOS tests are passing for me, but I'll let the CI run to check linux and fix any issues with that before making the PR ready for review.

@ashgti ashgti closed this May 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants