Skip to content

[lldb-dap] Fix TestDap_attach.py flakiness #137278

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

kusmour
Copy link
Contributor

@kusmour kusmour commented Apr 25, 2025

Summary

This patch makes the request_attach wait for events process and initialized just like request_launch. This ensure the DAP session can move forward somewhat correctly.

Recently TestDap_attach.test_terminate_commands became flaky.
It's hitting:

lldbsuite/test/tools/lldb-dap/dap_server.py", line 350, in send_recv
    raise ValueError(desc)
ValueError: no response for "disconnect"

I took a look at the DAP msg from that test case and noticed:

  • It's not using the regular attaching, instead it's using the attachCommands to launch debug the binary and it will stop at entry.
  • The initialized event returned after the disconnect request. Which means lldb-dap didn't really get ready yet.

NOTE

The dap_server.py is doing things to mimic the VSCode (or other dap clients) but it had some assumptions. For example, it's still missing the configurationDone request and response because it relies on a continue action to trigger the configurationDone request.

Test Plan

./bin/llvm-lit -va /Users/wanyi/llvm-upstream/llvm-project/lldb/test/API/tools/lldb-dap/attach/TestDAP_attach.py
./bin/llvm-lit -va /Users/wanyi/llvm-upstream/llvm-project/lldb/test/API/tools/lldb-dap/launch/TestDAP_launch.py

Also

Looks like some test cases should be re-enabled in 0b8dfb5
But only comments was removed. The skip statements survived the change.

@kusmour kusmour requested a review from JDevlieghere as a code owner April 25, 2025 00:55
@llvmbot llvmbot added the lldb label Apr 25, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 25, 2025

@llvm/pr-subscribers-lldb

Author: Wanyi (kusmour)

Changes

This patch makes the request_attach wait for events process and initialized just like request_launch. This ensure the DAP session can move forward somewhat correctly.

Recently TestDap_attach.test_terminate_commands became flaky.
It's hitting:

lldbsuite/test/tools/lldb-dap/dap_server.py", line 350, in send_recv
    raise ValueError(desc)
ValueError: no response for "disconnect"

I took a look at the DAP msg from that test case and noticed:

  • It's not using the regular attaching, instead it's using the attachCommands to launch debug the binary and it will stop at entry.
  • The initialized event returned after the disconnect request. Which means lldb-dap didn't really get ready yet.

NOTE: The dap_server.py is doing things to mimic the VSCode (or other dap clients) but it had some assumptions. For example, it's still missing the configurationDone request and response because it relies on a continue action to trigger the configurationDone request.


Full diff: https://github.com/llvm/llvm-project/pull/137278.diff

1 Files Affected:

  • (modified) lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py (+7-1)
diff --git a/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py b/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
index dadf6b1f8774c..20f5286da3203 100644
--- a/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
+++ b/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
@@ -612,7 +612,13 @@ def request_attach(
         if gdbRemoteHostname is not None:
             args_dict["gdb-remote-hostname"] = gdbRemoteHostname
         command_dict = {"command": "attach", "type": "request", "arguments": args_dict}
-        return self.send_recv(command_dict)
+        response = self.send_recv(command_dict)
+
+        if response["success"]:
+            # Wait for a 'process' and 'initialized' event in any order
+            self.wait_for_event(filter=["process", "initialized"])
+            self.wait_for_event(filter=["process", "initialized"])
+        return response
 
     def request_breakpointLocations(
         self, file_path, line, end_line=None, column=None, end_column=None

@kusmour kusmour force-pushed the lldb-dap-flaky-test-attach branch 2 times, most recently from 956f2af to 2c5ff68 Compare April 25, 2025 02:01
@kusmour kusmour force-pushed the lldb-dap-flaky-test-attach branch from b87f1fe to 0e6d419 Compare April 25, 2025 04:13
kusmour added 3 commits April 25, 2025 11:10
Looks like these 2 test cases should be re-enabled in llvm@0b8dfb5
But only the comments got removed. The skip statement survived the change.
Summary:
Looks like some test cases should be re-enabled in llvm@0b8dfb5
But only comments was removed. The skip statements survived the change.

Test Plan:
./bin/llvm-lit -va /Users/wanyi/llvm-upstream/llvm-project/lldb/test/API/tools/lldb-dap/attach/TestDAP_attach.py
All tests passed locally on my mac
@kusmour kusmour force-pushed the lldb-dap-flaky-test-attach branch from 0e6d419 to aa7540d Compare April 25, 2025 18:11
Copy link
Contributor

@dmpots dmpots left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I'm not too familiar with the dap protocol, so would be good to get another approver.

events = events[:] # Make a copy to avoid modifying the input
while events:
event = self.wait_for_event(filter=events, timeout=timeout)
events.remove(event["event"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we hit a timeout, wont event be None?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah it looks like its not handling the timeout correctly.

events = events[:] # Make a copy to avoid modifying the input
while events:
event = self.wait_for_event(filter=events, timeout=timeout)
events.remove(event["event"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah it looks like its not handling the timeout correctly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants