Skip to content

Conversation

@virtu
Copy link
Contributor

@virtu virtu commented Jun 6, 2023

Address #27380 (and similar future issues) by handling failed assert_equal() assertions in bcc callback functions

Problem

Exceptions are not propagated in ctype callback functions used by bcc. This means an AssertionError exception raised by assert_equal() to signal a failed assertion is not getting caught and properly logged. Instead, the error is logged to stdout and execution of the callback stops.

The current workaround to check whether all assert_equal() assertions in a callback succeeded is to increment a success counter after the assertions (which only gets incremented if none exception is raised and stops execution). Then, outside the callback, the success counter can be used to check whether a callback executed successfully.

One issue with the described workaround is that when an exception occurs, there is no way of telling which of the assert_equal() statements caused the exception; moreover, there is no way of inspecting how the pieces of data that got compared in assert_equal() differed (often a crucial clue when debugging what went wrong).

This problem is happening in #27380: Sporadically, in the mempool:rejected test, execution does not reach the end of the callback function and the success counter is not incremented. Thus, the test fails when comparing the counter to its expected value of one. Without knowing which of the asserts failed any why it failed, this issue is hard to debug.

Solution

Two fixes come to mind. The first involves having the callback function make event data accessible outside the callback and inspecting the event using assert_equal() outside the callback. This solution still requires a counter in the callback in order to tell whether a callback was actually executed or if instead the call to perf_buffer_poll() timed out.

The second fix entails wrapping all relevant assert_equal() statements inside callback functions into try-catch blocks and manually logging AssertionErrors. While not as elegant in terms of design, this approach can be more pragmatic for more complex tests (e.g., ones involving multiple events, events of different types, or the order of events).

The solution proposed here is to select the most pragmatic fix on a case-by-case basis: Tests in interface_usdt_net.py, interface_usdt_mempool.py and interface_usdt_validation.py have been refactored to use the first approach, while the second approach was chosen for interface_usdt_utxocache.py (partly to provide a reference for the second approach, but mainly because the utxocache tests are the most intricate tests, and refactoring them to use the first approach would negatively impact their readability). Lastly, interface_usdt_coinselection.py was kept unchanged because it does not use assert_equal() statements inside callback functions.

@DrahtBot
Copy link
Contributor

DrahtBot commented Jun 6, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK 0xB10C, willcl-ark, stickies-v

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

@0xB10C
Copy link
Contributor

0xB10C commented Jun 6, 2023

Concept ACK. Thanks for picking this up! Seeing the asserts fail in the CI logs is needed and would have helped figuring out the reason for #27380.

(I think you meant to write assert_equal() instead of check_equal())

@virtu
Copy link
Contributor Author

virtu commented Jun 6, 2023

(I think you meant to write assert_equal() instead of check_equal())

You're right. I don't think I can change it in the PR title, but I at least fixed it in the description.

@fanquake fanquake changed the title test: handle failed check_equal() assertions in bcc callback functions test: handle failed assert_equal() assertions in bcc callback functions Jun 6, 2023
@0xB10C
Copy link
Contributor

0xB10C commented Jun 9, 2023

ACK 4798c45. I code-reviewed the changes and they look good to me. The approach taken (applying the two solutions outlined in the OP on a case-by-case basis) seems OK to me.

With #27380 in mind, I've checked that this commit 0xB10C@4e00fbb on top of this PR fails with AssertionError: not(min relay fee not met == min erlay fee not met) as expected in the CI logs compared to a similar change 0xB10C@4091d62 on master, which only shows AssertionError: not(1 == 0) in the CI logs.

@DrahtBot DrahtBot removed the CI failed label Jun 9, 2023
@fanquake fanquake requested a review from willcl-ark June 12, 2023 12:54
@willcl-ark
Copy link
Member

willcl-ark commented Jun 12, 2023

ACK 4798c45

I was a bit unfamilar with this python library so reviewed the bcc docs and reference guide before getting started...

Agree that these changes allow us to see errors propagated out into the test framework logs. I broke a few of the tests to ensure that this was the case.

nit: actually we currently dump errors to stderr, not stdout, but the end result is roughly the same :)

Before vs After failure output

before:

✗ sudo test/functional/interface_usdt_mempool.py
[sudo] password for will:
2023-06-12T19:58:57.016000Z TestFramework (INFO): PRNG seed is: 3746665899140184430
2023-06-12T19:58:57.016000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test_kocthg6x
2023-06-12T19:58:57.489000Z TestFramework (INFO): Hooking into mempool:added tracepoint...
2023-06-12T19:58:58.266000Z TestFramework (INFO): Sending transaction...
2023-06-12T19:58:58.275000Z TestFramework (INFO): Polling buffer...
Exception ignored on calling ctypes callback function: <function PerfEventArray._open_perf_buffer.<locals>.raw_cb_ at 0x7f3df4f2cb80>
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/bcc/table.py", line 982, in raw_cb_
    callback(cpu, data, size)
  File "/home/will/src/bitcoin/test/functional/interface_usdt_mempool.py", line 153, in handle_added_event
    assert_equal(vsize, event.vsize + 1)
  File "/home/will/src/bitcoin/test/functional/test_framework/util.py", line 57, in assert_equal
    raise AssertionError("not(%s)" % " == ".join(str(arg) for arg in (thing1, thing2) + args))
AssertionError: not(104 == 105)
2023-06-12T19:58:58.276000Z TestFramework (INFO): Cleaning up mempool...
2023-06-12T19:58:58.326000Z TestFramework (INFO): Ensuring mempool:added event was handled successfully...
2023-06-12T19:58:58.326000Z TestFramework (ERROR): Assertion failed
Traceback (most recent call last):
  File "/home/will/src/bitcoin/test/functional/test_framework/test_framework.py", line 132, in main
    self.run_test()
  File "/home/will/src/bitcoin/test/functional/interface_usdt_mempool.py", line 340, in run_test
    self.added_test()
  File "/home/will/src/bitcoin/test/functional/interface_usdt_mempool.py", line 175, in added_test
    assert_equal(EXPECTED_ADDED_EVENTS, handled_added_events)
  File "/home/will/src/bitcoin/test/functional/test_framework/util.py", line 57, in assert_equal
    raise AssertionError("not(%s)" % " == ".join(str(arg) for arg in (thing1, thing2) + args))
AssertionError: not(1 == 0)
2023-06-12T19:58:58.379000Z TestFramework (INFO): Stopping nodes
2023-06-12T19:58:58.582000Z TestFramework (WARNING): Not cleaning up dir /tmp/bitcoin_func_test_kocthg6x
2023-06-12T19:58:58.582000Z TestFramework (ERROR): Test failed. Test logging available at /tmp/bitcoin_func_test_kocthg6x/test_framework.log
2023-06-12T19:58:58.582000Z TestFramework (ERROR):
2023-06-12T19:58:58.582000Z TestFramework (ERROR): Hint: Call /home/will/src/bitcoin/test/functional/combine_logs.py '/tmp/bitcoin_func_test_kocthg6x' to consolidate all logs
2023-06-12T19:58:58.582000Z TestFramework (ERROR):
2023-06-12T19:58:58.582000Z TestFramework (ERROR): If this failure happened unexpectedly or intermittently, please file a bug and provide a link or upload of the combined log.
2023-06-12T19:58:58.582000Z TestFramework (ERROR): https://github.com/bitcoin/bitcoin/issues
2023-06-12T19:58:58.582000Z TestFramework (ERROR):

After:

₿ sudo test/functional/interface_usdt_mempool.py
2023-06-12T20:00:49.382000Z TestFramework (INFO): PRNG seed is: 7011104464737242561
2023-06-12T20:00:49.383000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test_gwj55flk
2023-06-12T20:00:49.808000Z TestFramework (INFO): Hooking into mempool:added tracepoint...
2023-06-12T20:00:50.590000Z TestFramework (INFO): Sending transaction...
2023-06-12T20:00:50.602000Z TestFramework (INFO): Polling buffer...
2023-06-12T20:00:50.602000Z TestFramework (INFO): Cleaning up mempool...
2023-06-12T20:00:50.607000Z TestFramework (INFO): Ensuring mempool:added event was handled successfully...
2023-06-12T20:00:50.607000Z TestFramework (ERROR): Assertion failed
Traceback (most recent call last):
  File "/home/will/src/bitcoin/test/functional/test_framework/test_framework.py", line 132, in main
    self.run_test()
  File "/home/will/src/bitcoin/test/functional/interface_usdt_mempool.py", line 326, in run_test
    self.added_test()
  File "/home/will/src/bitcoin/test/functional/interface_usdt_mempool.py", line 171, in added_test
    assert_equal(event.fee, fee + 1)
  File "/home/will/src/bitcoin/test/functional/test_framework/util.py", line 57, in assert_equal
    raise AssertionError("not(%s)" % " == ".join(str(arg) for arg in (thing1, thing2) + args))
AssertionError: not(31200 == 31201)
2023-06-12T20:00:50.658000Z TestFramework (INFO): Stopping nodes
2023-06-12T20:00:50.759000Z TestFramework (WARNING): Not cleaning up dir /tmp/bitcoin_func_test_gwj55flk
2023-06-12T20:00:50.759000Z TestFramework (ERROR): Test failed. Test logging available at /tmp/bitcoin_func_test_gwj55flk/test_framework.log
2023-06-12T20:00:50.759000Z TestFramework (ERROR):
2023-06-12T20:00:50.759000Z TestFramework (ERROR): Hint: Call /home/will/src/bitcoin/test/functional/combine_logs.py '/tmp/bitcoin_func_test_gwj55flk' to consolidate all logs
2023-06-12T20:00:50.759000Z TestFramework (ERROR):
2023-06-12T20:00:50.759000Z TestFramework (ERROR): If this failure happened unexpectedly or intermittently, please file a bug and provide a link or upload of the combined log.
2023-06-12T20:00:50.759000Z TestFramework (ERROR): https://github.com/bitcoin/bitcoin/issues
2023-06-12T20:00:50.759000Z TestFramework (ERROR):

@maflcko
Copy link
Member

maflcko commented Jun 13, 2023

@willcl-ark If you want to "ack" a pull, and want the merge-script and bot to recognize it, you'll have to do it in all-upper-case

@hebasto
Copy link
Member

hebasto commented Jun 13, 2023

I've tested this PR on the old broken variant of the #23962 using TEST_RUNNER_EXTRA="--filter interface_usdt_mempool":

  • w/o this PR:
stdout:
2023-06-12T14:55:58.610000Z TestFramework (INFO): PRNG seed is: 7368400044812499558
2023-06-12T14:55:58.611000Z TestFramework (INFO): Initializing test directory /tmp/cirrus-ci-build/ci/scratch/test_runner/test_runner_₿_🏃_20230612_145556/interface_usdt_mempool_0
2023-06-12T14:56:00.723000Z TestFramework (INFO): Hooking into mempool:added tracepoint...
2023-06-12T14:56:01.437000Z TestFramework (INFO): Sending transaction...
2023-06-12T14:56:01.444000Z TestFramework (INFO): Polling buffer...
2023-06-12T14:56:01.444000Z TestFramework (INFO): Cleaning up mempool...
2023-06-12T14:56:01.485000Z TestFramework (INFO): Ensuring mempool:added event was handled successfully...
2023-06-12T14:56:01.485000Z TestFramework (ERROR): Assertion failed
Traceback (most recent call last):
  File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 132, in main
    self.run_test()
  File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/interface_usdt_mempool.py", line 340, in run_test
    self.added_test()
  File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/interface_usdt_mempool.py", line 175, in added_test
    assert_equal(EXPECTED_ADDED_EVENTS, handled_added_events)
  File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/util.py", line 57, in assert_equal
    raise AssertionError("not(%s)" % " == ".join(str(arg) for arg in (thing1, thing2) + args))
AssertionError: not(1 == 0)
2023-06-12T14:56:01.536000Z TestFramework (INFO): Stopping nodes
2023-06-12T14:56:01.638000Z TestFramework (WARNING): Not cleaning up dir /tmp/cirrus-ci-build/ci/scratch/test_runner/test_runner_₿_🏃_20230612_145556/interface_usdt_mempool_0
2023-06-12T14:56:01.638000Z TestFramework (ERROR): Test failed. Test logging available at /tmp/cirrus-ci-build/ci/scratch/test_runner/test_runner_₿_🏃_20230612_145556/interface_usdt_mempool_0/test_framework.log
2023-06-12T14:56:01.638000Z TestFramework (ERROR): 
2023-06-12T14:56:01.638000Z TestFramework (ERROR): Hint: Call /tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/combine_logs.py '/tmp/cirrus-ci-build/ci/scratch/test_runner/test_runner_₿_🏃_20230612_145556/interface_usdt_mempool_0' to consolidate all logs
2023-06-12T14:56:01.638000Z TestFramework (ERROR): 
2023-06-12T14:56:01.638000Z TestFramework (ERROR): If this failure happened unexpectedly or intermittently, please file a bug and provide a link or upload of the combined log.
2023-06-12T14:56:01.638000Z TestFramework (ERROR): https://github.com/bitcoin/bitcoin/issues
2023-06-12T14:56:01.638000Z TestFramework (ERROR): 

stderr:
Exception ignored on calling ctypes callback function: <function PerfEventArray._open_perf_buffer.<locals>.raw_cb_ at 0x7f72a4a4d940>
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/bcc/table.py", line 982, in raw_cb_
    callback(cpu, data, size)
  File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/interface_usdt_mempool.py", line 153, in handle_added_event
    assert_equal(vsize, event.vsize)
  File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/util.py", line 57, in assert_equal
    raise AssertionError("not(%s)" % " == ".join(str(arg) for arg in (thing1, thing2) + args))
AssertionError: not(104 == 0)

Combine the logs and print the last 99999999 lines ...
...
  • with this PR:
stdout:
2023-06-13T20:46:30.245000Z TestFramework (INFO): PRNG seed is: 8934552432487963566
2023-06-13T20:46:30.245000Z TestFramework (INFO): Initializing test directory /tmp/cirrus-ci-build/ci/scratch/test_runner/test_runner_₿_🏃_20230613_204627/interface_usdt_mempool_0
2023-06-13T20:46:31.761000Z TestFramework (INFO): Hooking into mempool:added tracepoint...
2023-06-13T20:46:32.426000Z TestFramework (INFO): Sending transaction...
2023-06-13T20:46:32.431000Z TestFramework (INFO): Polling buffer...
2023-06-13T20:46:32.432000Z TestFramework (INFO): Cleaning up mempool...
2023-06-13T20:46:32.437000Z TestFramework (INFO): Ensuring mempool:added event was handled successfully...
2023-06-13T20:46:32.437000Z TestFramework (ERROR): Assertion failed
Traceback (most recent call last):
  File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 132, in main
    self.run_test()
  File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/interface_usdt_mempool.py", line 326, in run_test
    self.added_test()
  File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/interface_usdt_mempool.py", line 170, in added_test
    assert_equal(event.vsize, tx["tx"].get_vsize())
  File "/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/util.py", line 57, in assert_equal
    raise AssertionError("not(%s)" % " == ".join(str(arg) for arg in (thing1, thing2) + args))
AssertionError: not(0 == 104)
2023-06-13T20:46:32.488000Z TestFramework (INFO): Stopping nodes
2023-06-13T20:46:32.589000Z TestFramework (WARNING): Not cleaning up dir /tmp/cirrus-ci-build/ci/scratch/test_runner/test_runner_₿_🏃_20230613_204627/interface_usdt_mempool_0
2023-06-13T20:46:32.589000Z TestFramework (ERROR): Test failed. Test logging available at /tmp/cirrus-ci-build/ci/scratch/test_runner/test_runner_₿_🏃_20230613_204627/interface_usdt_mempool_0/test_framework.log
2023-06-13T20:46:32.589000Z TestFramework (ERROR): 
2023-06-13T20:46:32.589000Z TestFramework (ERROR): Hint: Call /tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/combine_logs.py '/tmp/cirrus-ci-build/ci/scratch/test_runner/test_runner_₿_🏃_20230613_204627/interface_usdt_mempool_0' to consolidate all logs
2023-06-13T20:46:32.589000Z TestFramework (ERROR): 
2023-06-13T20:46:32.590000Z TestFramework (ERROR): If this failure happened unexpectedly or intermittently, please file a bug and provide a link or upload of the combined log.
2023-06-13T20:46:32.590000Z TestFramework (ERROR): https://github.com/bitcoin/bitcoin/issues
2023-06-13T20:46:32.590000Z TestFramework (ERROR): 

stderr:

Combine the logs and print the last 99999999 lines ...
...

Note the absence of the "Exception ignored on calling ctypes callback function..." message in the stderr in the latter case in comparison to the former one.

@DrahtBot DrahtBot removed the request for review from willcl-ark June 14, 2023 14:50
Exceptions are not propagated in ctype callback functions used by bcc.
This means an AssertionError exception raised by check_equal() to signal
a failed assertion is not getting caught and properly logged. Instead,
the error is logged to stdout and execution of the handler stops.

The current workaround to check whether all check_equal() assertions in
a callback succeeded is to increment a success counter after the
assertions (which only gets incremented if none exception is raised and
stops execution). Then, outside the callback, the success counter can be
used to check whether a callback executed successfully.

One issue with the described workaround is that when an exception
occurs, there is no way of telling which of the check_equal() statements
caused the exception; moreover, there is no way of inspecting how the
pieces of data that got compared in check_equal() differed (often
a crucial clue when debugging what went wrong).

Two fixes to this problem come to mind. The first involves having the
callback function make event data accessible outside the callback and
inspecting the event using check_equal() outside the callback. This
solution still requires a counter in the callback to tell whether
a callback was actually executed or if instead the call to
perf_buffer_poll() timed out.

The second fix entails wrapping all relevant check_equal() statements
inside callback functions into try-catch blocks and manually logging
AssertionErrors. While not as elegant in terms of design, this approach
can be more pragmatic for more complex tests (e.g., ones involving
multiple events, events of different types, or the order of events).

The solution proposed here is to select the most pragmatic fix on
a case-by-case basis: Tests in interface_usdt_net.py,
interface_usdt_mempool.py and interface_usdt_validation.py have been
refactored to use the first approach, while the second approach was
chosen for interface_usdt_utxocache.py (partly to provide a reference
for the second approach, but mainly because the utxocache tests are the
most intricate tests, and refactoring them to use the first approach
would negatively impact their readability). Lastly,
interface_usdt_coinselection.py was kept unchanged because it does not
use check_equal() statements inside callback functions.
@virtu virtu force-pushed the handle_bcc_callback_exceptions branch from 9297303 to 61f4b9b Compare June 19, 2023 12:41
@0xB10C
Copy link
Contributor

0xB10C commented Jun 20, 2023

Reviewed the changes since my last review. ACK 61f4b9b. I've tested that the combined log contains both exceptions by modifying interface_usdt_utxocache.py.

Using Logger.exception() is better. It wasn't clear to me that the logging functions read sys.exc_info() and do log exceptions if there are any as described here.

CI failure is unrelated.

@DrahtBot DrahtBot requested a review from willcl-ark June 20, 2023 10:28
@willcl-ark
Copy link
Member

utACK 61f4b9b

Verified that git fetch; and git diff HEAD..FETCH_HEAD only modified the logging type from my previous ack.

This is the same change shown by 9297303d..61f4b9b7 compare (even though my local branch was at 4798c45 and not 9297303).

@DrahtBot DrahtBot removed the request for review from willcl-ark June 20, 2023 13:31
@fanquake fanquake requested a review from stickies-v June 21, 2023 16:43
Copy link
Contributor

@stickies-v stickies-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK 61f4b9b

Can't run this on macOS and am not familiar with BPF and bcc, but from a Python pov this looks like a reasonable change to me assuming the problem statement in OP.

Left a few suggested improvement nits but they can all be safely ignored/followed-up, especially since we already have so many ACKs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you don't really need the blocks_checked counter anymore now:

git diff
diff --git a/test/functional/interface_usdt_validation.py b/test/functional/interface_usdt_validation.py
index f9d9b525c..f32d83a50 100755
--- a/test/functional/interface_usdt_validation.py
+++ b/test/functional/interface_usdt_validation.py
@@ -86,7 +86,6 @@ class ValidationTracepointTest(BitcoinTestFramework):
                     self.duration)
 
         BLOCKS_EXPECTED = 2
-        blocks_checked = 0
         expected_blocks = dict()
         events = []
 
@@ -98,11 +97,10 @@ class ValidationTracepointTest(BitcoinTestFramework):
                   usdt_contexts=[ctx], debug=0)
 
         def handle_blockconnected(_, data, __):
-            nonlocal events, blocks_checked
+            nonlocal events
             event = ctypes.cast(data, ctypes.POINTER(Block)).contents
             self.log.info(f"handle_blockconnected(): {event}")
             events.append(event)
-            blocks_checked += 1
 
         bpf["block_connected"].open_perf_buffer(
             handle_blockconnected)
@@ -127,7 +125,7 @@ class ValidationTracepointTest(BitcoinTestFramework):
             # only plausibility checks
             assert event.duration > 0
             del expected_blocks[block_hash]
-        assert_equal(BLOCKS_EXPECTED, blocks_checked)
+        assert_equal(BLOCKS_EXPECTED, len(events))
         assert_equal(0, len(expected_blocks))
 
         bpf.cleanup()

Copy link
Contributor

@stickies-v stickies-v Jun 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all 4 methods added_test, removed_test, replaced_test and rejected_test I think it's a bit confusing that we're assigning all events to the same event object while simultaneously accounting for the possibility that there is more than one event. I think it would be more elegant to just store all events, and then be explicit about the expectation that we expect len(events)==1.

In addition to being more readable (imo), this also makes debugging a bit quicker if we suddenly do end up having more than 1 event unexpectedly?

edit: maybe more elegant removing EXPECTED_ADDED_EVENTS (etc) altogether here and just hardcoding it?

git diff
diff --git a/test/functional/interface_usdt_mempool.py b/test/functional/interface_usdt_mempool.py
index 62fdc2836..dc3520132 100755
--- a/test/functional/interface_usdt_mempool.py
+++ b/test/functional/interface_usdt_mempool.py
@@ -138,8 +138,7 @@ class MempoolTracepointTest(BitcoinTestFramework):
         the expected txid, vsize, and fee."""
 
         EXPECTED_ADDED_EVENTS = 1
-        handled_added_events = 0
-        event = None
+        events = []
 
         self.log.info("Hooking into mempool:added tracepoint...")
         node = self.nodes[0]
@@ -148,9 +147,7 @@ class MempoolTracepointTest(BitcoinTestFramework):
         bpf = BPF(text=MEMPOOL_TRACEPOINTS_PROGRAM, usdt_contexts=[ctx], debug=0)
 
         def handle_added_event(_, data, __):
-            nonlocal event, handled_added_events
-            event = bpf["added_events"].event(data)
-            handled_added_events += 1
+            events.append(bpf["added_events"].event(data))
 
         bpf["added_events"].open_perf_buffer(handle_added_event)
 
@@ -165,7 +162,8 @@ class MempoolTracepointTest(BitcoinTestFramework):
         self.generate(node, 1)
 
         self.log.info("Ensuring mempool:added event was handled successfully...")
-        assert_equal(EXPECTED_ADDED_EVENTS, handled_added_events)
+        assert_equal(EXPECTED_ADDED_EVENTS, len(events))
+        event = events[0]  # single item since EXPECTED_ADDED_EVENTS is 1
         assert_equal(bytes(event.hash)[::-1].hex(), tx["txid"])
         assert_equal(event.vsize, tx["tx"].get_vsize())
         assert_equal(event.fee, fee)

checked_outbound_version_msg += 1

def handle_inbound(_, data, __):
nonlocal events
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Since we're mutating events (and not reassigning), there's no need to declare nonlocal so I think this line can be safely removed. Or, if you prefer the explicitness of altering an object from a different scope, I'd at least stay consistent and do the same in handle_outbound?

Comment on lines +160 to +161
for event, inbound in events:
check_p2p_message(event, inbound)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: clearer naming to indicate it's a bool

Suggested change
for event, inbound in events:
check_p2p_message(event, inbound)
for event, is_inbound in events:
check_p2p_message(event, is_inbound)

Comment on lines +266 to +275
try:
assert_equal(add["txid"], bytes(event.txid[::-1]).hex())
assert_equal(add["index"], event.index)
assert_equal(add["height"], event.height)
assert_equal(add["value"], event.value)
assert_equal(add["is_coinbase"], event.is_coinbase)
except AssertionError:
self.log.exception("Assertion failed")
else:
handle_add_succeeds += 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this looks identical to the logic in handle_utxocache_spent so could wrap this in a compare_utxo_with_event function to deduplicate the code?

@fanquake fanquake merged commit a7261da into bitcoin:master Jun 22, 2023
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Jun 22, 2023
…in bcc callback functions

61f4b9b Manage exceptions in bcc callback functions (virtu)

Pull request description:

  Address bitcoin#27380 (and similar future issues) by handling failed `assert_equal()` assertions in bcc callback functions

  ### Problem

  Exceptions are not propagated in ctype callback functions used by bcc. This means an AssertionError exception raised by `assert_equal()` to signal a failed assertion is not getting caught and properly logged. Instead, the error is logged to stdout and execution of the callback stops.

  The current workaround to check whether all `assert_equal()` assertions in a callback succeeded is to increment a success counter after the assertions (which only gets incremented if none exception is raised and stops execution). Then, outside the callback, the success counter can be used to check whether a callback executed successfully.

  One issue with the described workaround is that when an exception occurs, there is no way of telling which of the `assert_equal()` statements caused the exception; moreover, there is no way of inspecting how the pieces of data that got compared in `assert_equal()` differed (often a crucial clue when debugging what went wrong).

  This problem is happening in bitcoin#27380: Sporadically, in the `mempool:rejected` test, execution does not reach the end of the callback function and the success counter is not incremented. Thus, the test fails when comparing the counter to its expected value of one. Without knowing which of the asserts failed any why it failed, this issue is hard to debug.

  ### Solution

  Two fixes come to mind. The first involves having the callback function make event data accessible outside the callback and inspecting the event using `assert_equal()` outside the callback. This solution still requires a counter in the callback in order  to tell whether a callback was actually executed or if instead the call to perf_buffer_poll() timed out.

  The second fix entails wrapping all relevant `assert_equal()` statements inside callback functions into try-catch blocks and manually logging AssertionErrors. While not as elegant in terms of design, this approach can be more pragmatic for more complex tests (e.g., ones involving multiple events, events of different types, or the order of events).

  The solution proposed here is to select the most pragmatic fix on a case-by-case basis: Tests in `interface_usdt_net.py`, `interface_usdt_mempool.py` and `interface_usdt_validation.py` have been refactored to use the first approach, while the second approach was chosen for `interface_usdt_utxocache.py` (partly to provide a reference for the second approach, but mainly because the utxocache tests are the most intricate tests, and refactoring them to use the first approach would negatively impact their readability). Lastly, `interface_usdt_coinselection.py` was kept unchanged because it does not use `assert_equal()` statements inside callback functions.

ACKs for top commit:
  0xB10C:
    Reviewed the changes since my last review. ACK 61f4b9b. I've tested that the combined log contains both exceptions by modifying `interface_usdt_utxocache.py`.
  willcl-ark:
    utACK 61f4b9b
  stickies-v:
    utACK 61f4b9b

Tree-SHA512: 85cdaabf370d4f09a9eab6af9ce7c796cd9d08cb91f38f021f71adda34c5f643331022dd09cadb95be2185dad6016c95cbb8942e41e4fbd566a49bf431c5141a
@fanquake
Copy link
Member

@stickies-v did you want to open a follow up?

@stickies-v
Copy link
Contributor

@stickies-v did you want to open a follow up?

Done: #27944

fanquake added a commit to bitcoin-core/gui that referenced this pull request Sep 10, 2023
…ups (27831 follow-ups)

9f55773 test: refactor: usdt_mempool: store all events (stickies-v)
bc43270 test: refactor: remove unnecessary nonlocal (stickies-v)
326db63 test: log sanity check assertion failures (stickies-v)
f5525ad test: store utxocache events (stickies-v)
f1b99ac test: refactor: deduplicate handle_utxocache_* logic (stickies-v)
ad90ba3 test: refactor:  rename inbound to is_inbound (stickies-v)
afc0224 test: refactor: remove unnecessary blocks_checked counter (stickies-v)

Pull request description:

  Various cleanups to the USDT functional tests, largely (but not exclusively) follow-ups to bitcoin/bitcoin#27831 (review). Except for slightly different logging behaviour in "test: store utxocache events" and "test: log sanity check assertion failures", this is a refactor PR, removing unnecessary code and (imo) making it more readable and maintainable.

  The rationale for each change is in the corresponding commit message.

  Note: except for "test: store utxocache events" (which relies on its parent, and I separated into two commits because we may want the parent but not the child), all commits are stand-alone and I'm okay with dropping one/multiple commits if they turn out to be controversial or undesired.

ACKs for top commit:
  0xB10C:
    ACK 9f55773. Reviewed the code and ran the USDT interface tests. I stepped through the commits and think all changes are reasonable.

Tree-SHA512: 6c37a0265b6c26d4f9552a056a690b8f86f7304bd33b4419febd8b17369cf6af799cb87c16df35d0c2a1b839ad31de24661d4384eafa88816c2051c522fd3bf5
Frank-GER pushed a commit to syscoin/syscoin that referenced this pull request Sep 19, 2023
…31 follow-ups)

9f55773 test: refactor: usdt_mempool: store all events (stickies-v)
bc43270 test: refactor: remove unnecessary nonlocal (stickies-v)
326db63 test: log sanity check assertion failures (stickies-v)
f5525ad test: store utxocache events (stickies-v)
f1b99ac test: refactor: deduplicate handle_utxocache_* logic (stickies-v)
ad90ba3 test: refactor:  rename inbound to is_inbound (stickies-v)
afc0224 test: refactor: remove unnecessary blocks_checked counter (stickies-v)

Pull request description:

  Various cleanups to the USDT functional tests, largely (but not exclusively) follow-ups to bitcoin#27831 (review). Except for slightly different logging behaviour in "test: store utxocache events" and "test: log sanity check assertion failures", this is a refactor PR, removing unnecessary code and (imo) making it more readable and maintainable.

  The rationale for each change is in the corresponding commit message.

  Note: except for "test: store utxocache events" (which relies on its parent, and I separated into two commits because we may want the parent but not the child), all commits are stand-alone and I'm okay with dropping one/multiple commits if they turn out to be controversial or undesired.

ACKs for top commit:
  0xB10C:
    ACK 9f55773. Reviewed the code and ran the USDT interface tests. I stepped through the commits and think all changes are reasonable.

Tree-SHA512: 6c37a0265b6c26d4f9552a056a690b8f86f7304bd33b4419febd8b17369cf6af799cb87c16df35d0c2a1b839ad31de24661d4384eafa88816c2051c522fd3bf5
@virtu virtu deleted the handle_bcc_callback_exceptions branch April 9, 2024 08:02
@bitcoin bitcoin locked and limited conversation to collaborators Apr 9, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants