-
Notifications
You must be signed in to change notification settings - Fork 38.2k
test: handle failed assert_equal() assertions in bcc callback functions
#27831
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. |
d10f45b to
4798c45
Compare
|
Concept ACK. Thanks for picking this up! Seeing the asserts fail in the CI logs is needed and would have helped figuring out the reason for #27380. (I think you meant to write |
You're right. I don't think I can change it in the PR title, but I at least fixed it in the description. |
check_equal() assertions in bcc callback functionsassert_equal() assertions in bcc callback functions
|
ACK 4798c45. I code-reviewed the changes and they look good to me. The approach taken (applying the two solutions outlined in the OP on a case-by-case basis) seems OK to me. With #27380 in mind, I've checked that this commit 0xB10C@4e00fbb on top of this PR fails with |
|
ACK 4798c45 I was a bit unfamilar with this python library so reviewed the bcc docs and reference guide before getting started... Agree that these changes allow us to see errors propagated out into the test framework logs. I broke a few of the tests to ensure that this was the case. nit: actually we currently dump errors to stderr, not stdout, but the end result is roughly the same :) Before vs After failure outputbefore: After: |
|
@willcl-ark If you want to "ack" a pull, and want the merge-script and bot to recognize it, you'll have to do it in all-upper-case |
|
I've tested this PR on the old broken variant of the #23962 using
Note the absence of the "Exception ignored on calling ctypes callback function..." message in the |
Exceptions are not propagated in ctype callback functions used by bcc. This means an AssertionError exception raised by check_equal() to signal a failed assertion is not getting caught and properly logged. Instead, the error is logged to stdout and execution of the handler stops. The current workaround to check whether all check_equal() assertions in a callback succeeded is to increment a success counter after the assertions (which only gets incremented if none exception is raised and stops execution). Then, outside the callback, the success counter can be used to check whether a callback executed successfully. One issue with the described workaround is that when an exception occurs, there is no way of telling which of the check_equal() statements caused the exception; moreover, there is no way of inspecting how the pieces of data that got compared in check_equal() differed (often a crucial clue when debugging what went wrong). Two fixes to this problem come to mind. The first involves having the callback function make event data accessible outside the callback and inspecting the event using check_equal() outside the callback. This solution still requires a counter in the callback to tell whether a callback was actually executed or if instead the call to perf_buffer_poll() timed out. The second fix entails wrapping all relevant check_equal() statements inside callback functions into try-catch blocks and manually logging AssertionErrors. While not as elegant in terms of design, this approach can be more pragmatic for more complex tests (e.g., ones involving multiple events, events of different types, or the order of events). The solution proposed here is to select the most pragmatic fix on a case-by-case basis: Tests in interface_usdt_net.py, interface_usdt_mempool.py and interface_usdt_validation.py have been refactored to use the first approach, while the second approach was chosen for interface_usdt_utxocache.py (partly to provide a reference for the second approach, but mainly because the utxocache tests are the most intricate tests, and refactoring them to use the first approach would negatively impact their readability). Lastly, interface_usdt_coinselection.py was kept unchanged because it does not use check_equal() statements inside callback functions.
9297303 to
61f4b9b
Compare
|
Reviewed the changes since my last review. ACK 61f4b9b. I've tested that the combined log contains both exceptions by modifying Using CI failure is unrelated. |
|
utACK 61f4b9b Verified that This is the same change shown by 9297303d..61f4b9b7 compare (even though my local branch was at 4798c45 and not 9297303). |
stickies-v
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK 61f4b9b
Can't run this on macOS and am not familiar with BPF and bcc, but from a Python pov this looks like a reasonable change to me assuming the problem statement in OP.
Left a few suggested improvement nits but they can all be safely ignored/followed-up, especially since we already have so many ACKs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you don't really need the blocks_checked counter anymore now:
git diff
diff --git a/test/functional/interface_usdt_validation.py b/test/functional/interface_usdt_validation.py
index f9d9b525c..f32d83a50 100755
--- a/test/functional/interface_usdt_validation.py
+++ b/test/functional/interface_usdt_validation.py
@@ -86,7 +86,6 @@ class ValidationTracepointTest(BitcoinTestFramework):
self.duration)
BLOCKS_EXPECTED = 2
- blocks_checked = 0
expected_blocks = dict()
events = []
@@ -98,11 +97,10 @@ class ValidationTracepointTest(BitcoinTestFramework):
usdt_contexts=[ctx], debug=0)
def handle_blockconnected(_, data, __):
- nonlocal events, blocks_checked
+ nonlocal events
event = ctypes.cast(data, ctypes.POINTER(Block)).contents
self.log.info(f"handle_blockconnected(): {event}")
events.append(event)
- blocks_checked += 1
bpf["block_connected"].open_perf_buffer(
handle_blockconnected)
@@ -127,7 +125,7 @@ class ValidationTracepointTest(BitcoinTestFramework):
# only plausibility checks
assert event.duration > 0
del expected_blocks[block_hash]
- assert_equal(BLOCKS_EXPECTED, blocks_checked)
+ assert_equal(BLOCKS_EXPECTED, len(events))
assert_equal(0, len(expected_blocks))
bpf.cleanup()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For all 4 methods added_test, removed_test, replaced_test and rejected_test I think it's a bit confusing that we're assigning all events to the same event object while simultaneously accounting for the possibility that there is more than one event. I think it would be more elegant to just store all events, and then be explicit about the expectation that we expect len(events)==1.
In addition to being more readable (imo), this also makes debugging a bit quicker if we suddenly do end up having more than 1 event unexpectedly?
edit: maybe more elegant removing EXPECTED_ADDED_EVENTS (etc) altogether here and just hardcoding it?
git diff
diff --git a/test/functional/interface_usdt_mempool.py b/test/functional/interface_usdt_mempool.py
index 62fdc2836..dc3520132 100755
--- a/test/functional/interface_usdt_mempool.py
+++ b/test/functional/interface_usdt_mempool.py
@@ -138,8 +138,7 @@ class MempoolTracepointTest(BitcoinTestFramework):
the expected txid, vsize, and fee."""
EXPECTED_ADDED_EVENTS = 1
- handled_added_events = 0
- event = None
+ events = []
self.log.info("Hooking into mempool:added tracepoint...")
node = self.nodes[0]
@@ -148,9 +147,7 @@ class MempoolTracepointTest(BitcoinTestFramework):
bpf = BPF(text=MEMPOOL_TRACEPOINTS_PROGRAM, usdt_contexts=[ctx], debug=0)
def handle_added_event(_, data, __):
- nonlocal event, handled_added_events
- event = bpf["added_events"].event(data)
- handled_added_events += 1
+ events.append(bpf["added_events"].event(data))
bpf["added_events"].open_perf_buffer(handle_added_event)
@@ -165,7 +162,8 @@ class MempoolTracepointTest(BitcoinTestFramework):
self.generate(node, 1)
self.log.info("Ensuring mempool:added event was handled successfully...")
- assert_equal(EXPECTED_ADDED_EVENTS, handled_added_events)
+ assert_equal(EXPECTED_ADDED_EVENTS, len(events))
+ event = events[0] # single item since EXPECTED_ADDED_EVENTS is 1
assert_equal(bytes(event.hash)[::-1].hex(), tx["txid"])
assert_equal(event.vsize, tx["tx"].get_vsize())
assert_equal(event.fee, fee)
| checked_outbound_version_msg += 1 | ||
|
|
||
| def handle_inbound(_, data, __): | ||
| nonlocal events |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Since we're mutating events (and not reassigning), there's no need to declare nonlocal so I think this line can be safely removed. Or, if you prefer the explicitness of altering an object from a different scope, I'd at least stay consistent and do the same in handle_outbound?
| for event, inbound in events: | ||
| check_p2p_message(event, inbound) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: clearer naming to indicate it's a bool
| for event, inbound in events: | |
| check_p2p_message(event, inbound) | |
| for event, is_inbound in events: | |
| check_p2p_message(event, is_inbound) |
| try: | ||
| assert_equal(add["txid"], bytes(event.txid[::-1]).hex()) | ||
| assert_equal(add["index"], event.index) | ||
| assert_equal(add["height"], event.height) | ||
| assert_equal(add["value"], event.value) | ||
| assert_equal(add["is_coinbase"], event.is_coinbase) | ||
| except AssertionError: | ||
| self.log.exception("Assertion failed") | ||
| else: | ||
| handle_add_succeeds += 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this looks identical to the logic in handle_utxocache_spent so could wrap this in a compare_utxo_with_event function to deduplicate the code?
…in bcc callback functions 61f4b9b Manage exceptions in bcc callback functions (virtu) Pull request description: Address bitcoin#27380 (and similar future issues) by handling failed `assert_equal()` assertions in bcc callback functions ### Problem Exceptions are not propagated in ctype callback functions used by bcc. This means an AssertionError exception raised by `assert_equal()` to signal a failed assertion is not getting caught and properly logged. Instead, the error is logged to stdout and execution of the callback stops. The current workaround to check whether all `assert_equal()` assertions in a callback succeeded is to increment a success counter after the assertions (which only gets incremented if none exception is raised and stops execution). Then, outside the callback, the success counter can be used to check whether a callback executed successfully. One issue with the described workaround is that when an exception occurs, there is no way of telling which of the `assert_equal()` statements caused the exception; moreover, there is no way of inspecting how the pieces of data that got compared in `assert_equal()` differed (often a crucial clue when debugging what went wrong). This problem is happening in bitcoin#27380: Sporadically, in the `mempool:rejected` test, execution does not reach the end of the callback function and the success counter is not incremented. Thus, the test fails when comparing the counter to its expected value of one. Without knowing which of the asserts failed any why it failed, this issue is hard to debug. ### Solution Two fixes come to mind. The first involves having the callback function make event data accessible outside the callback and inspecting the event using `assert_equal()` outside the callback. This solution still requires a counter in the callback in order to tell whether a callback was actually executed or if instead the call to perf_buffer_poll() timed out. The second fix entails wrapping all relevant `assert_equal()` statements inside callback functions into try-catch blocks and manually logging AssertionErrors. While not as elegant in terms of design, this approach can be more pragmatic for more complex tests (e.g., ones involving multiple events, events of different types, or the order of events). The solution proposed here is to select the most pragmatic fix on a case-by-case basis: Tests in `interface_usdt_net.py`, `interface_usdt_mempool.py` and `interface_usdt_validation.py` have been refactored to use the first approach, while the second approach was chosen for `interface_usdt_utxocache.py` (partly to provide a reference for the second approach, but mainly because the utxocache tests are the most intricate tests, and refactoring them to use the first approach would negatively impact their readability). Lastly, `interface_usdt_coinselection.py` was kept unchanged because it does not use `assert_equal()` statements inside callback functions. ACKs for top commit: 0xB10C: Reviewed the changes since my last review. ACK 61f4b9b. I've tested that the combined log contains both exceptions by modifying `interface_usdt_utxocache.py`. willcl-ark: utACK 61f4b9b stickies-v: utACK 61f4b9b Tree-SHA512: 85cdaabf370d4f09a9eab6af9ce7c796cd9d08cb91f38f021f71adda34c5f643331022dd09cadb95be2185dad6016c95cbb8942e41e4fbd566a49bf431c5141a
|
@stickies-v did you want to open a follow up? |
Done: #27944 |
…ups (27831 follow-ups) 9f55773 test: refactor: usdt_mempool: store all events (stickies-v) bc43270 test: refactor: remove unnecessary nonlocal (stickies-v) 326db63 test: log sanity check assertion failures (stickies-v) f5525ad test: store utxocache events (stickies-v) f1b99ac test: refactor: deduplicate handle_utxocache_* logic (stickies-v) ad90ba3 test: refactor: rename inbound to is_inbound (stickies-v) afc0224 test: refactor: remove unnecessary blocks_checked counter (stickies-v) Pull request description: Various cleanups to the USDT functional tests, largely (but not exclusively) follow-ups to bitcoin/bitcoin#27831 (review). Except for slightly different logging behaviour in "test: store utxocache events" and "test: log sanity check assertion failures", this is a refactor PR, removing unnecessary code and (imo) making it more readable and maintainable. The rationale for each change is in the corresponding commit message. Note: except for "test: store utxocache events" (which relies on its parent, and I separated into two commits because we may want the parent but not the child), all commits are stand-alone and I'm okay with dropping one/multiple commits if they turn out to be controversial or undesired. ACKs for top commit: 0xB10C: ACK 9f55773. Reviewed the code and ran the USDT interface tests. I stepped through the commits and think all changes are reasonable. Tree-SHA512: 6c37a0265b6c26d4f9552a056a690b8f86f7304bd33b4419febd8b17369cf6af799cb87c16df35d0c2a1b839ad31de24661d4384eafa88816c2051c522fd3bf5
…31 follow-ups) 9f55773 test: refactor: usdt_mempool: store all events (stickies-v) bc43270 test: refactor: remove unnecessary nonlocal (stickies-v) 326db63 test: log sanity check assertion failures (stickies-v) f5525ad test: store utxocache events (stickies-v) f1b99ac test: refactor: deduplicate handle_utxocache_* logic (stickies-v) ad90ba3 test: refactor: rename inbound to is_inbound (stickies-v) afc0224 test: refactor: remove unnecessary blocks_checked counter (stickies-v) Pull request description: Various cleanups to the USDT functional tests, largely (but not exclusively) follow-ups to bitcoin#27831 (review). Except for slightly different logging behaviour in "test: store utxocache events" and "test: log sanity check assertion failures", this is a refactor PR, removing unnecessary code and (imo) making it more readable and maintainable. The rationale for each change is in the corresponding commit message. Note: except for "test: store utxocache events" (which relies on its parent, and I separated into two commits because we may want the parent but not the child), all commits are stand-alone and I'm okay with dropping one/multiple commits if they turn out to be controversial or undesired. ACKs for top commit: 0xB10C: ACK 9f55773. Reviewed the code and ran the USDT interface tests. I stepped through the commits and think all changes are reasonable. Tree-SHA512: 6c37a0265b6c26d4f9552a056a690b8f86f7304bd33b4419febd8b17369cf6af799cb87c16df35d0c2a1b839ad31de24661d4384eafa88816c2051c522fd3bf5
Address #27380 (and similar future issues) by handling failed
assert_equal()assertions in bcc callback functionsProblem
Exceptions are not propagated in ctype callback functions used by bcc. This means an AssertionError exception raised by
assert_equal()to signal a failed assertion is not getting caught and properly logged. Instead, the error is logged to stdout and execution of the callback stops.The current workaround to check whether all
assert_equal()assertions in a callback succeeded is to increment a success counter after the assertions (which only gets incremented if none exception is raised and stops execution). Then, outside the callback, the success counter can be used to check whether a callback executed successfully.One issue with the described workaround is that when an exception occurs, there is no way of telling which of the
assert_equal()statements caused the exception; moreover, there is no way of inspecting how the pieces of data that got compared inassert_equal()differed (often a crucial clue when debugging what went wrong).This problem is happening in #27380: Sporadically, in the
mempool:rejectedtest, execution does not reach the end of the callback function and the success counter is not incremented. Thus, the test fails when comparing the counter to its expected value of one. Without knowing which of the asserts failed any why it failed, this issue is hard to debug.Solution
Two fixes come to mind. The first involves having the callback function make event data accessible outside the callback and inspecting the event using
assert_equal()outside the callback. This solution still requires a counter in the callback in order to tell whether a callback was actually executed or if instead the call to perf_buffer_poll() timed out.The second fix entails wrapping all relevant
assert_equal()statements inside callback functions into try-catch blocks and manually logging AssertionErrors. While not as elegant in terms of design, this approach can be more pragmatic for more complex tests (e.g., ones involving multiple events, events of different types, or the order of events).The solution proposed here is to select the most pragmatic fix on a case-by-case basis: Tests in
interface_usdt_net.py,interface_usdt_mempool.pyandinterface_usdt_validation.pyhave been refactored to use the first approach, while the second approach was chosen forinterface_usdt_utxocache.py(partly to provide a reference for the second approach, but mainly because the utxocache tests are the most intricate tests, and refactoring them to use the first approach would negatively impact their readability). Lastly,interface_usdt_coinselection.pywas kept unchanged because it does not useassert_equal()statements inside callback functions.