fix: gracefully handle error conditions in PocketIC server #2556

mraszyk · 2024-11-11T17:39:38Z

This PR gracefully handles error conditions in the PocketIC server by returning OpOut::Error (resulting in an HTTP response with an error status code and error message in the response body) instead of panicking in the server (resulting in a confusing error message in the PocketIC library due to abruptly interrupted HTTP connection after the server panicked).

michael-weigelt · 2024-11-12T09:11:23Z

I agree that http responses from panics are bad. But I am not sure OpOut is the correct layer for errors. Aren't we wrapping the OpOut in some ApiResponse anymore? Which could have an Error variant, so that the http response can be nice nevertheless?

mraszyk · 2024-11-12T11:30:26Z

A computation returns OpOut so error conditions during a computation can only be reported via OpOut at the moment. The graph also stores OpOut. It is indeed the case that OpOut is converted to a status code and ApiResponse and then also back, but I see the separation as follows:

OpOut contains the result of the computation independently of its HTTP representation;
the conversion to ApiResponse deals with JSON encoding and enables returning an OpOut::Error as an OK response (e.g., if we decide that getting stable memory of a non-existing canister should not panic in the library, then we could say that the corresponding OpOut::Error maps to an OK response of a new Raw type that we introduce).

Does that make sense or should I elaborate more on some parts?

michael-weigelt · 2024-11-12T13:15:18Z

error conditions during a computation

I think initially we aimed at making Ops infallible (and reject bad requests at parse time), but I am actually not sure that was ever realistic. But if possible, I would prefer that. Isn't it the case that all ingress messages will result in either WasmResult::Result or ::Reject, and we can always turn that into an OpOut? And all other Ops are System/Subnet-level operations, for which we could perhaps guarantee infallability upfront?

I think your new proposed strategy is viable too, but why switch the strategy now?

enables returning an OpOut::Error as an OK response

IIUC, this means we'd just use HTTP as a message protocol, and therefore don't provide a REST-API (where the state of the application/request is reflected in the http status code etc), which was the aim initially.

I am not sure how valuable adherence to REST is... but I could imagine that other implementers of libraries value this more than us internally.

Again what I question is not necessarily the proposition in the PR, but rather I want to point out that we used to have a strategy here and your proposal is a change in strategy.

I don't quite understand the example in parentheses. What does a library panic have to do with the server's return values?

mraszyk · 2024-11-12T13:27:43Z

I think initially we aimed at making Ops infallible (and reject bad requests at parse time), but I am actually not sure that was ever realistic.

This is not realistic because the PocketIC router cannot know if a canister exists (for instance).

Isn't it the case that all ingress messages will result in either WasmResult::Result or ::Reject, and we can always turn that into an OpOut?

No, they can also fail because the canister ID cannot be routed (the canister ID is valid, but there's no subnet to which the request could be routed).

And all other Ops are System/Subnet-level operations, for which we could perhaps guarantee infallability upfront?

This is not the case, e.g., setting time into the past is an error that you can only detect once you have access to the PocketIc instance (run a computation).

why switch the strategy now?

to me it doesn't feel like switching any strategy

where the state of the application/request is reflected in the http status code etc

it depends on the semantics of an operation, if you define that "get_stable_memory" must be called for a non-empty canister, then failing to do so should result in 4xx status code; if you define that "get_stable_memory" can be called for any canister and the structured return value (e.g., of some enumeration type) characterizes if the canister is empty, then we should get 200 even if the computation failed because the canister does not exist. Now you can encode this in OpOut (just like we do with OpOut::MaybeSubnetId), but it seems cleaner to return OpOut::Error and convert it into status code later (instead of having OpOut::MaybeA, OpOut::MaybeB, OpOut::MaybeC etc.)

What does a library panic have to do with the server's return values?

The library interprets 4xx as panic.

michael-weigelt · 2024-11-12T13:41:58Z

to me it doesn't feel like switching any strategy

Ok, you have sufficiently shown that the original aim is not easily achievable, so it doesn't matter much if we don't get any closer.

Now you can encode this in OpOut (just like we do with OpOut::MaybeSubnetId), but it seems cleaner to return OpOut::Error and convert it into status code later (instead of having OpOut::MaybeA, OpOut::MaybeB, OpOut::MaybeC etc.)

Agreed.

The library interprets 4xx as panic.

Seems a bit strange, because from RESTful perspective I'd exactly expect a 404 if I try to interact with a canister that does not exist. To interpret every 4xx as a "axum panicked" is not right. But this is a bit off-topic.

fix: gracefully handle error conditions in PocketIC server

667c5e8

mraszyk requested review from a team as code owners November 11, 2024 17:39

github-actions bot added the fix label Nov 11, 2024

Merge branch 'master' into mraszyk/pic-unwraps

9a41f93

github-actions bot added @ic-message-routing-owners @pocket-ic labels Nov 11, 2024

Merge branch 'master' into mraszyk/pic-unwraps

beaa6ad

michael-weigelt approved these changes Nov 12, 2024

View reviewed changes

mraszyk added this pull request to the merge queue Nov 12, 2024

Merged via the queue into master with commit bc2c84c Nov 12, 2024
24 checks passed

mraszyk deleted the mraszyk/pic-unwraps branch November 12, 2024 15:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: gracefully handle error conditions in PocketIC server #2556

fix: gracefully handle error conditions in PocketIC server #2556

mraszyk commented Nov 11, 2024

michael-weigelt commented Nov 12, 2024

mraszyk commented Nov 12, 2024

michael-weigelt commented Nov 12, 2024

mraszyk commented Nov 12, 2024

michael-weigelt commented Nov 12, 2024

fix: gracefully handle error conditions in PocketIC server #2556

fix: gracefully handle error conditions in PocketIC server #2556

Conversation

mraszyk commented Nov 11, 2024

michael-weigelt commented Nov 12, 2024

mraszyk commented Nov 12, 2024

michael-weigelt commented Nov 12, 2024

mraszyk commented Nov 12, 2024

michael-weigelt commented Nov 12, 2024