Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix double panic lockup in clients panic handler #1433

Merged
merged 5 commits into from
May 24, 2022
Merged

Conversation

raphCode
Copy link
Contributor

When the pty the client was running in disappears, reading from stdin
causes a panic, which triggers the custom panic handler. This handler
attempts to print a backtrace to the terminal and tries to unset the raw
mode for that. Since the pty has already disappeared, the tcsetattr call
fails and causes a second panic, which locks everything up.

This commit fixes this by returning an Result from the unset_raw_mode
function, allowing the calling panic handler to handle any error
gracefully.
Since we are now aware of the fact that panics may happen / are handled
after the pty has disappeared, logging them to file seems useful: there
is no other other place to show them to the user.
@raphCode raphCode temporarily deployed to cachix May 23, 2022 21:46 Inactive
@raphCode raphCode temporarily deployed to cachix May 23, 2022 21:46 Inactive
@raphCode raphCode temporarily deployed to cachix May 23, 2022 21:47 Inactive
@raphCode raphCode temporarily deployed to cachix May 23, 2022 21:47 Inactive
@raphCode
Copy link
Contributor Author

@jaeheonji in case you want to have a look, I solved the problem by returning a Result from unset_raw_mode(), which must be handled at callsites. I unwrapped it everywhere except in the panic handler.
I also log every panic to the logfile now, since we now know we can't rely on printing to the terminal.

@raphCode raphCode temporarily deployed to cachix May 23, 2022 22:08 Inactive
@raphCode raphCode temporarily deployed to cachix May 23, 2022 22:08 Inactive
Copy link
Member

@jaeheonji jaeheonji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGMT. This can prevent panic from falling into a loop.
Of course, perhaps, the output for errors in unset_raw_mode can be unkind, but this would be a very rare case. And I don't think it's a big deal.

@raphCode raphCode temporarily deployed to cachix May 24, 2022 07:11 Inactive
@raphCode raphCode temporarily deployed to cachix May 24, 2022 07:11 Inactive
@raphCode
Copy link
Contributor Author

perhaps, the output for errors in unset_raw_mode can be unkind

What do you mean? Ugly to handle for the programmer?

@jaeheonji
Copy link
Member

jaeheonji commented May 24, 2022

perhaps, the output for errors in unset_raw_mode can be unkind

What do you mean? Ugly to handle for the programmer?

Almost the same, maybe compared to the handle_panic, I thought there would be some lack of information, and no pretty output (If unset_raw_mode returns an error.). However, as I have already mentioned, I do not think it is such a big problem. What do you think?

@raphCode
Copy link
Contributor Author

Ah, you mean the logging of errors in the logfile?
I think it is okay, it looks like this:

ERROR  |zellij_client            | 2022-05-23 23:37:53.927 [stdin_handler] [zellij-client/src/lib.rs:205]: Panic occured in client:
PanicInfo { payload: Any { .. }, message: Some(called `Result::unwrap()` on an `Err` value: Os { code: 5, kind: Uncategorized, message: "Input/output error" }), location: Location { file: "zellij-client/src/os_input_output.rs", line: 122, col: 39 } }

It may not be pretty, but I think it includes all necessary information to analyze the issue.

@jaeheonji
Copy link
Member

Cool! I agree :)

@raphCode raphCode merged commit eab464b into main May 24, 2022
@har7an har7an deleted the fix_double_panic branch October 23, 2022 15:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Zellij does not exit on reboot/shutdown Client process lingers around after ssh disconnect
2 participants