System Health Monitoring Daemon #2134

Open
rootkovska opened this issue Jun 30, 2016 · 5 comments
Labels
C: core, C: mgmt, C: Xen, P: major (between "default" and "critical" in severity), T: enhancement (a new feature that does not yet exist or improvement of existing functionality)

Comments

@rootkovska
Member

As part of the effort to "hide as much Qubes infrastructure from the user as possible" that we would like to embrace for the upcoming Qubes 4.x, we will need a global system health monitoring daemon. This is necessary because system VMs, such as those holding network or USB devices, do crash from time to time. If we don't want the user to be concerned with such system VMs, we need to detect their crashes automatically (easy via a qrexec service from Dom0) and restart them automatically (currently not so easy, due to difficulties with reconnecting the Xen net frontend/backend).
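
A minimal sketch of the detect-and-restart loop described above, assuming the liveness probe and the restart action are supplied from outside. The names `Watchdog`, `is_alive`, `restart`, and `max_restarts` are hypothetical and not part of any Qubes API; in practice the probe might be a qrexec call from Dom0, and the restart hook would have to deal with the Xen net frontend/backend reconnection problem, which this sketch does not attempt.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Watchdog:
    """Polls a fixed list of system VMs and restarts those that fail a probe."""
    vms: List[str]
    is_alive: Callable[[str], bool]   # hypothetical probe, e.g. a qrexec ping
    restart: Callable[[str], None]    # hypothetical restart hook
    max_restarts: int = 3             # consecutive attempts before giving up
    restarts: Dict[str, int] = field(default_factory=dict)

    def check_once(self) -> List[str]:
        """Run one pass and return the names of the VMs that were restarted."""
        restarted = []
        for vm in self.vms:
            if self.is_alive(vm):
                # A healthy VM resets its consecutive-restart counter.
                self.restarts[vm] = 0
                continue
            count = self.restarts.get(vm, 0)
            if count >= self.max_restarts:
                # Stop retrying; leave the decision to a higher-level policy.
                continue
            self.restart(vm)
            self.restarts[vm] = count + 1
            restarted.append(vm)
        return restarted
```

Capping consecutive restarts keeps a VM that crashes on every boot from being restarted forever; a real daemon would presumably also sleep between passes and report VMs it has given up on.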

@rootkovska added the C: core, C: Xen, P: major, and C: mgmt labels Jun 30, 2016
@rootkovska added this to the Release 4.0 milestone Jun 30, 2016
@rootkovska
Member Author

This should include the (backend) functionality of what is described in #6.

@ideologysec

Are there any plans to simplify the existing sys-net/firewall/etc. domains with unikernels?

I know it was discussed briefly a few months back and might be a bit off-topic, but unikernels boot in far less time than a whole Linux system, and might mesh with a System Health Monitor (what Minix refers to as a "resurrection server") rather well.

@ideologysec

ideologysec commented Apr 1, 2017

Looks like the Mirage net/fw-vm was adopted for the GSoC, which is awesome.

Would something like Monit be adaptable or work as a System Monitoring Daemon?

@rootkovska
Member Author

Most likely not. We don't want to introduce a centralized daemon that exposes a large attack surface, which any VM could attack in a dozen ways and then use to attack other VMs.

@Rudd-O

Rudd-O commented Mar 28, 2018

I'm imagining something like a Prometheus exporter running on the VM, listening on a local UNIX socket, and made accessible from the Manager VM via a qrexec service. A Prometheus master can then collect this info (with minimal retention), and queries against it can be used to trigger actions. This would also mesh quite well with, say, running the node exporter on the VMs.

Exporters can be written that use minimal amounts of RAM, so they would be virtually cost-free.
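
To make the idea concrete, here is a rough sketch of such an exporter using only the Python standard library. It is a simplification under stated assumptions: it writes one snapshot in the Prometheus text exposition format to each client as a raw stream, whereas a real Prometheus scrape goes over HTTP. The function names, metric names, and socket path are all hypothetical, and the qrexec service that would relay the socket to the Manager VM is not shown.

```python
import socket
from typing import Dict


def render_metrics(metrics: Dict[str, float]) -> str:
    """Render gauges in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(metrics.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def handle_connection(conn: socket.socket, metrics: Dict[str, float]) -> None:
    """Write one metrics snapshot to a connected stream socket, then close it."""
    with conn:
        conn.sendall(render_metrics(metrics).encode("utf-8"))


def serve_once(path: str, metrics: Dict[str, float]) -> None:
    """Accept a single client on a UNIX socket at `path` (hypothetical path).

    The caller must make sure no stale socket file exists at `path`.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as srv:
        srv.bind(path)
        srv.listen(1)
        conn, _ = srv.accept()
        handle_connection(conn, metrics)
```

Because the exporter only ever writes to the socket and never parses input from the peer, the code exposed to the collecting side stays small.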

@andrewdavidwong added the T: enhancement label Mar 31, 2018
@andrewdavidwong removed this from the Release 4.2 milestone Aug 13, 2023