-
Notifications
You must be signed in to change notification settings - Fork 101
Add support for userspace IOAPIC #304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
9eb8f7e
to
2594125
Compare
From https://www.github.com/cloud-hypervisor/cloud-hypervisor: The kvm API has many structs that resemble the following `Foo` structure: ``` struct Foo { some_data: u32 entries: __IncompleteArrayField<__u32>, } ``` In order to allocate such a structure, `size_of::<Foo>()` would be too small because it would not include any space for `entries`. To make the allocation large enough while still being aligned for `Foo`, a `Vec<Foo>` is created. Only the first element of `Vec<Foo>` would actually be used as a `Foo`. The remaining memory in the `Vec<Foo>` is for `entries`, which must be contiguous with `Foo`. Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
When enabling `KVM_CAP_SPLIT_IRQCHIP`, the LAPIC will be created by the guest kernel. However, the VMM needs to create an IOAPIC in userspace to redirect interrupts from the system bus to the respective LAPIC. Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
When an interrupt is edge triggered, the Remote IRR bit needs to be set to 0. `fix_edge_remote_irr` provides an easy way to set that bit. Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
When triggering an interrupt in the kernel or commiting the IRQ routes, the `IoApic` struct needs to access the VM's FD. Provide a function to send a message to the IRQ worker thread with the specified message. Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Each entry in the ioredtbl is a 64-bit bitfield. Make it easier to access the information in the entry by parsing it into a struct and determine the MSI information that can be gathered from the entry. Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Add a function to update the MSI data for each IRQ route if it's not masked. Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Add function to prepare the entry to be serviced and send message to the IRQ worker thread. Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Implement the `BusDevice` trait for `IoApic` to handle reads and writes from the system bus. When reading the `IO_WIN` register, it's important to verify the data is only 32-bits in size, as the register is 64-bits, but needs to do two consecutive reads. Like `read`, `write` needs to verify the data is 32-bits so the register can be read twice and get the full 64-bits of data. In `write`, the IO APIC can't write to the Version or Arbitration regiseters, because those are Read-Only registers on the device. Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Provide user API, `krun_split_irqhip`, to specify the VM should have `KVM_CAP_SPLIT_IRQCHIP` enabled. Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
If the user called `krun_split_irqchip` with `enabled` set to `true`, start an IRQ worker thread to aid in servicing interrupts or commiting IRQ routes. Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
If the user calls `krun_split_irqchip` with `enabled` set to `true`, then create an `IoApic` device in userspace rather than creating an IOAPIC in the guest with KVM. Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
Are there any Copyright requirements if I used the QEMU C implementation for |
I don't believe so. You could make a note such as "inspired by QEMU's bus device implementation" or something of the sort, if you'd like. |
@@ -1521,6 +1529,54 @@ pub extern "C" fn krun_start_enter(ctx_id: u32) -> i32 { | |||
.unwrap(); | |||
} | |||
|
|||
#[cfg(target_arch = "x86_64")] | |||
if ctx_cfg.vmr.split_irqchip { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should move this and the GPU thread to some helper method in Vmm
, but let's do that in a different PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I was planning on doing it once this got merged, like you suggested
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
guest_memfd mappings should be moved to this thread as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, nice work here. +1
Allow the user to specify if they want to split the IRQCHIP such that the LAPIC is still inside the guest, but the IOAPIC is in userspace and managed by the VMM.
Intel Trust Domain Extensions, for example, require this functionality to work.
The Intel specification for the device can be found here: https://pdos.csail.mit.edu/6.828/2016/readings/ia32/ioapic.pdf