Skip to content

Clarify uses of kernel_start parameter #38

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 24, 2020

Conversation

aljimenezb
Copy link
Contributor

As part of the discussion in #12, we have noticed that the kernel_start parameter has different usages in the implementations for PE, Elf, and BzImage. The Elf implementation is specially confusing, since kernel_start is used to specify an offset to add to the default load address of the kernel. The discussion is ongoing to determine whether this has any real use cases or provides any security benefits.

In the meantime, I propose to rename the parameter to kernel_offset, which might be more appropriate to describe its usage by the implementers of KernelLoader.

Also, since PVH code expects the kernel to be loaded at its default load address, it cannot be used to boot guests if a non-default load address is requested by using the kernel_offset argument. If kernel_offset is requested, we simply avoid looking for a PVH entry point.

Resolves #12

@aljimenezb aljimenezb force-pushed the kernel-start-fixes branch from 82fcaeb to c814ca0 Compare June 10, 2020 17:14
alxiord
alxiord previously approved these changes Jun 10, 2020
// build by the value of CONFIG_PHYSICAL_START). Therefore, only attempt
// to use PVH if an offset from the default load address has not been
// requested using the kernel_offset parameter.
if phdr.p_type == elf::PT_NOTE && kernel_offset.is_none() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handling this parameter like so makes for implicit decisions. I find it better in code to let the user have explicit decisions.
With this approach it is easy to mis-use the function and not understand what you did wrong when it happens.

The implicit decision here is:

  • if you pass kernel_offset than this must also mean that you don't want PVH even though p_type is PVH

An explicit decision is:

  • p_type is PVH => you want to parse PVH

To keep the decision explicit what we can do instead is to ignore kernel_offset for PVH. I looks a bit like this is what was happening before. We can be explicit in the documentation as well and say that for PVH kernel_offset is ignored.

What do you say?

Copy link
Contributor Author

@aljimenezb aljimenezb Jun 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with your point of letting the user make the decisions, but I am not really sure what is the best way to do it in this case. I'm on record saying that kernel_offset does not have any real use cases for the Elf implementation and it should be an unused parameter, but if we still want to keep it around then it means that PVH can not be returned as an option. The VMM "knows" it is not using PVH, since the return type from the load() method has a pvh_entry_addr field that is None, but it does not know why it can't.

We can't exactly implement what you suggest above, so let me try to clarify how the PVH boot selection works, and we might be able to come up with a better idea.

The short explanation is that the VMM can not request when it calls loader::elf::Elf::load() that PVH is used; it does not know yet if it is available. The availability of PVH is an extra choice, a "bonus feature", that is provided alongside the default boot protocol. But the VMM only knows if it is available after load() returns. I didn't provide a way for VMMs to request exclusively that PVH is available or otherwise fail, since that decision can be made at the VMM level.

We cannot ignore the kernel_offset request from the VMM and still provide the two options for the VMM to decide, since kernel_offset affects how the kernel is loaded in memory. So I opted for honoring the request for kernel_offset which means using Linux boot protocol. I will add a comment on the parameter description that mentions the effect it has on PVH (it disables it). I didn't want to declare the kernel_offset/kernel_start parameter unused in the Elf case, since I can't prove 100% that is not needed.

Hopefully that made sense. Sorry for the long wall of text :).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more details on how the code works, mostly expanding on the previous comment. Feel free to ignore:

The VMM cannot directly request PVH boot, i.e. there is not "p_type is PVH" selection by the VMM. It simply requests to load a kernel that is in the ELF format by calling the loader::elf::Elf::load() method.

The ELF header has a field (e_entry) which holds the default entry point for the executable. i.e. this is the virtual address that is set in %rip by the VMM configuration methods, and the guest will start executing code at this address when it is launched. Before PVH support, this value in e_entry was the only entry point that could be used to boot a guest with an ELF (vmlinux) kernel. I'll call this "Linux boot", and this is the way it currently works in Firecracker:
https://github.com/firecracker-microvm/firecracker/blob/32c5bc3360cb3ee4d836faaa580e5fbfb314ef66/src/kernel/src/loader/mod.rs#L145

The PVH code changes that were merged earlier do not replace the mechanism above, they just add an alternative method, so the VMM now has two choices. Besides returning the default entry point contained in e_entry, the load() method now also returns to the VMM an optional field called pvh_entry_addr, which if present, tells the VMM that there is another address in the kernel where it can start executing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies if this is too pedantic, but I think we should actually write the pvh_entry_addr field of the result if kernel_offset is present, but its value is 0.

Another thought was to use an enum with three variants for pvh_entry_addr (i.e. present/none/ignored) instead of just an Option, but I couldn't think of a scenario where we'd actively look for the distinction, so that probably doesn't add that much value (might be useful for debugging though). WDYT?

Copy link
Contributor Author

@aljimenezb aljimenezb Jun 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies if this is too pedantic, but I think we should actually write the pvh_entry_addr field of the result if kernel_offset is present, but its value is 0.

That is a good point that I didn't considered initially. But it seems to me that it contradicts the assertion that if kernel_offset is requested, then we do not consider PVH at all. We would be adding a special case in the documented behavior with very little advantage.
That is my current take on it, but if you think that it has any use cases or makes things simpler from the VMM side let me know and I'll make the change.

Another thought was to use an enum with three variants for pvh_entry_addr (i.e. present/none/ignored) instead of just an Option, but I couldn't think of a scenario where we'd actively look for the distinction, so that probably doesn't add that much value (might be useful for debugging though). WDYT?

When adding PVH support I was trying to make minimal changes, and also didn't foresee that kernel_start would be a problem, so an Option was enough.

The downsides I see with using the enum are:

  • It would be a breaking change for VMMs (e.g. changes are needed in Cloud Hypervisor and rust-hypervisor-firmware)
  • The ignored variant would not be needed if there are future Linux kernel changes to allow PVH to work with kernel_offset

Again, I would have to defer to you and others on the discussion more experienced with the VMM implementations as to whether having an enum would make things better for the current main consumer (ICH) and future ones (Firecracker). I would opt to keep it simple and keep using the Option type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One note on breaking changes: this crate is not even published yet, so I would like to make sure we have the appropriate interface before publishing it. I don't think we have any contract that we will not break the interface before (or even after) publishing. We are working with 0.x.x versions, so we aren't advertising it as stable.

@sameo @rbradford @sboeuf any thoughts on this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late reply. Regarding the different behavior for kernel_offset == 0, I was thinking that maybe the value of the offset is specified programmatically in some cases, and when it's 0 the logic can detect that PVH boot is also available as an option and maybe use it. I guess it's just a synthetic thought experiment at this point, so the way things are right now in the PR is also fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point above, but I think the interface is simpler to document and understand for the users of the crate when:
kernel_offset.is_none() ==> PVH is not available
kernel_offset.is_some() ==> PVH is available
This is predictable and clearly defined. If we add a special case for kernel_offset == 0, that blurs the boundaries between those cases. A user of the crate now needs implementation specific knowledge to understand why kernel_offset == 0 behaves differently.

If the user wants to look into the implementation details then it should be clear enough to understand that kernel_offset = Some(0) is the same as kernel_offset = None (when it comes to controlling where the kernel is loaded), but you can use PVH with the second variant.

So I'll keep the current approach, since you also mentioned above that you are fine with it. This is also a short change that could be added quickly if the need arises later.

// build by the value of CONFIG_PHYSICAL_START). Therefore, only attempt
// to use PVH if an offset from the default load address has not been
// requested using the kernel_offset parameter.
if phdr.p_type == elf::PT_NOTE && kernel_offset.is_none() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies if this is too pedantic, but I think we should actually write the pvh_entry_addr field of the result if kernel_offset is present, but its value is 0.

Another thought was to use an enum with three variants for pvh_entry_addr (i.e. present/none/ignored) instead of just an Option, but I couldn't think of a scenario where we'd actively look for the distinction, so that probably doesn't add that much value (might be useful for debugging though). WDYT?

alxiord
alxiord previously approved these changes Jun 15, 2020
@aljimenezb aljimenezb force-pushed the kernel-start-fixes branch 7 times, most recently from 2b586ed to 6ce416e Compare June 16, 2020 19:09
@aljimenezb
Copy link
Contributor Author

aljimenezb commented Jun 16, 2020

Added changes implementing @alexandruag idea of using an enum for the PVH field in KernelLoaderResult. That is still pending feedback from @sameo @rbradford @sboeuf so I can revert it depending on their comments.

sboeuf
sboeuf previously approved these changes Jun 16, 2020
Copy link

@sboeuf sboeuf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!
Nice work @aljimenezb

@aljimenezb aljimenezb force-pushed the kernel-start-fixes branch 4 times, most recently from b1246be to 95e4394 Compare June 16, 2020 22:38
Copy link
Contributor

@alexandruag alexandruag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Can you please add something like Related to https://github.com/rust-vmm/linux-loader/issues/12 to the message body of each commit? This is a guideline we're trying to implement right now to help with keeping track of things.

@@ -121,7 +154,9 @@ impl Elf {
impl KernelLoader for Elf {
/// Loads a kernel from a vmlinux elf image into guest memory.
///
/// The kernel is loaded into guest memory at offset `phdr.p_paddr` specified by the elf image.
/// By default, the kernel is loaded into guest memory at offset `phdr.p_paddr` specified by the elf image.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a style comment: can you please reformat these comment lines to keep them under 100 chars?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks for the heads up. For some reason my editor (VSCode) does not warn me about those.

alxiord
alxiord previously approved these changes Jun 17, 2020
The load() method in the KernelLoader trait takes a kernel_start argument
that has different behavior in the Elf, BzImage, and PE implementations.

Elf: kernel_start provides an offset to be added to the default kernel load
address encoded in the phdr.p_paddr field of the Elf header. A VMM using
kernel_start requests that the kernel is loaded in guest memory at
phdr.p_paddr + kernel_start.

BzImage: kernel_start allows to specify an absolute address at which to load
the kernel in guest memory, replacing the default address taken from the
code32_start field of the bzImage header.

PE: kernel_start specifies a 2MB-aligned base address in guest memory.
The kernel must be loaded at kernel_start + text_offset, where text_offset
value is read from the Image header.

Change the name from kernel_start to kernel_offset, and change the
argument description on each implementation to clarify its use.

Related to rust-vmm#12

Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
According to https://www.kernel.org/doc/Documentation/arm64/booting.txt:

"The Image must be placed text_offset bytes from a 2MB aligned base
address anywhere in usable system RAM and called there."

For the aarch64 case, the kernel_offset parameter allows a VMM to
specify the desired base address at which to load the image in
guest memory, while text_offset is contained in the Image header.

Ensure that the base address requested by the VMM meets the 2 MB
alignment requirement.

Related to rust-vmm#12

Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Refactor the PVH code to return a field in KernelLoaderResult
that specifies whether the PVH capability is present or not,
or if it is being purposedly ignored.

The only current reason for ignoring the PVH capability is when
a kernel_offset is requested. Since PVH boot protocol expects
the kernel to be loaded at the default load address, it cannot
be used when a specific load offset has been requested by the
VMM using the kernel_offset parameter.

Related to rust-vmm#12

Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
@aljimenezb
Copy link
Contributor Author

Looks good! Can you please add something like Related to https://github.com/rust-vmm/linux-loader/issues/12 to the message body of each commit? This is a guideline we're trying to implement right now to help with keeping track of things.

Done. Added it right before the Signed-off line.

@alxiord alxiord merged commit ec930d7 into rust-vmm:master Jun 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Setting load address doesn't work for PVH loader
5 participants