Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add VM-based runtime configuration details. #405

Closed
wants to merge 4 commits into from

Conversation

jamesodhunt
Copy link

First pass, based on discussions on the mailing list:

https://groups.google.com/a/opencontainers.org/d/topic/dev/mM_DWZmXst0/discussion

Signed-off-by: James Hunt james.o.hunt@intel.com

config.md Outdated

## RuntimeDriver

RuntimeDriver is an optional driver name to be passed to VM-based runtimes.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the intention of RuntimeDriver to select between Xen, KVM, Qemu and other hypervisors?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was the intention, yes. Since HypervisorPath is already specified, maybe this field is redundant as named. Maybe rather than RuntimeDriver, something like HypervisorPersonality would be more useful. I added RuntimeDriver specifically for the benefit of runv so would be good to get input from them as to whether HypervisorPath is sufficient.

@laijs
Copy link
Contributor

laijs commented May 5, 2016

LGTM

config.md Outdated
@@ -265,6 +265,19 @@ Annotations are key-value maps.
}
```

## ImagePath

ImagePath is an optional absolute path to the root filesystem image used by VM-based runtimes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe:

  • imagePath (string, optional) path to the root filesystem image used by VM-based runtimes.

Why do you currently require an absolute path?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And docs around the need for this setting vs. root.path would be nice (cherry-pick from here?). I'm backing off of my earlier suggestion to put all of this information in root.path, because it would be awkward to combine a POSIX path-to-image with a Windows path-to-root-inside-the-image (if you were launching a Windows VM on from a POSIX system).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imagePath needs to be absolute since it will not form part of the bundle, so the runtime needs to know exactly where on the host to find it.

I've updated the PR with further details on imagePath as suggested.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Wed, May 11, 2016 at 02:12:16AM -0700, James Hunt wrote:

+ImagePath is an optional absolute path to the root filesystem image used by VM-based runtimes.

imagePath needs to be absolute since it will not form part of the
bundle, so the runtime needs to know exactly where on the host to
find it.

With Linux-namespace containers, the runtime knows exactly where to
find root.path without it having to be absolute (relative paths are
interpreted relative to the directory containining config.json). So I
don't see a reason to require absolute paths. Then folks could put
the image in the bundle if they wanted, or point at an absolute path
if they wanted.

@jamesodhunt
Copy link
Author

@crosbymichael, @wking - how does this look?

config.md Outdated
* **`imagePath`** (string, required) path to file that represents the root filesystem for the virtual machine.
* **`kernel`** (object, required) specifies details of the kernel to boot the virtual machine with.

Note that `imagePath` refers to a path on the host (outside of the virtual machine). This field is distinct from the **`path`** field in the (#Root-Configuration) section since in the context of a virtual machine-based runtime:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be hyphen-happy, but I'm pretty sure a “machine-based” hyphen should also come with a “virtual-machine” hyphen. Whether “virtual-machine-based” or “virtual-machine based” is better is beyond me ;).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wking
Copy link
Contributor

wking commented May 19, 2016

4c8a03c looks pretty good to me (I just have copy-edit suggestions
[1,2]). It would also be nice to get a ‘vm’ entry in here 3. And
with Solaris and Linux docs in separate config-{platform}.json files,
it may make sense to put the VM stuff in config-vm.json too.

@jamesodhunt
Copy link
Author

Hi @wking - I've pushed updates to all your comments except [3] - I know we discussed this on the mailing list, but I'm still slightly wary of having "platform.os" set as "vm" as that precludes the vm-based runtimes that use Linux from using any details from the "linux" object (I think?)

Also, using a real-world example, I believe docker/containerd would be required to know that a containers runtime was set to a vm-based one and therefore not generate a "linux" object. Or alternatively, something like containerd would need to in-flight modify config.json to:

  • remove the "linux" object.
  • change "platform.os" to vm.

This doesn't sound ideal, but maybe @crosbymichael could comment on this.

How about instead we add something like "platform.type" or "platform.variant"?

        "platform": {
                "os": "linux",
                "arch": "amd64",
                "variant": "vm"
        },

Where variant would be optional but if set a runtime could consume both the "os" and "variant" objects.

The only further thought I had was on the kernel parameters: it might be more elegant to encode them as an array, but more practical to specify them in string form.

[3] - https://github.com/opencontainers/runtime-spec/blob/d8d0bc6f601f98122f8908eb21158e32432ab563/config.md#platform-specific-configuration

@wking
Copy link
Contributor

wking commented May 20, 2016

On Fri, May 20, 2016 at 07:15:46AM -0700, James Hunt wrote:

Hi @wking - I've pushed updates to all your comments except [3]…

Thanks. And just a reminder that I'm not a maintainer, so please take
my comments with a grain of salt ;).

    "platform": {
            "os": "linux",
            "arch": "amd64",
            "variant": "vm"
    },

I'm not clear on how VM-based containers are intended to manage the
host/guest split 1. Are platform.os and platform.arch for the
guest? If so, how does a runtime know if the config is targetting the
right host OS? For example, hooks, imagePath, etc. all need to be in
the host-OS language. Instead of adding a variant field, I think VM
support calls for an explicit declaration like:

"platform": {
"host": {
"os": "linux",
"arch": "amd64",
},
"guest: {
"os": "linux",
"arch": "amd64",
}
}

Where runtimes that support VMs require ‘guest’ and look for a ‘vm’
section, runtimes that don't support VMs error if ‘guest’ is present,
and the validator requires a ‘vm’ section if and only if ‘guest’ is
present.

@wking
Copy link
Contributor

wking commented May 20, 2016

On Fri, May 20, 2016 at 11:25:28AM -0700, W. Trevor King wrote:

Are platform.os and platform.arch for the guest? If so, how does a
runtime know if the config is targetting the right host OS?

The other reasonable approach here is to punt platform information to
higher levels (it's already in the image spec:
opencontainers/image-spec#60). Then have the runtime just assume that
the config targets platforms it can handle. The platform fields seem
to be designed to let the runtime guess ahead of time whether it can
launch the given container, and since #418 we don't have to check
ahead of time (we can just try open the kernel path, for example, and
error out if that path doesn't exist because the config uses
forward-slash path separators but your host uses backslash path
separators). With this approach:

  • We drop ‘platform’ entirely.
  • VM-based runtimes read the ‘vm’ section, and error out if it's
    missing.
  • Non-VM runtimes ignore ‘vm’ even if it is present.
  • Linux(-guest)-based runtimes read a ‘linux’ section, ignoring
    ‘solaris’, etc. if they're present.

It seems unlikely that the ‘platform’ section ever covers the full
scope of what a container needs (e.g. was your kernel compiled with
the right settings? Where any of the bundled binaries compiled with
-march=native (or other extremely-specific arch setting)?). On the
other hand, ‘platform’ is a fairly reliable solution even if it isn't
perfect, and folks for whom it is not specific enough likely know what
they're up to.

So I'm fine keeping it (and splitting host/guest) or dropping it (and
punting to image-spec), but I don't think keeping it as it stands is
viable.

@crosbymichael
Copy link
Member

@jamesodhunt i don't think we need to change anything around populating the platform to VM.

LGTM

@wking
Copy link
Contributor

wking commented May 20, 2016

On Fri, May 20, 2016 at 02:16:25PM -0700, Michael Crosby wrote:

i don't think we need to change anything around populating the
platform to VM.

How do you see the existing structure being populated for the various
permutations of Linux/Windows host/guest? Or are you only planning to
support VM configs where the host and guest are the same platform?

config-vm.md Outdated
Note that `imagePath` refers to a path on the host (outside of the virtual machine).
This field is distinct from the **`path`** field in the (#Root-Configuration) section since in the context of a virtual-machine-based runtime:

* **`ImagePath`** will represent the root filesystem for the virtual machine.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/ImagePath/imagePath ?

@crosbymichael crosbymichael added this to the 1.0.0 milestone Jun 1, 2016
@philips
Copy link
Contributor

philips commented Jun 3, 2016

I am sorry, I don't understand why this configuration data must live in the runtime specification and isn't just part of the flags or configuration given to the runtime. These three parameters don't seem fundamental to me. In rkt when someone wants to run their container in a VM they just add a flag that changes the execution to a VM. They don't put in a path to a kernel, initrd, etc.

@philips
Copy link
Contributor

philips commented Jun 3, 2016

Rejected

Rejected with PullApprove

@crosbymichael crosbymichael removed this from the 1.0.0 milestone Jun 3, 2016
@hqhq
Copy link
Contributor

hqhq commented Aug 17, 2017

@opencontainers/runtime-spec-maintainers Can we starting reviewing this again? What's the major block here?

@warmchang
Copy link

"This branch has conflicts that must be resolved", should rebase and commit again.

@hqhq
Copy link
Contributor

hqhq commented Nov 6, 2017

After reading through this thread, the only blocker I see is to see some commonality for VMs, and we had a weekly meeting to discuss on March, according to the minutes: http://ircbot.wl.linuxfoundation.org/meetings/opencontainers/2017/opencontainers.2017-03-15-21.02.log.html , @RobDolinMS have pinged some Windows folks to take a look, how's that going? @RobDolinMS

Also ping @opencontainers/runtime-spec-maintainers how's other maintainers think on this? I'd like to push this as it makes sense and is very useful for container ecosystem, it's been too long without any progress.

@sameo
Copy link

sameo commented Nov 6, 2017

@hqhq with CC 3.0 use a libcontainer based agent on the VM where we're basically able to apply most of the OCI runtime resource constraints to the container inside the VM.
We will rebase this PR very shortly and may add it one of the next OCI meetings agenda to re-discuss it.

@hqhq
Copy link
Contributor

hqhq commented Nov 8, 2017

@sameo Thanks for your response, after more thoughts, I have a question about this PR.

If we add an entry to OCI config, what should happen if we try to validate or run a non-vm-based runtime (such like runc) with a config file which has a vm entry in it? Should we say the config file is invalid? As we have definition in runtime.md

All of the properties configured in config.json except for process MUST be applied. process.args MUST NOT be applied until triggered by the start operation. The remaining process properties MAY be applied by this operation. If the runtime cannot apply a property as specified in the configuration, it MUST generate an error and a new container MUST NOT be created.

Or vise-versa, what if we validate or run a vm-based runtime with a config without a vm entry? Since this entry is optional, runtime should run a container without an optional property, right?

I kind of understand @philips 's concern on this, spec should contain properties those are directly related to the container processes configuration. But I also think the container word in Open Container Initiative should not be tied to the concept of Linux container (namespace + cgroup), all isolated environment can be called container, so container can be more than processes.

Also ping @liangchenye for runtime-tools 's perspective on this PR.

@liangchenye
Copy link
Member

The runtime-tools will have four tests:

case runtime config with vm expected result
1 runc yes yes
2 runc no yes
3 runv yes yes
4 runv no no

Just as @hqhq refers, it is inconsistent with current spec (case 1).
runc could/should not apply vm config, so maybe we should change the spec to:

If the runtime cannot apply a required property as specified in the configuration,
it MUST generate an error and a new container MUST NOT be created.

Another problem is it is not possible to determine if a runtime is vm-based (case 4).
So runtime-tools should category runtimes and provide different test cases for vm-based runtime and non-vm-based runtime.

@liangchenye
Copy link
Member

Also, we might need to change image spec to add a 'vm' config.
An image fetcher needs to pull the right 'vm' as well as pull with the right arch and os.

@sameo
Copy link

sameo commented Nov 8, 2017

@hqhq

Or vise-versa, what if we validate or run a vm-based runtime with a config without a vm entry? Since this entry is optional, runtime should run a container without an optional property, right?

The VM section is optional so non VM based runtimes should simply ignore it. The same way I assume runc today is capable of creating and running a container based on a config.json even if that file contains both a Windows and Linux sections. It simply ignores the Windows one.

And VM based runtimes must support configs without a VM section today and will have to do so in the future, as it's an optional section.

I kind of understand @philips 's concern on this, spec should contain properties those are directly related to the container processes configuration. But I also think the container word in Open Container Initiative should not be tied to the concept of Linux container (namespace + cgroup), all isolated environment can be called container, so container can be more than processes.

Yes, I fully agree. A VM is one addtional, optional isolation layer for containers. And it defines how the process could run.
There's already another example for this in the current spec, in the Windows section:

	// HyperV contains information for running a container with Hyper-V isolation.
	HyperV *WindowsHyperV `json:"hyperv,omitempty"`

So it seems hardware virtualization is already an accepted isolation mechanism for containers.
BTW, Windows/MSFT folks should be involved in that discussion as the VM section we're trying to add would basically supersede the current WindowsHyperV section for Windows containers.

@sameo
Copy link

sameo commented Nov 8, 2017

@liangchenye

Also, we might need to change image spec to add a 'vm' config.
An image fetcher needs to pull the right 'vm' as well as pull with the right arch and os.

I would disagree with that. The container image is orthogonal to the VM it's going to run into, there should not be any link between a container image and a VM definition.

@sameo
Copy link

sameo commented Nov 8, 2017

@liangchenye

Another problem is it is not possible to determine if a runtime is vm-based (case 4).

Pretty much, yes. So I think case 4 should be positive, i.e. runv/clearcontainers should continue supporting config files without a VM section. The same way they support it today.

@liangchenye
Copy link
Member

i.e. runv/clearcontainers should continue supporting config files without a VM section. The same way they support it today.

It means runc/cc has and will use its default VM section, right?

I would disagree with that. The container image is orthogonal to the VM it's going to run into, there should not be any link between a container image and a VM definition.

Images follows this flow: ocihub -- fetcher--- scheduler ---- runtime.
So the scenario will be like this?

  1. deploy a scheduler
  2. add a runtime to the scheduler
  3. tell the scheduler the runtime's VM status(if it is vm based)
  4. the scheduler uses a fetcher to get an image from ocihub
  5. scheduler add the VM section
  6. runtime runs the bundle with VM secion

hqhq added a commit to hqhq/runtime-spec that referenced this pull request Nov 9, 2017
Address the issue mentioned in:
opencontainers#405 (comment)

We already doing this in runc as we can run a container
with a config which has `windows` entry. And this is
the right way to handle it.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
hqhq added a commit to hqhq/runtime-spec that referenced this pull request Nov 9, 2017
Address the issue mentioned in:
opencontainers#405 (comment)

We already doing this in runc as we can run a container
with a config which has `windows` entry. And this is
the right way to handle it.

Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
@sameo
Copy link

sameo commented Nov 9, 2017

@liangchenye

It means runc/cc has and will use its default VM section, right?

We don't append a VM section in the OCI config file, but we use configuration files for setting default kernel and guest image paths.

  • deploy a scheduler
  • add a runtime to the scheduler
  • tell the scheduler the runtime's VM status(if it is vm based)

Not sure you need to do this. From e.g. kubernetes, you could add specific annotations to your pod to specify a kernel and image path you want to use for this specific pod. The scheduler/orchestrator does not really need to know that the runtime is VM based, it will at some point add a VM section to the OCI config file based on the pod/workload metadata it received.

  • the scheduler uses a fetcher to get an image from ocihub
  • scheduler add the VM section

Yes, although I would replace scheduler with "one of the scheduler's component".

  • runtime runs the bundle with VM secion

Yes.

Add configuration details for virtual-machine-based runtimes.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
All paths specified in the configuration for virtual-machine-based
runtimes are host-side.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
@cyphar
Copy link
Member

cyphar commented Dec 8, 2017

Has the plan for this changed since the announcement of https://katacontainers.io/ ? Personally I think Kata should inform the spec work.

@vbatts
Copy link
Member

vbatts commented Apr 4, 2018

#949 was merged.

@vbatts vbatts closed this Apr 4, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.