Based on our discussion in-person yesterday it seems necessary to
separate the concept of runtime configuration from application
configuration. There are a few motivators:
- To support runtime updates of things like cgroups, rlimits, etc we
should separate things that are inherently runtime specific from
things that are static to the application running in the container.
- To support the goal of being able to move a bundle between hosts we
should make it clear what parts of the spec are and are not portable
between hosts so that upon landing on a new host the non-portable
options may be rewritten or removed.
- In order to attach a cryptographic identity to a bundle we must not
include details in the bundle that are host specific.
bundle.md: 10 additions & 10 deletions
@@ -12,19 +12,19 @@ A standard container bundle is made of the following 3 parts:
 
 # Directory layout
 
-A Standard Container bundle is a directory containing all the content needed to load and run a container. This includes its configuration file (`config.json`) and content directories. The main property of this directory layout is that it can be moved as a unit to another machine and run the same container.
+A Standard Container bundle is a directory containing all the content needed to load and run a container.
+This includes two configuration files `config.json` and `runtime.json`, and a rootfs directory.
+The `config.json` file contains settings that are host independent and application specific such as security permissions, environment variables and arguments.
+The `runtime.json` file contains settings that are host specific such as memory limits, local device access and mount points.
+The goal is that the bundle can be moved as a unit to another machine and run the same application if `runtime.json` is removed or reconfigured.
 
 The syntax and semantics for `config.json` are described in [this specification](config.md).
 
-One or more *content directories* may be adjacent to the configuration file. This must include at least the root filesystem (referenced in the configuration file by the *root* field) and may include other related content (signatures, other configs, etc.). The interpretation of these resources is specified in the configuration. The names of the directories may be arbitrary, but users should consider using conventional names as in the example below.
+A single `rootfs` directory MUST be in the same directory as the `config.json`.
+The names of the directories may be arbitrary, but users should consider using conventional names as in the example below.
config-linux.md: 9 additions & 203 deletions
@@ -5,142 +5,7 @@ cgroups, capabilities, LSM, and file system jails to fulfill the spec.
 Additional information is needed for Linux over the [default spec configuration](config.md)
 in order to configure these various kernel features.
 
-## Linux namespaces
-
-A namespace wraps a global system resource in an abstraction that makes it
-appear to the processes within the namespace that they have their own isolated
-instance of the global resource. Changes to the global resource are visible to
-other processes that are members of the namespace, but are invisible to other
-processes. For more information, see [the man page](http://man7.org/linux/man-pages/man7/namespaces.7.html)
-
-Namespaces are specified in the spec as an array of entries. Each entry has a
-type field with possible values described below and an optional path element.
-If a path is specified, that particular file is used to join that type of namespace.
-
-```json
-    "namespaces": [
-        {
-            "type": "pid",
-            "path": "/proc/1234/ns/pid"
-        },
-        {
-            "type": "net",
-            "path": "/var/run/netns/neta"
-        },
-        {
-            "type": "mnt",
-        },
-        {
-            "type": "ipc",
-        },
-        {
-            "type": "uts",
-        },
-        {
-            "type": "user",
-        },
-    ]
-```
-
-#### Namespace types
-
-* **pid** processes inside the container will only be able to see other processes inside the same container.
-* **network** the container will have its own network stack.
-* **mnt** the container will have an isolated mount table.
-* **ipc** processes inside the container will only be able to communicate to other processes inside the same
-  container via system level IPC.
-* **uts** the container will be able to have its own hostname and domain name.
-* **user** the container will be able to remap user and group IDs from the host to local users and groups
-  within the container.
-
-### Access to devices
-
-Devices is an array specifying the list of devices to be created in the container.
-Next parameters can be specified:
-
-* type - type of device: 'c', 'b', 'u' or 'p'. More info in `man mknod`
-* path - full path to device inside container
-* major, minor - major, minor numbers for device. More info in `man mknod`.
-  There is special value: `-1`, which means `*` for `device`
-  cgroup setup.
-* permissions - cgroup permissions for device. A composition of 'r'
-  (read), 'w' (write), and 'm' (mknod).
-* fileMode - file mode for device file
-* uid - uid of device owner
-* gid - gid of device owner
-
-```json
-    "devices": [
-        {
-            "path": "/dev/random",
-            "type": "c",
-            "major": 1,
-            "minor": 8,
-            "permissions": "rwm",
-            "fileMode": 0666,
-            "uid": 0,
-            "gid": 0
-        },
-        {
-            "path": "/dev/urandom",
-            "type": "c",
-            "major": 1,
-            "minor": 9,
-            "permissions": "rwm",
-            "fileMode": 0666,
-            "uid": 0,
-            "gid": 0
-        },
-        {
-            "path": "/dev/null",
-            "type": "c",
-            "major": 1,
-            "minor": 3,
-            "permissions": "rwm",
-            "fileMode": 0666,
-            "uid": 0,
-            "gid": 0
-        },
-        {
-            "path": "/dev/zero",
-            "type": "c",
-            "major": 1,
-            "minor": 5,
-            "permissions": "rwm",
-            "fileMode": 0666,
-            "uid": 0,
-            "gid": 0
-        },
-        {
-            "path": "/dev/tty",
-            "type": "c",
-            "major": 5,
-            "minor": 0,
-            "permissions": "rwm",
-            "fileMode": 0666,
-            "uid": 0,
-            "gid": 0
-        },
-        {
-            "path": "/dev/full",
-            "type": "c",
-            "major": 1,
-            "minor": 7,
-            "permissions": "rwm",
-            "fileMode": 0666,
-            "uid": 0,
-            "gid": 0
-        }
-    ]
-```
-
-## Linux control groups
-
-Also known as cgroups, they are used to restrict resource usage for a container and handle
-device access. cgroups provide controls to restrict cpu, memory, IO, and network for
-the container. For more information, see the [kernel cgroups documentation](https://www.kernel.org/doc/Documentation/cgroups/cgroups.txt)
-
-## Linux capabilities
+## Capabilities
 
 Capabilities is an array that specifies Linux capabilities that can be provided to the process
 inside the container. Valid values are the string after `CAP_` for capabilities defined
@@ -154,33 +19,15 @@ in [the man page](http://man7.org/linux/man-pages/man7/capabilities.7.html)
     ]
 ```
 
-## Linux sysctl
-
-sysctl allows kernel parameters to be modified at runtime for the container.
-For more information, see [the man page](http://man7.org/linux/man-pages/man8/sysctl.8.html)
-
-```json
-    "sysctl": {
-        "net.ipv4.ip_forward": "1",
-        "net.core.somaxconn": "256"
-    }
-```
+## Rootfs Mount Propagation
 
-## Linux rlimits
+rootfsPropagation sets the rootfs's mount propagation. Its value is either slave, private, or shared. [The kernel doc](https://www.kernel.org/doc/Documentation/filesystems/sharedsubtree.txt) has more information about mount propagation.
 
 ```json
-    "rlimits": [
-        {
-            "type": "RLIMIT_NPROC",
-            "soft": 1024,
-            "hard": 102400
-        }
-    ]
+    "rootfsPropagation": "slave",
 ```
 
-rlimits allow setting resource limits. The type is from the values defined in [the man page](http://man7.org/linux/man-pages/man2/setrlimit.2.html). The kernel enforces the soft limit for a resource while the hard limit acts as a ceiling for that value that could be set by an unprivileged process.
-
-## Linux user namespace mappings
+## User namespace mappings
 
 ```json
     "uidMappings": [
@@ -199,48 +46,7 @@ rlimits allow setting resource limits. The type is from the values defined in [t
     ]
 ```
 
-uid/gid mappings describe the user namespace mappings from the host to the container. *hostID* is the starting uid/gid on the host to be mapped to *containerID* which is the starting uid/gid in the container and *size* refers to the number of ids to be mapped. The Linux kernel has a limit of 5 such mappings that can be specified.
-
-## Rootfs Mount Propagation
-rootfsPropagation sets the rootfs's mount propagation. Its value is either slave, private, or shared. [The kernel doc](https://www.kernel.org/doc/Documentation/filesystems/sharedsubtree.txt) has more information about mount propagation.
-
-```json
-    "rootfsPropagation": "slave",
-```
-
-## Selinux process label
-
-Selinux process label specifies the label with which the processes in a container are run.
-For more information about SELinux, see [Selinux documentation](http://selinuxproject.org/page/Main_Page)
-Apparmor profile specifies the name of the apparmor profile that will be used for the container.
-For more information about Apparmor, see [Apparmor documentation](https://wiki.ubuntu.com/AppArmor)
-
-```json
-    "apparmorProfile": "acme_secure_profile"
-```
-
-## Seccomp
-
-Seccomp provides application sandboxing mechanism in the Linux kernel.
-Seccomp configuration allows one to configure actions to take for matched syscalls and furthermore also allows
-matching on values passed as arguments to syscalls.
-For more information about Seccomp, see [Seccomp kernel documentation](https://www.kernel.org/doc/Documentation/prctl/seccomp_filter.txt)
-The actions and operators are strings that match the definitions in seccomp.h from [libseccomp](https://github.com/seccomp/libseccomp) and are translated to corresponding values.
-
-```json
-    "seccomp": {
-        "defaultAction": "SCMP_ACT_ALLOW",
-        "syscalls": [
-            {
-                "name": "getcwd",
-                "action": "SCMP_ACT_ERRNO"
-            }
-        ]
-    }
-```
+uid/gid mappings describe the user namespace mappings from the host to the container.
+The mappings represent how the bundle `rootfs` expects the user namespace to be set up and the runtime SHOULD NOT modify the permissions on the rootfs to realize the mapping.
+*hostID* is the starting uid/gid on the host to be mapped to *containerID* which is the starting uid/gid in the container and *size* refers to the number of ids to be mapped.
+There is a limit of 5 mappings which is the Linux kernel hard limit.
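
To make the *hostID* / *containerID* / *size* semantics concrete, here is a small sketch of how a mapping entry translates a container id to a host id. It is illustrative only: `IDMapping`, `validateMappings` and `hostIDFor` are hypothetical names that do not come from this change, and the limit of 5 is taken from the text above, not checked against any particular kernel.

```go
package main

import (
	"errors"
	"fmt"
)

// IDMapping mirrors one uid/gid mapping entry as described in the text.
// The Go type itself is a hypothetical name used for this sketch.
type IDMapping struct {
	HostID      uint32 // starting uid/gid on the host
	ContainerID uint32 // starting uid/gid in the container
	Size        uint32 // number of ids covered by the mapping
}

// maxMappings is the limit of 5 mappings stated in the text.
const maxMappings = 5

var errNoMapping = errors.New("id is not covered by any mapping")

// validateMappings rejects lists longer than the stated limit.
func validateMappings(ms []IDMapping) error {
	if len(ms) > maxMappings {
		return fmt.Errorf("%d mappings given, at most %d allowed", len(ms), maxMappings)
	}
	return nil
}

// hostIDFor returns the host id that a container id maps to.
func hostIDFor(ms []IDMapping, containerID uint32) (uint32, error) {
	for _, m := range ms {
		// The offset is computed only after the lower-bound check, so the
		// unsigned subtraction cannot wrap around.
		if containerID >= m.ContainerID && containerID-m.ContainerID < m.Size {
			return m.HostID + (containerID - m.ContainerID), nil
		}
	}
	return 0, errNoMapping
}

func main() {
	// Container ids 0..9 map to host ids 1000..1009.
	ms := []IDMapping{{HostID: 1000, ContainerID: 0, Size: 10}}
	if err := validateMappings(ms); err != nil {
		panic(err)
	}
	id, err := hostIDFor(ms, 3)
	fmt.Println(id, err) // prints "1003 <nil>"
}
```

Because the rootfs is expected to already carry ownership consistent with the mapping, a lookup like this would only be needed for inspection or validation, not for rewriting file permissions.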
config.go: 7 additions & 29 deletions
@@ -14,30 +14,7 @@ type Spec struct {
 	// Hostname is the container's host name.
 	Hostname string `json:"hostname"`
 	// Mounts profile configuration for adding mounts to the container's filesystem.
-	Mounts []Mount `json:"mounts"`
-	// Hooks are the commands run at various lifecycle events of the container.
-	Hooks Hooks `json:"hooks"`
-}
-
-type Hooks struct {
-	// Prestart is a list of hooks to be run before the container process is executed.
-	// On Linux, they are run after the container namespaces are created.
-	Prestart []Hook `json:"prestart"`
-	// Poststop is a list of hooks to be run after the container process exits.
-	Poststop []Hook `json:"poststop"`
-}
-
-// Mount specifies a mount for a container.
-type Mount struct {
-	// Type specifies the mount kind.
-	Type string `json:"type"`
-	// Source specifies the source path of the mount. In the case of bind mounts on
-	// linux based systems this would be the file on the host.
-	Source string `json:"source"`
-	// Destination is the path where the mount will be placed relative to the container's root.
-	Destination string `json:"destination"`
-	// Options are fstab style mount options.
-	Options string `json:"options"`
+	MountPoints []MountPoint `json:"mounts"`
 }
 
 // Process contains information to start a specific application inside the container.
@@ -72,9 +49,10 @@ type Platform struct {
 	Arch string `json:"arch"`
 }
 
-// Hook specifies a command that is run at a particular event in the lifecycle of a container.
-type Hook struct {
-	Path string `json:"path"`
-	Args []string `json:"args"`
-	Env []string `json:"env"`
+// MountPoint describes a directory that may be fulfilled by a mount in the runtime.json.
+type MountPoint struct {
+	// Name is a unique descriptive identifier for this mount point.
+	Name string `json:"name"`
+	// Path specifies the path of the mount. The path and child directories MUST exist, a runtime MUST NOT create directories automatically to a mount point.
config.md: 1 addition & 56 deletions
@@ -1,6 +1,6 @@
 # Configuration file
 
-The container’s top-level directory MUST contain a configuration file called `config.json`.
+The container's top-level directory MUST contain a configuration file called `config.json`.
 For now the canonical schema is defined in [spec.go](spec.go) and [spec_linux.go](spec_linux.go), but this will be moved to a formal JSON schema over time.
 
 The configuration file contains metadata necessary to implement standard operations against the container.
@@ -34,61 +34,6 @@ Each container has exactly one *root filesystem*, specified in the *root* object
     }
 ```
 
-## Mount Configuration
-
-Additional filesystems can be declared as "mounts", specified in the *mounts* array. The parameters are similar to the ones in Linux mount system call. [http://linux.die.net/man/2/mount](http://linux.die.net/man/2/mount)
-
-* **type** (string, required) Linux, *filesystemtype* argument supported by the kernel are listed in */proc/filesystems* (e.g., "minix", "ext2", "ext3", "jfs", "xfs", "reiserfs", "msdos", "proc", "nfs", "iso9660"). Windows: ntfs
-* **source** (string, required) a device name, but can also be a directory name or a dummy. Windows, the volume name that is the target of the mount point. \\?\Volume\{GUID}\ (on Windows source is called target)
-* **destination** (string, required) where the source filesystem is mounted relative to the container rootfs.
-* **options** (string, optional) in the fstab format [https://wiki.archlinux.org/index.php/Fstab](https://wiki.archlinux.org/index.php/Fstab).
-            "destination": "C:\\Users\\crosbymichael\\My Fancy Mount Point\\",
-            "options": ""
-        }
-    ]
-```
-
-See links for details about [mountvol](http://ss64.com/nt/mountvol.html) and [SetVolumeMountPoint](https://msdn.microsoft.com/en-us/library/windows/desktop/aa365561(v=vs.85).aspx) in Windows.
-
 ## Process configuration
 
 * **terminal** (bool, optional) specifies whether you want a terminal attached to that process. Defaults to false.