config: Make capabilities, noNewPrivileges, and rlimits Linux-only (again)

Roll back the genericization from 718f9f3 (minor narrative cleanup
regarding config compatibility, 2017-01-30, opencontainers#673). Lifting the
restriction there seems to have been motivated by "Solaris supports
capabilities", but that was before the split into a capabilities
object which happened in eb114f0 (Add ambient and bounding capability
support, 2017-02-02, opencontainers#675). It's not clear if Solaris supports
ambient caps, or what Solaris API rlimits or noNewPrivileges were
punting to [1]. And John Howard has recently confirmed that Windows
does not support capabilities and is unlikely to do so in the future
[2]. John's statement didn't directly address rlimits or
noNewPrivileges, but we can always restore any of these properties to
the Solaris/Windows platforms if/when we get docs about which API
we're punting to on those platforms.

Also add some backticks, remove the hyphens in "OPTIONAL) - the",
standardize lines I touch to use "the process" [3], and use four-space
indents here to keep Pandoc happy (see 7795661 (runtime.md: Fix
sub-bullet indentation, 2016-06-08, opencontainers#495)).

[1]: opencontainers#673 (comment)
[2]: opencontainers#810 (comment)
[3]: opencontainers#809 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>

config.md (21 additions & 18 deletions)
@@ -130,35 +130,38 @@ For Solaris, the mount entry corresponds to the 'fs' resource in the [zonecfg(1M
 * **`env`** (array of strings, OPTIONAL) with the same semantics as [IEEE Std 1003.1-2001's `environ`][ieee-1003.1-2001-xbd-c8.1].
 * **`args`** (array of strings, REQUIRED) with similar semantics to [IEEE Std 1003.1-2001 `execvp`'s *argv*][ieee-1003.1-2001-xsh-exec].
 This specification extends the IEEE standard in that at least one entry is REQUIRED, and that entry is used with the same semantics as `execvp`'s *file*.
-* **`capabilities`** (object, OPTIONAL) is an object containing arrays that specifies the sets of capabilities for the process(es) inside the container. Valid values are platform-specific. For example, valid values for Linux are defined in the [capabilities(7)][capabilities.7] man page, such as `CAP_CHOWN`. Any value which cannot be mapped to a relevant kernel interface MUST cause an error.
-capabilities contains the following properties:
-* **`effective`** (array of strings, OPTIONAL) - the `effective` field is an array of effective capabilities that are kept for the process.
-* **`bounding`** (array of strings, OPTIONAL) - the `bounding` field is an array of bounding capabilities that are kept for the process.
-* **`inheritable`** (array of strings, OPTIONAL) - the `inheritable` field is an array of inheritable capabilities that are kept for the process.
-* **`permitted`** (array of strings, OPTIONAL) - the `permitted` field is an array of permitted capabilities that are kept for the process.
-* **`ambient`** (array of strings, OPTIONAL) - the `ambient` field is an array of ambient capabilities that are kept for the process.
-* **`rlimits`** (array of objects, OPTIONAL) allows setting resource limits for a process inside the container.
-Each entry has the following structure:
-
-* **`type`** (string, REQUIRED) - the platform resource being limited, for example on Linux as defined in the [setrlimit(2)][setrlimit.2] man page.
-* **`soft`** (uint64, REQUIRED) - the value of the limit enforced for the corresponding resource.
-* **`hard`** (uint64, REQUIRED) - the ceiling for the soft limit that could be set by an unprivileged process. Only a privileged process (e.g. under Linux: one with the CAP_SYS_RESOURCE capability) can raise a hard limit.
-
-If `rlimits` contains duplicated entries with same `type`, the runtime MUST error out.
-
-* **`noNewPrivileges`** (bool, OPTIONAL) setting `noNewPrivileges` to true prevents the processes in the container from gaining additional privileges.
-As an example, the ['no_new_privs'][no-new-privs] article in the kernel documentation has information on how this is achieved using a prctl system call on Linux.
 
 For Linux-based systems the process structure supports the following process-specific fields.
 
 * **`apparmorProfile`** (string, OPTIONAL) specifies the name of the AppArmor profile to be applied to processes in the container.
 For more information about AppArmor, see [AppArmor documentation][apparmor].
+* **`capabilities`** (object, OPTIONAL) is an object containing arrays that specifies the sets of capabilities for the process.
+Valid values are defined in the [capabilities(7)][capabilities.7] man page, such as `CAP_CHOWN`.
+Any value which cannot be mapped to a relevant kernel interface MUST cause an error.
+`capabilities` contains the following properties:
+
+* **`effective`** (array of strings, OPTIONAL) the `effective` field is an array of effective capabilities that are kept for the process.
+* **`bounding`** (array of strings, OPTIONAL) the `bounding` field is an array of bounding capabilities that are kept for the process.
+* **`inheritable`** (array of strings, OPTIONAL) the `inheritable` field is an array of inheritable capabilities that are kept for the process.
+* **`permitted`** (array of strings, OPTIONAL) the `permitted` field is an array of permitted capabilities that are kept for the process.
+* **`ambient`** (array of strings, OPTIONAL) the `ambient` field is an array of ambient capabilities that are kept for the process.
+* **`noNewPrivileges`** (bool, OPTIONAL) setting `noNewPrivileges` to true prevents the process from gaining additional privileges.
+As an example, the [`no_new_privs`][no-new-privs] article in the kernel documentation has information on how this is achieved using a `prctl` system call on Linux.
 * **`oomScoreAdj`** *(int, OPTIONAL)* adjusts the oom-killer score in `[pid]/oom_score_adj` for the container process's `[pid]` in a [proc pseudo-filesystem][procfs].
 If `oomScoreAdj` is set, the runtime MUST set `oom_score_adj` to the given value.
 If `oomScoreAdj` is not set, the runtime MUST NOT change the value of `oom_score_adj`.
 
 This is a per-process setting, where as [`disableOOMKiller`](config-linux.md#disable-out-of-memory-killer) is scoped for a memory cgroup.
 For more information on how these two settings work together, see [the memory cgroup documentation section 10. OOM Contol][cgroup-v1-memory_2].
+* **`rlimits`** (array of objects, OPTIONAL) allows setting resource limits for the process.
+Each entry has the following structure:
+
+* **`type`** (string, REQUIRED) the platform resource being limited as defined in the [`setrlimit(2)`][setrlimit.2] man page.
+* **`soft`** (uint64, REQUIRED) the value of the limit enforced for the corresponding resource.
+* **`hard`** (uint64, REQUIRED) the ceiling for the soft limit that could be set by an unprivileged process.
+Only a privileged process (e.g. one with the `CAP_SYS_RESOURCE` capability) can raise a hard limit.
+
+If `rlimits` contains duplicated entries with same `type`, the runtime MUST error out.
 * **`selinuxLabel`** (string, OPTIONAL) specifies the SELinux label to be applied to the processes in the container.
 For more information about SELinux, see [SELinux documentation][selinux].
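
As an illustration of the `rlimits` wording in the diff above ("If `rlimits` contains duplicated entries with same `type`, the runtime MUST error out"), here is a minimal Go sketch of how a runtime might enforce it. The `Rlimit` type and the `checkRlimits` and `parseRlimits` helpers are hypothetical names chosen for this example; they are not part of this commit and are not claimed to match any existing runtime's code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Rlimit mirrors the `type`, `soft`, and `hard` properties described in the
// diff. The Go type and field names are assumptions made for illustration.
type Rlimit struct {
	Type string `json:"type"`
	Soft uint64 `json:"soft"`
	Hard uint64 `json:"hard"`
}

// checkRlimits returns an error when two entries share the same `type`.
func checkRlimits(rlimits []Rlimit) error {
	seen := make(map[string]bool, len(rlimits))
	for _, r := range rlimits {
		if seen[r.Type] {
			return fmt.Errorf("rlimits: duplicated entry for type %q", r.Type)
		}
		seen[r.Type] = true
	}
	return nil
}

// parseRlimits decodes a JSON array of rlimit entries and applies the
// duplicate check before returning them.
func parseRlimits(data []byte) ([]Rlimit, error) {
	var rlimits []Rlimit
	if err := json.Unmarshal(data, &rlimits); err != nil {
		return nil, err
	}
	if err := checkRlimits(rlimits); err != nil {
		return nil, err
	}
	return rlimits, nil
}

func main() {
	ok := []byte(`[{"type": "RLIMIT_NOFILE", "soft": 1024, "hard": 4096}]`)
	dup := []byte(`[
		{"type": "RLIMIT_NOFILE", "soft": 1024, "hard": 4096},
		{"type": "RLIMIT_NOFILE", "soft": 2048, "hard": 4096}
	]`)

	if _, err := parseRlimits(ok); err != nil {
		fmt.Println("unexpected error:", err)
	}
	if _, err := parseRlimits(dup); err != nil {
		fmt.Println("rejected:", err)
	}
}
```

The sketch only covers the duplicate-`type` rule. It does not check that each `type` names a resource from `setrlimit(2)`, which a real runtime would also need to do on Linux.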