Proposal: Intel RDT/CAT support for OCI/runc and Docker #433
Comments
Sure, we can add the support to runc once the spec PR is merged.
+1
This PR fixes issue opencontainers#433.

About Intel RDT/CAT feature:
Intel platforms with new Xeon CPUs support Resource Director Technology (RDT). Intel Cache Allocation Technology (CAT) is a sub-feature of RDT. Currently the L3 cache is the only resource supported in RDT.

This feature provides a way for software to restrict cache allocation to a defined 'subset' of the L3 cache, which may overlap with other 'subsets'. The different subsets are identified by class of service (CLOS), and each CLOS has a capacity bitmask (CBM).

More information can be found in section 17.16 of the Intel Software Developer Manual.

About intel_rdt cgroup:
Linux kernel 4.6 (or later) will introduce a new cgroup subsystem 'intel_rdt' with kernel config CONFIG_INTEL_RDT.

The 'intel_rdt' cgroup manages L3 cache allocation. It has a file 'l3_cbm' which represents the L3 cache capacity bitmask (CBM). The CBM must have only *contiguous bits set*, and the number of bits set cannot exceed the max bits. The max bits in the CBM varies among supported Intel platforms. The tasks belonging to a cgroup get to fill the part of the L3 cache represented by the CBM. For example, if the max bits in the CBM is 10 and the L3 cache size is 10 MB, each bit represents 1 MB of L3 cache capacity.

The root cgroup always has all the bits set in l3_cbm. Users can create more cgroups with the mkdir syscall. By default, child cgroups inherit the CBM from the parent. Users can change the CBM, specified in hex, for each cgroup.

For more information about the intel_rdt cgroup:
https://lkml.org/lkml/2015/10/2/74

An example:
Root cgroup: intel_rdt.l3_cbm == 0xfffff, the max CBM is 20 bits
L3 cache size: 55 MB
This assigns 11 MB (4 of 20 bits, i.e. 1/5) of the L3 cache to the child group:
$ /bin/echo 0xf > intel_rdt.l3_cbm

Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
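The contiguity rule for the CBM described above can be sketched in a few lines of Python. This is only an illustration, not code from runc or the kernel: the function names `is_contiguous` and `is_valid_cbm` are hypothetical, and treating an empty mask as invalid is an assumption made here.

```python
def is_contiguous(cbm: int) -> bool:
    """Return True if the set bits of cbm form one unbroken run."""
    if cbm <= 0:
        # Assumption: an empty (or negative) mask is treated as invalid.
        return False
    # Divide by the lowest set bit so the run of ones starts at bit 0.
    shifted = cbm // (cbm & -cbm)
    # A run of ones starting at bit 0 equals 2**k - 1, so adding 1
    # gives a power of two that shares no bits with the run.
    return (shifted & (shifted + 1)) == 0


def is_valid_cbm(cbm: int, max_bits: int) -> bool:
    """Contiguous and no wider than the platform's max CBM length."""
    return is_contiguous(cbm) and cbm.bit_length() <= max_bits


if __name__ == "__main__":
    for value in (0xF, 0xF0, 0x5, 0x1FFFFF):
        print(hex(value), is_valid_cbm(value, max_bits=20))
```

With a 20-bit maximum, 0xf and 0xf0 pass, 0x5 fails because its bits are not adjacent, and 0x1fffff fails because it is 21 bits wide.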
This PR fixes issue opencontainers#433.

About Intel RDT/CAT feature:
Intel platforms with new Xeon CPUs support Resource Director Technology (RDT). Intel Cache Allocation Technology (CAT) is a sub-feature of RDT. Currently the L3 cache is the only resource supported in RDT.

This feature provides a way for software to restrict cache allocation to a defined 'subset' of the L3 cache, which may overlap with other 'subsets'. The different subsets are identified by class of service (CLOS), and each CLOS has a capacity bitmask (CBM).

More information can be found in section 17.16 of the Intel Software Developer Manual.

About intel_rdt cgroup:
Linux kernel 4.6 (or later) will introduce a new cgroup subsystem 'intel_rdt' with kernel config CONFIG_INTEL_RDT.

The 'intel_rdt' cgroup manages L3 cache allocation. It has a file 'l3_cbm' which represents the L3 cache capacity bitmask (CBM). The CBM must have only *contiguous bits set*, and the number of bits set cannot exceed the max bits. The max bits in the CBM varies among supported Intel platforms. The tasks belonging to a cgroup get to fill the part of the L3 cache represented by the CBM. For example, if the max bits in the CBM is 10 and the L3 cache size is 10 MB, each bit represents 1 MB of L3 cache capacity.

The root cgroup always has all the bits set in l3_cbm. Users can create more cgroups with the mkdir syscall. By default, child cgroups inherit the CBM from the parent. Users can change the CBM, specified in hex, for each cgroup.

For more information about the intel_rdt cgroup:
https://lkml.org/lkml/2015/12/17/574

An example:
Root cgroup: intel_rdt.l3_cbm == 0xfffff, the max bits of CBM is 20
L3 cache size: 55 MB
This assigns 11 MB (4 of 20 bits, i.e. 1/5) of the L3 cache to the child group:
$ /bin/echo 0xf > intel_rdt.l3_cbm

Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
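The arithmetic behind the 11 MB figure can be checked with a small sketch. It assumes, as the commit message does, that every bit of the CBM stands for an equal slice of the L3 cache; the helper name `cache_share_mb` is hypothetical.

```python
def cache_share_mb(cbm: int, max_bits: int, l3_size_mb: float) -> float:
    """Cache capacity covered by cbm, assuming equal-sized slices per bit."""
    bits_set = bin(cbm).count("1")
    return l3_size_mb * bits_set / max_bits


if __name__ == "__main__":
    # 4 of 20 bits on a 55 MB cache, the example from the commit message.
    print(cache_share_mb(0xF, max_bits=20, l3_size_mb=55))
    # One bit of a 10-bit mask on a 10 MB cache.
    print(cache_share_mb(0x1, max_bits=10, l3_size_mb=10))
```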
@oci maintainers @mrunalp @vishh @hqhq @LK4D4 @crosbymichael @philips @vbatts
@xiaochenshen I won't be there unfortunately (university exams). I will be speaking at ContainerCon Japan (13-15 July 2016) about the rootless container stuff we're doing in runC, so if you're going to that we can meet face-to-face. /cc @opencontainers/runc-maintainers
I'll be there. You're welcome to stop by Huawei's booth; you'll probably find me there and we can have a talk.
@hqhq Thanks, see you at DockerCon.
@cyphar Thanks. I am interested in rootless containers, but I am not sure whether I can attend ContainerCon Japan then.
I'll post the talk slides and link to the talk recording on the dev@opencontainers.org mailing list once they're up.
@crosbymichael @hqhq Nice meeting you at DockerCon! And thank you for your suggestions. The Intel RDT CAT kernel patch is subject to change to a non-cgroup interface for some reasons. This proposal will be changed accordingly. But I will figure out whether we can still keep the "runtime resource constraints" structure, which is aligned with the OCI runtime-spec.
Add support for Intel Resource Director Technology (RDT) / Cache Allocation Technology (CAT).

Add L3 cache resource constraints to the Linux-specific configuration.

This is a prerequisite of this runc proposal:
opencontainers/runc#433

For more information about Intel RDT/CAT, please refer to:
opencontainers/runc#433

Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
This PR fixes issue opencontainers#433.

About Intel RDT/CAT feature:
Intel platforms with new Xeon CPUs support Resource Director Technology (RDT). Intel Cache Allocation Technology (CAT) is a sub-feature of RDT. Currently the L3 cache is the only resource supported in RDT.

This feature provides a way for software to restrict cache allocation to a defined 'subset' of the L3 cache, which may overlap with other 'subsets'. The different subsets are identified by class of service (CLOS), and each CLOS has a capacity bitmask (CBM).

More information about Intel RDT/CAT can be found in section 17.17 of the Intel Software Developer Manual and the kernel document:
https://lkml.org/lkml/2016/7/12/747

About Intel RDT/CAT kernel interface:
Intel Cache Allocation Technology (CAT) is a sub-feature of Resource Director Technology (RDT), which currently supports L3 cache resource allocation. In the Linux kernel, it is exposed via the "resource control" filesystem, which is a "cgroup-like" interface.

Intel RDT "resource control" filesystem hierarchy:
/sys/fs/rscctrl
|-- cpus
|-- info
|   |-- info
|   |-- l3
|       |-- domain_to_cache_id
|       |-- max_cbm_len
|       |-- max_closid
|-- schemas
|-- tasks
|-- <container_id>
    |-- cpus
    |-- schemas
    |-- tasks

The file `tasks` lists all task ids belonging to the partition "container_id". Task ids are added to or removed from this file as tasks move among partitions. A task id stays in only one directory at a time.

The file `schemas` holds the allocation masks/values for the L3 cache on each socket, given as an L3 cache id and a capacity bitmask (CBM).
Format: "L3:<cache_id0>=<cbm0>;<cache_id1>=<cbm1>;..."
For example, on a two-socket machine, the L3 schema line could be `L3:0=ff;1=c0`, which means the CBM of L3 cache id 0 is 0xff and the CBM of L3 cache id 1 is 0xc0.

A valid L3 cache CBM has only *contiguous bits set*, and the number of bits set cannot exceed the max bits. The max bits in the CBM varies among supported Intel Xeon platforms. In the Intel RDT "resource control" filesystem layout, the CBM in a "partition" should be a subset of the CBM in root. The kernel checks whether it is valid when it is written. For example, 0xfffff in root indicates that the max CBM length is 20 bits, which maps to the entire L3 cache capacity. Some valid CBM values to set in a "partition": 0xf, 0xf0, 0x3ff, 0x1f00, etc.

The file `cpus` holds a cpu mask that specifies the CPUs bound to the schemas. Any tasks scheduled on those cpus will use the schemas.

Compared with cgroups, intelRdt has a similar process management lifecycle and similar interfaces in a container. But unlike the cgroups hierarchy, it has a single-level filesystem layout. When intelRdt is joined, statistics can be collected from the container.

For more information about the Intel RDT/CAT kernel interface:
https://lkml.org/lkml/2016/7/12/764

An example:
There are two L3 caches in the two-socket machine, the default CBM is 0xfffff and the max CBM length is 20 bits. This configuration assigns 4/5 of L3 cache id 0 and the whole of L3 cache id 1 to the container:

"linux": {
    "resources": {
        "intelRdt": {
            "l3CacheSchema": "L3:0=ffff0;1=fffff",
            "L3CacheCpus": "00000000,00000000,00000000,00000000,00000000,00000000"
        }
    }
}

Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
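To make the schema format above concrete, here is a minimal parsing sketch in Python. It handles only the single-line "L3:<cache_id>=<cbm>;..." form quoted in the commit message; the function names `parse_l3_schema` and `share_of_cache` are hypothetical and are not part of runc or the kernel interface.

```python
def parse_l3_schema(line: str) -> dict:
    """Parse 'L3:0=ffff0;1=fffff' into {cache_id: cbm}."""
    prefix, sep, body = line.strip().partition(":")
    if sep != ":" or prefix != "L3":
        raise ValueError(f"not an L3 schema line: {line!r}")
    masks = {}
    for entry in body.split(";"):
        cache_id, eq, cbm = entry.partition("=")
        if eq != "=":
            raise ValueError(f"malformed entry: {entry!r}")
        # CBMs are written in hex without a 0x prefix.
        masks[int(cache_id)] = int(cbm, 16)
    return masks


def share_of_cache(cbm: int, max_cbm_len: int) -> float:
    """Fraction of one L3 cache covered by cbm."""
    return bin(cbm).count("1") / max_cbm_len


if __name__ == "__main__":
    schema = parse_l3_schema("L3:0=ffff0;1=fffff")
    for cache_id, cbm in sorted(schema.items()):
        print(cache_id, hex(cbm), share_of_cache(cbm, max_cbm_len=20))
```

For the example configuration, cache id 0 has 16 of 20 bits set (4/5) and cache id 1 has all 20 bits set.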
This PR fixes issue opencontainers#433.

About Intel RDT/CAT feature:
Intel platforms with new Xeon CPUs support Resource Director Technology (RDT). Intel Cache Allocation Technology (CAT) is a sub-feature of RDT. Currently the L3 cache is the only resource supported in RDT.

This feature provides a way for software to restrict cache allocation to a defined 'subset' of the L3 cache, which may overlap with other 'subsets'. The different subsets are identified by class of service (CLOS), and each CLOS has a capacity bitmask (CBM).

More information about Intel RDT/CAT can be found in section 17.17 of the Intel Software Developer Manual and the kernel document:
https://lkml.org/lkml/2016/7/12/747

About Intel RDT/CAT kernel interface:
In the Linux kernel, the interface is defined and exposed via the "resource control" filesystem, which is a "cgroup-like" interface.

Compared with cgroups, it has a similar process management lifecycle and similar interfaces in a container. But unlike the cgroups hierarchy, it has a single-level filesystem layout.

Intel RDT "resource control" filesystem hierarchy:
mount -t rscctrl rscctrl /sys/fs/rscctrl
tree /sys/fs/rscctrl
/sys/fs/rscctrl
|-- cpus
|-- info
|   |-- info
|   |-- l3
|       |-- domain_to_cache_id
|       |-- max_cbm_len
|       |-- max_closid
|-- schemas
|-- tasks
|-- <container_id>
    |-- cpus
    |-- schemas
    |-- tasks

The file `tasks` lists all task ids belonging to the partition "container_id". Task ids are added to or removed from this file as tasks move among partitions. A task id stays in only one directory at a time.

The file `schemas` holds the allocation bitmasks/values for the L3 cache on each socket, given as an L3 cache id and a capacity bitmask (CBM).
Format: "L3:<cache_id0>=<cbm0>;<cache_id1>=<cbm1>;..."
For example, on a two-socket machine, the L3 schema line could be `L3:0=ff;1=c0`, which means the CBM of L3 cache id 0 is 0xff and the CBM of L3 cache id 1 is 0xc0.

A valid L3 cache CBM has only *contiguous bits set*, and the number of bits set cannot exceed the max bits. The max bits in the CBM varies among supported Intel Xeon platforms. In the Intel RDT "resource control" filesystem layout, the CBM in a "partition" should be a subset of the CBM in root. The kernel checks whether it is valid when it is written. For example, 0xfffff in root indicates that the max CBM length is 20 bits, which maps to the entire L3 cache capacity. Some valid CBM values to set in a "partition": 0xf, 0xf0, 0x3ff, 0x1f00, etc.

The file `cpus` holds a cpu bitmask that specifies the CPUs bound to the schemas. Any tasks scheduled on those cpus will use the schemas.

For more information about the Intel RDT/CAT kernel interface:
https://lkml.org/lkml/2016/7/12/764

An example for runc:
There are two L3 caches in the two-socket machine, the default CBM is 0xfffff and the max CBM length is 20 bits. This configuration assigns 4/5 of L3 cache id 0 and the whole of L3 cache id 1 to the container:

"linux": {
    "resources": {
        "intelRdt": {
            "l3CacheSchema": "L3:0=ffff0;1=fffff",
            "L3CacheCpus": "00000000,00000000,00000000,00000000,00000000,00000000"
        }
    }
}

Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
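The rule that a partition's CBM should be a subset of the root CBM comes down to a single bitwise test. The sketch below only mirrors that rule in Python for illustration; the real check is done by the kernel on write, and the name `fits_in_root` and the rejection of an empty mask are assumptions made here.

```python
def fits_in_root(cbm: int, root_cbm: int) -> bool:
    """True if every bit set in cbm is also set in root_cbm."""
    # Assumption: an empty mask is rejected.
    return cbm != 0 and (cbm & ~root_cbm) == 0


if __name__ == "__main__":
    root = 0xFFFFF  # 20-bit root CBM from the commit message
    for candidate in (0xF, 0xF0, 0x3FF, 0x1F00, 0x1FFFFF):
        print(hex(candidate), fits_in_root(candidate, root))
```

The four values listed in the commit message all fit inside 0xfffff, while 0x1fffff does not because bit 20 lies outside the root mask.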
Add support for Intel Resource Director Technology (RDT) / Cache Allocation Technology (CAT). Add L3 cache resource constraints in Linux-specific configuration. This is the prerequisite of this runc proposal: opencontainers/runc#433 For more information about Intel RDT/CAT, please refer to: opencontainers/runc#433 Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
@cyphar @crosbymichael @hqhq @mrunalp @vishh Design proposal updates (2017-01-18): To address @crosbymichael's and @cyphar's comments #1198 (comment) and #1198 (comment), the design is updated as follows. It adds a new "ResourceManager" structure as the base interface for all resource managers, such as the cgroups manager and the incoming IntelRdt manager. All registered resource managers are consolidated in the linuxContainer structure, so unified operations (e.g., Apply(), Set(), Destroy()) can be applied through all of the registered resource managers. Currently, the cgroups manager is the only resource manager in libcontainer. Linux kernel 4.10 will introduce the Intel RDT/CAT feature; the kernel interface is exposed via the "resource control" filesystem, which is a cgroup-like interface. To support Intel RDT/CAT in libcontainer, we need a new resource manager (IntelRdt manager) outside cgroups. The PRs to implement the design: |
This is really good & interesting work & I'm glad it's happening. This topic is perhaps more appropriate for somewhere like LKML, but I do think it's very scary that RDT is implemented as a parallel resource controller to cgroups. From a total layman's perspective, it seems horrifically sad that it was not implemented within the existing cgroups framing. Quote:
Should be amazingly useful tech to push load sharing to far greater heights, but it really disturbs me a lot that it's an entirely parallel system to what Linux and containers have built themselves upon so far, cgroups. |
@rektide Frankly, most people in the container world (including me) would prefer cgroups to another kernel interface. But as a fait accompli, the CAT kernel patch with the resource control filesystem interface has been merged into the upstream Linux kernel in 4.10. What we are working on in this issue is enabling the Intel RDT/CAT feature in runC based on the new Linux kernel interface. |
This was raised during reviews with folks working on Windows Containers. This squashes commits from PR opencontainers#433 Signed-off-by: Rob Dolin <RobDolin@microsoft.com>
@xiaochenshen |
The cache allocation limit depends on the Intel CPU model. You can get the read-only info from /sys/fs/resctrl/info/L3/num_closids. You can create up to (num_closids - 1) RDT CTRL_MON groups, because one closid is reserved for the root group. For more details, please refer to the Linux kernel Intel RDT documentation: |
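As a rough illustration of the limit described in this comment, the sketch below reads `num_closids` and subtracts the closid reserved for the root group. The function name `max_rdt_groups` and its `info_dir` parameter are hypothetical; the default path is the one quoted above, and the function returns `None` when the file is missing (for example, when resctrl is not mounted).

```python
from pathlib import Path


def max_rdt_groups(info_dir="/sys/fs/resctrl/info/L3"):
    """Return how many non-root groups can be created, or None if unknown.

    `info_dir` is a parameter only so the sketch can be tried against a
    scratch directory; on a real host it would be the resctrl info path.
    """
    path = Path(info_dir) / "num_closids"
    try:
        num_closids = int(path.read_text().strip())
    except (OSError, ValueError):
        # File missing, unreadable, or not a number: report "unknown".
        return None
    # One closid is reserved for the root group.
    return max(num_closids - 1, 0)
```

On a host where the file contains 16, this would report 15 available groups.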
Thanks for your kind replies. |
Thanks for pointing this out. I have thought about this before. I know that sharing an RDT group between containers is a real user scenario. I have added it to my TODO list. I hope we can find a tradeoff solution in the future. |
I think this can be closed since it has been implemented |
Thank you, Michael.
|
Status: Intel RDT/CAT support for OCI and Docker software stack
Intel RDT/CAT support in OCI (merged PRs):
1. Intel RDT/CAT support in OCI/runtime-spec
opencontainers/runtime-spec#630
opencontainers/runtime-spec#787
opencontainers/runtime-spec#889
opencontainers/runtime-spec#988
2. Intel RDT/CAT support in OCI/runc
#1279
#1589
#1590
#1615
#1894
#1913
#1930
#1955
#2042
TODO list - Intel RDT/CAT support in Docker:
3. Intel RDT/CAT support in containerd
4. Intel RDT/CAT support in Docker Engine (moby/moby)
5. Intel RDT/CAT support in Docker CLI
What is Intel RDT and CAT:
Intel Cache Allocation Technology (CAT) is a sub-feature of Resource Director Technology (RDT). Currently L3 Cache is the only resource that is supported in RDT.
Cache Allocation Technology offers the capability of L3 cache Quality of Service (QoS). It provides a way for the Software (OS/VMM/Container) to restrict cache allocation to a defined 'subset' of cache which may be overlapping with other 'subsets'. This feature is used when allocating a line in cache i.e. when pulling new data into the cache. The programming of the h/w is done via PQR MSRs.
The different cache subsets are identified by CLOS (class of service) identifier and each CLOS has a CBM (cache bit mask). The CBM is a contiguous set of bits which defines the amount of cache resource that is available for each 'subset'.
More information can be found in the section 17.17 of Intel Software Developer Manual and Intel RDT Homepage.
Supported Intel Xeon CPU SKUs:
To check if cache allocation was enabled:
$ cat /proc/cpuinfo
Check whether the output has the 'rdt_a' and 'cat_l3' flags.
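The same check can be scripted. The following is a minimal sketch that looks for both flags in the text of /proc/cpuinfo; the helper name `has_cat_l3` is hypothetical, and it only inspects the first `flags` line it finds.

```python
def has_cat_l3(cpuinfo_text):
    """Return True if a 'flags' line lists both 'rdt_a' and 'cat_l3'."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = set(value.split())
            # Both flags must be present for L3 cache allocation.
            return {"rdt_a", "cat_l3"} <= flags
    return False
```

A caller would typically pass in the contents of /proc/cpuinfo, e.g. `has_cat_l3(open("/proc/cpuinfo").read())`.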
Why is Cache Allocation needed:
Cache Allocation Technology is useful in managing large computer server systems with large size L3 cache, in the cloud and container context. Examples may be large servers running instances of webservers or database servers. In such complex systems, these subsets can be used for more careful placing of the available cache resources by a centralized root accessible interface.
The architecture also allows dynamically changing these subsets during runtime to further optimize the performance of the higher priority application with minimal degradation to the low priority app. Additionally, resources can be rebalanced for system throughput benefit.
Use cases for containers:
Figure 1:
Note: Figure 1 is fetched from section 17.17 of Intel Software Developer Manual.
Currently the Last Level Cache (LLC) in Intel Xeon platforms is L3 cache. So LLC == L3 cache here.
Noisy neighbor issue:
A typical use case is solving the noisy neighbor issue in a container environment. For example, a streaming application running in one container may constantly copy data and access a linear space larger than the L3 cache, thereby evicting a large amount of cache that could otherwise have been used by a higher priority computing application running in another container.
Using the cache allocation feature, the 'noisy neighbor' container running the streaming application can be confined to a smaller cache, and the higher priority application can be awarded a larger amount of L3 cache space.
L3 cache QoS:
Another key user scenario is large-scale container clusters. A central scheduler or orchestrator would control resource allocations to a set of containers. Docker and runc can make use of libcontainer to manage resources. They could benefit from the Intel RDT cache allocation feature for new resource constraints. We could define different cache subset strategies by setting different CLOS/CBM values in the containers' runtime configuration. As a result, we could achieve fine-grained L3 cache QoS (quality of service) among containers.
Linux kernel interface for Intel RDT/CAT:
In Linux kernel 4.10 and newer, Intel RDT/CAT is supported with kernel config CONFIG_INTEL_RDT_A. In Linux kernel 5.1 and newer, the kernel config is CONFIG_X86_CPU_RESCTRL.
Originally, the kernel interface for Intel RDT/CAT was the `intel_rdt` cgroup, but the cgroup solution was rejected by the kernel cgroup maintainer for several reasons, such as incompatibility with the cgroup hierarchy and limitations in some corner cases.
Currently, a new kernel interface is defined and exposed via the "resource control" filesystem, which is a "cgroup-like" interface. The new design aligns better with the hardware capabilities provided, and addresses the issues in the cgroup-based interface.
Compared with cgroups, the interface has a similar process management lifecycle and similar interfaces in a container. But unlike the cgroups hierarchy, it has a single-level filesystem layout.
Intel RDT "resource control" filesystem hierarchy:
For runc, we can make use of the `tasks` and `schemata` configuration for L3 cache resource constraints.
The file `tasks` has a list of tasks that belong to this group (e.g., the "<container_id>" group). Tasks can be added to a group by writing the task ID to the "tasks" file (which automatically removes them from the group to which they previously belonged). New tasks created by fork(2) and clone(2) are added to the same group as their parent. If a pid is not in any sub group, it is in the root group.
The file `schemata` has allocation bitmasks/values for the L3 cache on each socket; each entry contains an L3 cache id and a capacity bitmask (CBM).
For example, on a two-socket machine, the L3 schema line could be `L3:0=ff;1=c0`, which means L3 cache id 0's CBM is 0xff and L3 cache id 1's CBM is 0xc0.
A valid L3 cache CBM is a contiguous set of bits, and the number of bits set cannot exceed the maximum CBM length. The maximum CBM length varies among supported Intel Xeon platforms. In the Intel RDT "resource control" filesystem layout, the CBM in a group should be a subset of the CBM in root. The kernel checks whether the value is valid when it is written. E.g., 0xfffff in root indicates that the maximum CBM length is 20 bits, which maps to the entire L3 cache capacity. Some valid CBM values to set in a group: 0xf, 0xf0, 0x3ff, 0x1f00, etc.
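The schema format and the CBM rules above can be sketched in a few lines. This is an illustration only, not the kernel's or runc's validation logic: the names `parse_l3_schema`, `is_contiguous` and `is_valid_cbm` are hypothetical, and the check covers only the two rules stated here (contiguous bits, subset of the root CBM). Platform-specific limits such as a minimum number of bits are not modeled.

```python
def parse_l3_schema(line):
    """Parse an L3 schema line such as 'L3:0=ff;1=c0' into {cache_id: cbm}."""
    resource, sep, body = line.strip().partition(":")
    if resource != "L3" or not sep:
        raise ValueError(f"not an L3 schema line: {line!r}")
    masks = {}
    for entry in body.split(";"):
        cache_id, eq, cbm = entry.partition("=")
        if not eq:
            raise ValueError(f"malformed entry: {entry!r}")
        masks[int(cache_id)] = int(cbm, 16)  # CBMs are written in hex
    return masks


def is_contiguous(cbm):
    """True if cbm is non-zero and its set bits form one unbroken run."""
    if cbm <= 0:
        return False
    # Drop trailing zero bits; a contiguous run then looks like 0b11...1,
    # so adding 1 produces a power of two with no bits in common.
    run = cbm >> ((cbm & -cbm).bit_length() - 1)
    return run & (run + 1) == 0


def is_valid_cbm(cbm, root_cbm):
    """True if cbm is contiguous and every set bit is also set in root_cbm."""
    return is_contiguous(cbm) and (cbm & ~root_cbm) == 0
```

With a root CBM of 0xfffff, the values 0xf, 0xf0, 0x3ff and 0x1f00 listed above all pass this check, while a mask with a hole in it such as 0x5 does not.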
For more information about Intel RDT/CAT kernel interface:
https://www.kernel.org/doc/Documentation/x86/intel_rdt_ui.txt
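To make the `tasks` and `schemata` workflow concrete, here is a minimal sketch of placing one process into a group. It assumes the filesystem is mounted at /sys/fs/resctrl (the path mentioned earlier in this thread) and that the caller has sufficient privileges; the function name `assign_to_group` and its parameters are hypothetical, and no error handling for a rejected schema is shown.

```python
from pathlib import Path


def assign_to_group(group, pid, schema_line, root="/sys/fs/resctrl"):
    """Create a group directory, write its schemata, then add one task.

    `root` is a parameter so the sketch can be tried against a scratch
    directory instead of the real resctrl mount point.
    """
    group_dir = Path(root) / group
    # In resctrl, creating a directory under the mount point creates a group.
    group_dir.mkdir(exist_ok=True)
    # The kernel validates the schema when it is written.
    (group_dir / "schemata").write_text(schema_line + "\n")
    # Writing a task ID moves that task out of its previous group.
    (group_dir / "tasks").write_text(f"{pid}\n")
    return group_dir
```

A container runtime would do the equivalent with the container id as the group name and the container's init process as the pid.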
An example for runc:
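A minimal sketch of such a configuration, generated from per-cache CBMs. The `linux` / `resources` / `intelRdt` / `l3CacheSchema` layout is copied from the commit message quoted earlier in this thread, so the key placement in the merged runtime-spec should be checked before relying on it. The helper names `cbm_fraction` and `intel_rdt_fragment` are hypothetical.

```python
import json


def cbm_fraction(cbm, max_cbm_len):
    """Fraction of one L3 cache covered by cbm, given the CBM length in bits."""
    return bin(cbm).count("1") / max_cbm_len


def intel_rdt_fragment(masks):
    """Build a config fragment from {cache_id: cbm} (layout is an assumption)."""
    schema = "L3:" + ";".join(
        f"{cache_id}={cbm:x}" for cache_id, cbm in sorted(masks.items())
    )
    return {"linux": {"resources": {"intelRdt": {"l3CacheSchema": schema}}}}


# Two-socket machine, 20-bit CBM: 16 of 20 bits on cache 0, all bits on cache 1.
fragment = intel_rdt_fragment({0: 0xFFFF0, 1: 0xFFFFF})
print(json.dumps(fragment, indent=2))
print(cbm_fraction(0xFFFF0, 20))
```

Here cache id 0 gets 16 of its 20 bits, i.e. 4/5 of that cache, and cache id 1 is assigned in full.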
Proposal and design - components:
1. Intel RDT/CAT support in OCI/runtime-spec:
Status: PR opencontainers/runtime-spec#630 has been merged.
This is the prerequisite of this proposal.
`config.json`.
2. Intel RDT/CAT support in OCI/runc
Status: PR #1279 has been merged.
This is the prerequisite of this proposal. It mainly focused on Intel RDT/CAT infrastructure support in runc/libcontainer:
`package intelrdt` as a new infrastructure in libcontainer. It implements `IntelRdtManager` interface to handle intelrdt framework:
`intelRdtManager` in `linuxContainer struct`, and invoke Intel RDT/CAT operations in process management (`initProcess`, `setnsProcess`) functions:
`LinuxFactory` to return containers which could create and manage Intel RDT/CAT L3 cache resources:
TODO list - Intel RDT/CAT support in Docker
3. Intel RDT/CAT support in containerd
4. Intel RDT/CAT support in Docker Engine (moby/moby)
5. Intel RDT/CAT support in Docker CLI
When Intel RDT/CAT is ready in libcontainer, Docker could naturally make use of libcontainer to support L3 cache allocation for container resource management. Some potential work to do in Docker:
`docker run` options to support Intel RDT/CAT.
`docker client/daemon` APIs to support Intel RDT/CAT.
`containerd`.
`docker engine`.
`docker stats`.
TODO list - Intel RDT/CDP support in runc
As a specialized extension of CAT, Code and Data Prioritization (CDP) enables separate control over code and data placement in the L3 cache. Certain specialized types of workloads may benefit with increased runtime determinism, enabling greater predictability in application performance.
The Linux kernel CDP patch is part of CAT patch series. We can also add the functionality in runc.
Obsolete design which based on cgroup interface (for backup only)
The following content is kept only for reference. The original design, based on the kernel `cgroup` interface, is obsolete because the kernel cgroup interface patch was rejected.