feat: add nvidia-container-runtime
Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
Signed-off-by: Noel Georgi <git@frezbo.dev>
andrewrynhard authored and frezbo committed Mar 25, 2022
1 parent 495cabb commit 215aa82
Showing 26 changed files with 707 additions and 22 deletions.
15 changes: 3 additions & 12 deletions Makefile
@@ -1,5 +1,5 @@
REGISTRY ?= ghcr.io
-USERNAME ?= talos-systems
+USERNAME ?= siderolabs
SHA ?= $(shell git describe --match=none --always --abbrev=8 --dirty)
TAG ?= $(shell git describe --tag --always --dirty)
BRANCH ?= $(shell git rev-parse --abbrev-ref HEAD)
@@ -20,7 +20,7 @@ empty :=
space = $(empty) $(empty)

TARGETS = amd-ucode bnx2-bnx2x gvisor hello-world-service intel-ucode
-NONFREE_TARGETS =
+NONFREE_TARGETS = nvidia-container-toolkit

all: $(TARGETS) ## Builds all known pkgs.

@@ -51,15 +51,6 @@ $(TARGETS) $(NONFREE_TARGETS):
deps.png:
	bldr graph | dot -Tpng > deps.png

-kernel-%: ## Updates the kernel configs: e.g. make kernel-olddefconfig; make kernel-menuconfig; etc.
-	for platform in $(subst $(,),$(space),$(PLATFORM)); do \
-		arch=`basename $$platform` ; \
-		$(MAKE) docker-kernel-prepare PLATFORM=$$platform TARGET_ARGS="--tag=$(REGISTRY)/$(USERNAME)/kernel:$(TAG)-$$arch --load"; \
-		docker run --rm -it --entrypoint=/toolchain/bin/bash -e PATH=/toolchain/bin:/bin -w /src -v $$PWD/kernel/build/config-$$arch:/host/.hostconfig $(REGISTRY)/$(USERNAME)/kernel:$(TAG)-$$arch -c 'cp /host/.hostconfig .config && make $* && cp .config /host/.hostconfig'; \
-	done
-
-# Utilities
-
.PHONY: conformance
conformance: ## Performs policy checks against the commit and source code.
-	docker run --rm -it -v $(PWD):/src -w /src ghcr.io/talos-systems/conform:v0.1.0-alpha.22 enforce
+	docker run --rm -it -v $(PWD):/src -w /src ghcr.io/siderolabs/conform:latest enforce
10 changes: 6 additions & 4 deletions Pkgfile
@@ -1,10 +1,12 @@
-# syntax = ghcr.io/talos-systems/bldr:v0.2.0-alpha.6-frontend
+# syntax = ghcr.io/siderolabs/bldr:v0.2.0-alpha.7-1-g9d49478-frontend

format: v1alpha2

vars:
-  TOOLS_IMAGE: ghcr.io/talos-systems/tools:v0.10.0-alpha.0-5-g8197edb
-  LINUX_FIRMWARE_IMAGE: ghcr.io/talos-systems/linux-firmware:v0.9.0-2-g447ce75
+  TOOLS_IMAGE: ghcr.io/siderolabs/tools:v1.1.0-alpha.0-2-gbfc99ca
+  LINUX_FIRMWARE_IMAGE: ghcr.io/siderolabs/linux-firmware:v1.0.0-5-g615d1a0
+  NVIDIA_DRIVER_VERSION_MAJOR: 510
+  NVIDIA_DRIVER_VERSION_MINOR: 54

labels:
-  org.opencontainers.image.source: https://github.com/talos-systems/extensions
+  org.opencontainers.image.source: https://github.com/siderolabs/extensions
4 changes: 2 additions & 2 deletions README.md
@@ -12,7 +12,7 @@ The image is composed of a `manifest.yaml` file that provides information and co

## Building Extensions

-In the current form, building extensions requires the use of our [bldr](https://github.com/talos-systems/bldr) tool.
+In the current form, building extensions requires the use of our [bldr](https://github.com/siderolabs/bldr) tool.
It is highly recommended to take a look at an existing extensions as a template for building your own.
The rough flow should look like the following:

@@ -44,7 +44,7 @@ metadata:
### Creating `pkg.yaml`

Creating a `pkg.yaml` file is the normal process from bldr.
-See instructions [here](https://github.com/talos-systems/bldr#pkgyaml) for details and examples on this format.
+See instructions [here](https://github.com/siderolabs/bldr#pkgyaml) for details and examples on this format.
Using other existing extensions in this repo for tips is also highly recommended.
One important note is that the final directory tree of the generated package should look like this example from the `gvisor` package:

2 changes: 1 addition & 1 deletion container-runtime/gvisor/README.md
@@ -8,7 +8,7 @@ Enable the extension in the machine configuration before installing Talos:
machine:
  install:
    extensions:
-      - image: ghcr.io/talos-systems/gvisor:<VERSION>
+      - image: ghcr.io/siderolabs/gvisor:<VERSION>
```
gVisor requires unprivileged user namespace creation, so Talos default setting
2 changes: 1 addition & 1 deletion container-runtime/gvisor/runsc.toml
@@ -1,3 +1,3 @@
[runsc_config]
-# See https://github.com/talos-systems/extensions/issues/4
+# See https://github.com/siderolabs/extensions/issues/4
ignore-cgroups = "true"
2 changes: 1 addition & 1 deletion examples/hello-world-service/README.md
@@ -10,7 +10,7 @@ Enable the extension in the machine configuration before installing Talos:
machine:
  install:
    extensions:
-      - image: ghcr.io/talos-systems/hello-world-service:<VERSION>
+      - image: ghcr.io/siderolabs/hello-world-service:<VERSION>
```
Once this example extension is installed, it will provide simple HTTP server which responds with a message on port 80:
2 changes: 1 addition & 1 deletion examples/hello-world-service/src/go.mod
@@ -1,3 +1,3 @@
-module github.com/talos-systems/hello-world
+module github.com/siderolabs/hello-world

go 1.17
48 changes: 48 additions & 0 deletions nvidia-container-toolkit/DEVELOPMENT.md
@@ -0,0 +1,48 @@
# development

This document is intended as a guide to updating the `nvidia-container-toolkit` dependencies.

## Components

### [nvidia-container-cli](./nvidia-container-cli/)

`nvidia-container-cli` is called by the `nvidia-container-runtime` to set up the required NVIDIA library mounts and NVIDIA device files for a workload container.

### [nvidia-container-runtime](./nvidia-container-runtime/)

`nvidia-container-runtime` is the runtime used by `containerd` to run workload containers. It is mostly a wrapper around `runc`.

It also ships a tool called `nvidia-container-runtime-hook`, which is used to set up OCI hooks. It is a symlink to `nvidia-container-toolkit`, which eventually calls `nvidia-container-cli`.

### [nvidia-device-create](./nvidia-device-create/)

This is used to create the required NVIDIA device files under `/dev`. This required udev rules.

### [glibc](./glibc/)

`nvidia-container-cli` is fully dependent on `glibc` to be able to access the NVIDIA shared objects.

## Updating the nvidia driver version

- Update the driver version in `pkgs` repo [here](https://github.com/siderolabs/pkgs/blob/master/nonfree/kmod-nvidia/pkg.yaml)
- Update the driver version [here](../Pkgfile)

## Updating the nvidia-container-toolkit version

- Update the `libnvidia-container` version [here](./nvidia-container-cli/pkg.yaml)
- Update the `container-toolkit` version [here](./nvidia-container-runtime/pkg.yaml)

Make sure to also update `nvidia-device-create` [here](./nvidia-device-create/pkg.yaml).

### Patches

- [nvidia-container-cli](./nvidia-container-cli/patches/libnvidia-container/)
  - `common.h.patch` - use custom glibc interpreter path
  - `Makefile.patch` - build statically linked with `libcap` and `libseccomp`
  - `nvc_ldcache.c.patch` - use the standard `ld.so.cache` path inside the container
- [container-runtime](./nvidia-container-runtime/patches/nvidia-container-runtime/)
  - `main.go.patch` - use custom path for the nvidia-container-runtime config
- [container-runtime](./nvidia-container-runtime/patches/nvidia-container-toolkit/)
  - `hook_config.go.patch` - use custom path for the nvidia-container-runtime config
- [nvidia-device-create](./nvidia-device-create/patches/nvidia-graphics-drivers-build/)
  - `Makefile.patch` - build statically linked with `libpciaccess`
83 changes: 83 additions & 0 deletions nvidia-container-toolkit/README.md
@@ -0,0 +1,83 @@
# NVIDIA Container toolkit extension

## Usage

Enable the extension in the machine configuration before installing Talos:

```yaml
machine:
  install:
    extensions:
      - image: ghcr.io/siderolabs/nvidia-container-toolkit:<VERSION>
```
The following NVIDIA kernel modules need to be loaded, so add this to the Talos machine configuration:
```yaml
machine:
  kernel:
    modules:
      - name: nvidia
      - name: nvidia_uvm
      - name: nvidia_drm
      - name: nvidia_modeset
```
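
One way to confirm that the modules were actually loaded is to compare the node's `/proc/modules` against the four names in the configuration above. The following is only a sketch: `missing_nvidia_modules` is a hypothetical helper, not part of this extension, and the usage line assumes `talosctl` is already configured for the node, with `NODE_IP` as a placeholder.

```bash
# Hypothetical helper (not shipped with the extension): reads /proc/modules-style
# text on stdin and prints each required module that is not listed there.
missing_nvidia_modules() {
  # The first column of /proc/modules is the module name.
  loaded=$(awk '{print $1}')
  for mod in nvidia nvidia_uvm nvidia_drm nvidia_modeset; do
    if ! printf '%s\n' "$loaded" | grep -qx "$mod"; then
      printf '%s\n' "$mod"
    fi
  done
}

# Example usage (NODE_IP is a placeholder for the node address):
#   talosctl -n "$NODE_IP" read /proc/modules | missing_nvidia_modules
```

Empty output means all four modules are present.
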
`nvidia-container-cli` loads BPF programs and requires a relaxed KSPP setting for [bpf_jit_harden](https://sysctl-explorer.net/net/core/bpf_jit_harden/), so the Talos default setting should be overridden:

```yaml
machine:
  sysctls:
    net.core.bpf_jit_harden: 1
```
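
For reference, the kernel documents three values for `net.core.bpf_jit_harden`: `0` disables JIT hardening, `1` enables it for unprivileged users only, and `2` enables it for all users (the value KSPP recommends). The sketch below maps a value to that meaning. `describe_bpf_jit_harden` is a hypothetical helper name, and the usage line assumes a configured `talosctl` with `NODE_IP` as a placeholder.

```bash
# Hypothetical helper: prints what a given net.core.bpf_jit_harden value means.
describe_bpf_jit_harden() {
  case "$1" in
    0) echo "JIT hardening disabled" ;;
    1) echo "JIT hardening for unprivileged users only" ;;
    2) echo "JIT hardening for all users" ;;
    *) echo "unknown value: $1"; return 1 ;;
  esac
}

# Example usage (NODE_IP is a placeholder for the node address):
#   describe_bpf_jit_harden "$(talosctl -n "$NODE_IP" read /proc/sys/net/core/bpf_jit_harden)"
```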

> Warning! This overrides a [KSPP best practices](https://kernsec.org/wiki/index.php/Kernel_Self_Protection_Project/Recommended_Settings#sysctls) setting.

## Testing

Apply the following manifest to create a runtime class that uses the extension:

```yaml
---
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: nvidia
handler: nvidia
```

Install the NVIDIA device plugin:

```bash
helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
helm repo update
helm install nvidia-device-plugin nvdp/nvidia-device-plugin --version=0.11.0 --set=runtimeClassName=nvidia
```

Apply the following manifest to run a CUDA pod with the `nvidia` runtime class:

```yaml
---
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vector-add
spec:
  restartPolicy: OnFailure
  runtimeClassName: nvidia
  containers:
    - name: cuda-vector-add
      image: "quay.io/giantswarm/nvidia-gpu-demo:latest"
      resources:
        limits:
          nvidia.com/gpu: 1
```

The pod should run to completion:

```bash
❯ kubectl get pods
NAME              READY   STATUS      RESTARTS   AGE
cuda-vector-add   0/1     Completed   0          17m
```
6 changes: 6 additions & 0 deletions nvidia-container-toolkit/glibc/ld.so.conf
@@ -0,0 +1,6 @@
# libc default configuration
/usr/local/lib

/usr/local/glibc/lib
/usr/lib
/lib
63 changes: 63 additions & 0 deletions nvidia-container-toolkit/glibc/pkg.yaml
@@ -0,0 +1,63 @@
name: glibc
variant: scratch
shell: /bin/bash
dependencies:
  - image: ubuntu:22.04
steps:
  - sources:
      - url: https://ftpmirror.gnu.org/libc/glibc-2.35.tar.gz
        destination: glibc.tar.gz
        sha256: 3e8e0c6195da8dfbd31d77c56fb8d99576fb855fafd47a9e0a895e51fd5942d4
        sha512: 45bf782aeda508e17fd51b45cf5ad96bd1067cf96b758b5c2d5def681af713df15e75c253d9c85de047f0a1dd22cf4f2239d70ae392cdb9291092e6570734d43
    env:
      DEBIAN_FRONTEND: noninteractive
    prepare:
      - |
        apt update && \
        apt install -y \
          bison \
          build-essential \
          gawk \
          gettext \
          openssl \
          python3 \
          texinfo
      - |
        mkdir -p glibc glibc-build
        tar -xzf glibc.tar.gz --strip-components=1 -C glibc
    build:
      - |
        # unset the variables bldr sets by default
        unset CXXFLAGS
        unset LDFLAGS
        unset CFLAGS
        unset TARGET
        unset HOST
        cd glibc-build
        ../glibc/configure \
          --prefix=/usr/local/glibc \
          --libdir=/usr/local/glibc/lib \
          --libexecdir=/usr/local/glibc/lib \
          --enable-stack-protector=strong
        make -j $(nproc)
    install:
      - |
        mkdir -p /rootfs
        cd glibc-build
        make install DESTDIR=/rootfs
        cp /pkg/ld.so.conf /rootfs/usr/local/glibc/etc/ld.so.conf
        # cleanup include, var and share
        rm -rf /rootfs/usr/local/glibc/include
        rm -rf /rootfs/usr/local/glibc/share
        rm -rf /rootfs/usr/local/glibc/var
finalize:
  - from: /rootfs
    to: /rootfs

11 changes: 11 additions & 0 deletions nvidia-container-toolkit/manifest.yaml
@@ -0,0 +1,11 @@
version: v1alpha1
metadata:
  name: nvidia-container-toolkit
  # the first part is the driver version and the second the container-toolkit version
  version: 510.54-v1.9.0
  author: Andrew Rynhard
  description: |
    This system extension provides nvidia runtime and its dependencies using NVIDIA's runtime handler.
  compatibility:
    talos:
      version: "> v0.15.0-alpha.0"
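
The comment in the manifest describes the extension version as `<driver version>-<container-toolkit version>`, where the driver part corresponds to `NVIDIA_DRIVER_VERSION_MAJOR` and `NVIDIA_DRIVER_VERSION_MINOR` in the `Pkgfile`. As an illustration only, the sketch below splits such a string with plain shell parameter expansion. `split_extension_version` is a hypothetical helper, not something the build uses.

```bash
# Hypothetical helper: splits "<major>.<minor>-<toolkit>" into its parts.
split_extension_version() {
  version=$1
  driver=${version%%-*}   # text before the first "-", e.g. 510.54
  toolkit=${version#*-}   # text after the first "-", e.g. v1.9.0
  major=${driver%%.*}     # text before the first "." of the driver part
  minor=${driver#*.}      # text after the first "." of the driver part
  printf 'driver_major=%s driver_minor=%s toolkit=%s\n' "$major" "$minor" "$toolkit"
}

# Example usage:
#   split_extension_version 510.54-v1.9.0
```
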
@@ -0,0 +1,13 @@
diff --git Makefile Makefile
index 6fb6976..c7b9ffa 100644
--- Makefile
+++ Makefile
@@ -184,7 +184,7 @@ LIB_LDLIBS = $(LIB_LDLIBS_STATIC) $(LIB_LDLIBS_SHARED)
BIN_CPPFLAGS = -include $(BUILD_DEFS) $(CPPFLAGS)
BIN_CFLAGS = -I$(SRCS_DIR) -fPIE -flto $(CFLAGS)
BIN_LDFLAGS = -L. -pie $(LDFLAGS) -Wl,-rpath='$$ORIGIN/../$$LIB'
-BIN_LDLIBS = -l:$(LIB_SHARED) -ldl -lcap $(LDLIBS)
+BIN_LDLIBS = -l:$(LIB_STATIC) -ldl -l:libcap.a -l:libseccomp.a $(LDLIBS)

$(word 1,$(LIB_RPC_SRCS)): RPCGENFLAGS=-h
$(word 2,$(LIB_RPC_SRCS)): RPCGENFLAGS=-c
@@ -0,0 +1,22 @@
diff --git src/common.h src/common.h
index c91d349..461b2a5 100644
--- src/common.h
+++ src/common.h
@@ -24,7 +24,7 @@
#define LDCONFIG_PATH "/sbin/ldconfig"
#define LDCONFIG_ALT_PATH "/sbin/ldconfig.real"

-#define LIB_DIR "/lib64"
+#define LIB_DIR "/usr/local/glibc/lib"
#define USR_BIN_DIR "/usr/bin"
#define USR_LIB_DIR "/usr/lib64"
#define USR_LIB32_DIR "/usr/lib32"
@@ -33,7 +33,7 @@
#if defined(__x86_64__)
# define LIB_ARCH LD_X8664_LIB64
# define LIB32_ARCH LD_I386_LIB32
-# define USR_LIB_MULTIARCH_DIR "/usr/lib/x86_64-linux-gnu"
+# define USR_LIB_MULTIARCH_DIR "/usr/local/lib"
# define USR_LIB32_MULTIARCH_DIR "/usr/lib/i386-linux-gnu"
# if !defined(__NR_execveat)
# define __NR_execveat 322
@@ -0,0 +1,13 @@
diff --git src/nvc_ldcache.c src/nvc_ldcache.c
index d73d0f1..c28e982 100644
--- src/nvc_ldcache.c
+++ src/nvc_ldcache.c
@@ -349,7 +349,7 @@ nvc_ldcache_update(struct nvc_context *ctx, const struct nvc_container *cnt)
if (validate_args(ctx, cnt != NULL) < 0)
return (-1);

- argv = (char * []){cnt->cfg.ldconfig, cnt->cfg.libs_dir, cnt->cfg.libs32_dir, NULL};
+ argv = (char * []){cnt->cfg.ldconfig, cnt->cfg.libs_dir, cnt->cfg.libs32_dir, "-C", "/etc/ld.so.cache", NULL};
if (*argv[0] == '@') {
/*
* We treat this path specially to be relative to the host filesystem.