This repository was archived by the owner on Oct 22, 2024. It is now read-only.

Commit f5e4157

README.md: describe features of raw block volumes, update example
Our documentation did not specify how exactly PMEM-CSI provides raw block volumes and thus what users can expect from them. Actually using them also isn't trivial. This now gets explained in the documentation and in the updated example. Ubuntu is used as base image because the "mknod" in busybox does not accept hex device numbers as printed by "stat" and because mkfs.ext4 is in the image. With Clear Linux, mkfs.ext4 would have to be installed from a rather large bundle.
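The commit sidesteps the busybox `mknod` limitation by switching to Ubuntu. As an alternative that the commit does not use, the hex numbers printed by `stat` could be converted to decimal before they reach `mknod`. The sketch below assumes GNU `stat` output (`%t` and `%T` print hex without a `0x` prefix); the helper name `mknod_cmd_decimal` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of the commit: print a mknod command
# that uses decimal device numbers.
# Usage: mknod_cmd_decimal PATH HEX_MAJOR HEX_MINOR
mknod_cmd_decimal() {
    path=$1
    # Shell arithmetic accepts 0x-prefixed hex constants, so prepend
    # the prefix that stat's %t/%T output lacks.
    major=$((0x$2))
    minor=$((0x$3))
    printf 'mknod %s b %d %d\n' "$path" "$major" "$minor"
}
```

With GNU `stat`, a call such as `mknod_cmd_decimal /dev-xpmem $(stat -c '%t %T' /dev-xpmem)` would feed the device numbers in directly.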
1 parent 5834f75 commit f5e4157

2 files changed: +50 -8 lines changed

README.md

Lines changed: 22 additions & 2 deletions
@@ -775,13 +775,33 @@ parameters:
 
 #### Note about raw block volumes
 
-Applications can use volumes provisioned by PMEM-CSI as
-[raw block devices](https://kubernetes.io/blog/2019/03/07/raw-block-volume-support-to-beta/).
+Applications can use volumes provisioned by PMEM-CSI as [raw block
+devices](https://kubernetes.io/blog/2019/03/07/raw-block-volume-support-to-beta/). Such
+volumes use the same "fsdax" namespace mode as filesystem volumes
+and therefore are block devices. That mode only supports dax (=
+`mmap(MAP_SYNC)`) through a filesystem. Pages mapped on the raw block
+device go through the Linux page cache. Applications have to format
+and mount the raw block volume themselves if they want dax. The
+advantage then is that they have full control over that part.
+
+Volumes with "devdax" namespace mode are not supported. That mode
+results in a character device, something that Kubernetes does not
+expect from a storage driver and that does not work because Kubernetes
+internally tries to [bind the device to a loop
+device](https://github.com/kubernetes/kubernetes/blob/7c87b5fb55ca096c007c8739d4657a5a4e29fb09/pkg/volume/util/util.go#L531-L534).
+
 For provisioning a PMEM volume as raw block device, one has to create a
 `PersistentVolumeClaim` with `volumeMode: Block`. See example [PVC](
 deploy/common/pmem-pvc-block-volume.yaml) and
 [application](deploy/common/pmem-app-block-volume.yaml) for usage reference.
 
+That example demonstrates how to handle some details:
+- `mkfs.ext4` needs `-b 4096` to produce volumes that support dax;
+  without it, the automatic block size detection may end up choosing
+  an unsuitable value depending on the volume size.
+- [Kubernetes bug #85624](https://github.com/kubernetes/kubernetes/issues/85624)
+  must be worked around to format and mount the raw block device.
+
 <!-- FILL TEMPLATE:
 
 ### How to extend the plugin
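The README text above points to `deploy/common/pmem-pvc-block-volume.yaml` without showing it. A claim of that kind might look roughly like the manifest written by the sketch below. Only the claim name and `volumeMode: Block` come from this commit; the storage class name, the size, and the access mode are illustrative assumptions, and the real file may differ.

```shell
#!/bin/sh
# Write a sketch PVC manifest to a file. PVC_FILE is a hypothetical
# output path chosen for this example.
PVC_FILE=${PVC_FILE:-${TMPDIR:-/tmp}/pmem-pvc-block-volume.yaml}

cat > "$PVC_FILE" <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # Matches the claimName used by the example application.
  name: pmem-csi-pvc-block-volume
spec:
  accessModes:
  - ReadWriteOnce               # assumption: access mode for illustration
  volumeMode: Block             # request a raw block device, not a filesystem
  resources:
    requests:
      storage: 4Gi              # assumption: size chosen for illustration
  storageClassName: pmem-csi-sc # assumption: hypothetical storage class name
EOF

# To create the claim on a cluster with PMEM-CSI installed:
#   kubectl apply -f "$PVC_FILE"
```

A pod then consumes the claim through `volumeDevices` rather than `volumeMounts`, as the init container in the application example does.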

deploy/common/pmem-app-block-volume.yaml

Lines changed: 28 additions & 6 deletions
@@ -3,16 +3,38 @@ apiVersion: v1
 metadata:
   name: my-csi-app
 spec:
+  initContainers:
+    # This init container is a workaround for https://github.com/kubernetes/kubernetes/issues/85624.
+    - name: store-device
+      image: ubuntu
+      command:
+        - "sh"
+        - "-c"
+        - "(echo '#!/bin/sh' && stat -c 'mknod /dev-xpmem b 0x%t 0x%T' /dev-xpmem) >/data/create-dev.sh && chmod a+x /data/create-dev.sh"
+      volumeMounts:
+        - name: data
+          mountPath: /data
+      volumeDevices:
+        - name: my-csi-device
+          devicePath: /dev-xpmem
   containers:
     - name: my-frontend
-      image: busybox
-      command: [ "sleep", "100000" ]
-      volumeDevices:
-        - devicePath: "/dev/xpmem"
-          name: my-csi-volume
+      image: ubuntu
+      securityContext:
+        privileged: True
+      command:
+        - "sh"
+        - "-c"
+        # mkfs.ext4 may fail here if the volume was already formatted before, so we ignore the return code.
+        - "if [ ! -e /dev-xpmem ]; then /data/create-dev.sh; fi && mkfs.ext4 -b 4096 /dev-xpmem; mkdir -p /mnt && mount -odax /dev-xpmem /mnt && mount | grep /mnt | grep dax && sleep 100000"
+      volumeMounts:
+        - name: data
+          mountPath: /data
   nodeSelector:
     storage: pmem
   volumes:
-    - name: my-csi-volume
+    - name: my-csi-device
       persistentVolumeClaim:
         claimName: pmem-csi-pvc-block-volume
+    - name: data
+      emptyDir:
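The one-line command of the `my-frontend` container packs several steps together. Purely as a reading aid, the same sequence can be unrolled into a commented function. The function name `prepare_and_mount` and the `DEV`, `MNT`, and `CREATE_DEV` variables are hypothetical; the paths and commands mirror the manifest. Running it for real needs a privileged container with the PMEM block device available.

```shell
#!/bin/sh
# Unrolled sketch of the my-frontend command; names are illustrative.
DEV=${DEV:-/dev-xpmem}                    # device node path used in the manifest
MNT=${MNT:-/mnt}                          # mount point used in the manifest
CREATE_DEV=${CREATE_DEV:-/data/create-dev.sh}  # script written by the init container

prepare_and_mount() {
    # Workaround for Kubernetes issue #85624: if the device node is
    # missing, recreate it with the mknod script stored by the init
    # container.
    if [ ! -e "$DEV" ]; then
        "$CREATE_DEV" || return 1
    fi

    # -b 4096 forces a block size that supports dax. mkfs.ext4 may fail
    # if the volume was formatted before, so its result is ignored.
    mkfs.ext4 -b 4096 "$DEV" || true

    mkdir -p "$MNT" || return 1
    mount -o dax "$DEV" "$MNT" || return 1

    # Succeed only if the mount table lists the mount point with dax.
    mount | grep "$MNT" | grep -q dax
}
```

In the container, `prepare_and_mount && sleep 100000` would reproduce the original behavior, keeping the pod alive only when the dax mount succeeded.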