Issue with user/group id mapping for projected volumes sysbox 0.6.2 on ubuntu 22.04 #728

@nclaeys

I am running into problems when using OIDC tokens on Azure nodes.
These tokens are mounted into the pod through a projected volume source. The full YAML of the pod is as follows:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    io.kubernetes.cri-o.userns-mode: auto:size=65536
    karpenter.sh/do-not-evict: "true"
    runtime.datafy.cloud/DatafyInstanceLifecycle: on-demand
    runtime.datafy.cloud/DatafyInstanceType: mx.xlarge
  creationTimestamp: "2023-08-21T13:19:55Z"
  generateName: ide-ffffe137-1719-4b93-80d2-d356cd5f522c-
  labels:
    app.kubernetes.io/managed-by: conveyor-operator
    controller-revision-hash: ide-ffffe137-1719-4b93-80d2-d356cd5f522c-7cf75685bd
    ide.datafy.cloud/ideCRD: 0cb70806-7127-475f-b2e2-a625f6f59a9a
    ide.datafy.cloud/statefulSet: ide-ffffe137-1719-4b93-80d2-d356cd5f522c
    launched-by: datafy-ide-operator
    statefulset.kubernetes.io/pod-name: ide-ffffe137-1719-4b93-80d2-d356cd5f522c-0
  name: ide-ffffe137-1719-4b93-80d2-d356cd5f522c-0
  namespace: ncazure
spec:
  automountServiceAccountToken: false
  containers:
  - command:
    - /sbin/init
    - --log-level=err
    env:
    - name: SYSBOX_ALLOW_TRUSTED_XATTR
      value: "FALSE"
    - name: AZURE_CLIENT_ID
      value: e4fe099a-a07e-429f-a985-6cf795d80f00
    - name: AZURE_FEDERATED_TOKEN_FILE
      value: /var/run/secrets/tokens/azure-identity-token
    - name: AZURE_AUTHORITY_HOST
      value: https://login.microsoftonline.com/
    - name: AZURE_TENANT_ID
      value: 55226c2c-0b83-4621-a5cd-e8e0e57ec920
    image: datafydpdevanc.azurecr.io/datafy/data-plane/project/azurepython:snapshot-0cb70806-7127-475f-b2e2-a625f6f59a9a
    imagePullPolicy: Always
    name: ide
    ports:
    - containerPort: 8000
      name: ide-port
      protocol: TCP
    readinessProbe:
      failureThreshold: 6
      httpGet:
        path: /environments/ncazure/ide/0cb70806-7127-475f-b2e2-a625f6f59a9a/healthz
        port: ide-port
        scheme: HTTP
      initialDelaySeconds: 5
      periodSeconds: 5
      successThreshold: 1
      timeoutSeconds: 1
    resources:
      limits:
        cpu: "4"
        memory: 12028755Ki
      requests:
        cpu: 3400m
        memory: 12028755Ki
    securityContext:
      allowPrivilegeEscalation: true
      privileged: false
      runAsGroup: 0
      runAsNonRoot: false
      runAsUser: 0
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/ide-mgr/var_file
      name: configmap
      readOnly: true
      subPath: ide-mgr_var_file
    - mountPath: /etc/code-server/var_file
      name: configmap
      readOnly: true
      subPath: code-server_var_file
    - mountPath: /tmp
      name: tmp
    - mountPath: /var/run/secrets/tokens
      name: azure-identity-token
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: false
  hostname: ide-ffffe137-1719-4b93-80d2-d356cd5f522c-0
  imagePullSecrets:
  - name: environment-ncazure-ecr-credentials
  nodeName: aks-ide4s51266-64819984-vmss000005
  nodeSelector:
    node.kubernetes.io/datafy-instance-class: mx.xlarge
    node.kubernetes.io/lifecycle: on-demand
    sysbox-runc: "true"
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  runtimeClassName: sysbox-runc
  schedulerName: default-scheduler
  securityContext:
    fsGroup: 1000
    runAsGroup: 0
    runAsNonRoot: false
    runAsUser: 0
  serviceAccount: azurepython
  serviceAccountName: azurepython
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoSchedule
    key: datafy.com/sysbox-enabled
    operator: Equal
    value: "true"
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  - effect: NoSchedule
    key: node.kubernetes.io/memory-pressure
    operator: Exists
  - effect: NoSchedule
    key: datafy-instance-class
    operator: Equal
    value: mx.xlarge
  - effect: NoSchedule
    key: node.kubernetes.io/lifecycle
    operator: Equal
    value: on-demand
  volumes:
  - configMap:
      defaultMode: 420
      name: ide-0cb70806-7127-475f-b2e2-a625f6f59a9a
    name: configmap
  - emptyDir: {}
    name: tmp
  - name: azure-identity-token
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          audience: api://AzureADTokenExchange
          expirationSeconds: 3600
          path: azure-identity-token
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-08-21T13:19:55Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2023-08-21T13:20:40Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2023-08-21T13:20:40Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2023-08-21T13:19:55Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: cri-o://fe6e14cdcf26066528ec24d0f723b012118b25f03f4b8f96030c4f080c04995f
    image: datafydpdevanc.azurecr.io/datafy/data-plane/project/azurepython:snapshot-0cb70806-7127-475f-b2e2-a625f6f59a9a
    imageID: datafydpdevanc.azurecr.io/datafy/data-plane/project/azurepython@sha256:1ef4acadb9fa0ed30057f21bcce65cb97339abaf72267e9c4e02d8275a1d2d2d
    lastState: {}
    name: ide
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2023-08-21T13:20:34Z"
  hostIP: 10.0.4.6
  phase: Running
  podIP: 100.96.1.103
  podIPs:
  - ip: 100.96.1.103
  qosClass: Burstable
  startTime: "2023-08-21T13:19:55Z"

When I try to use the Azure tokens to access Azure resources, I get an error saying that I am not allowed to read the token file.
Inspecting the user and group IDs of the respective files, I see the following:

[Screenshot from 2023-08-21 14-44-27]

Here you can see that the volumes mounted from ConfigMaps and emptyDirs work correctly, but the Azure token volume does not. For the token volume the user and group are nobody and nogroup, and I do not understand why this is happening.
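
For anyone who wants to reproduce the check without the screenshot, here is a minimal sketch of how the ownership can be inspected from inside the container. It assumes a Linux container whose `stat` supports `-c` (GNU coreutils or BusyBox); the helper names are hypothetical and not part of Sysbox. As far as I understand, files whose real owner is not mapped into the container's user namespace are reported with the kernel overflow ID (65534 by default), which most distributions name nobody/nogroup.

```shell
#!/bin/sh
# Sketch only: helper names are made up for illustration.

# Print "<uid>:<gid>" of a path as seen from the current user namespace.
numeric_owner() {
    stat -c '%u:%g' "$1"
}

# Succeed if the path's owner appears as the overflow UID, which is how
# an owner that is NOT mapped into this user namespace is displayed.
# Caveat: a file genuinely owned by that UID looks the same.
owner_is_unmapped() {
    overflow_uid=$(cat /proc/sys/kernel/overflowuid 2>/dev/null || echo 65534)
    owner_uid=$(stat -c '%u' "$1")
    [ "$owner_uid" = "$overflow_uid" ]
}

# Print the ID maps of the calling process. Columns are: first ID inside
# the namespace, first ID outside the namespace, length of the range.
show_id_maps() {
    echo "uid_map:"; cat /proc/self/uid_map
    echo "gid_map:"; cat /proc/self/gid_map
}

# Example usage inside the container (path taken from the pod spec above):
#   numeric_owner /var/run/secrets/tokens/azure-identity-token
#   owner_is_unmapped /var/run/secrets/tokens/azure-identity-token && echo "owner not mapped"
#   show_id_maps
```

Comparing the numeric owner of the token file with the ranges printed by `show_id_maps` should show whether the nobody/nogroup ownership comes from a missing ID mapping rather than from the file mode.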
Below are the node logs covering the startup of the ide pod:

fullLogsNodePodStartup.txt

I am using Sysbox 0.6.2 on our Azure AKS cluster, whose nodes run Ubuntu with the following details:

# uname -a
Linux aks-ide4s51266-64819984-vmss000005 5.15.0-1042-azure #49-Ubuntu SMP Tue Jul 11 17:28:46 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04.2 LTS"

Sysbox is installed using the sysbox-deploy-k8s daemonset.

Extra full node logs:
fullNodeLogs.txt

Any tips/suggestions on what is going wrong here?
