[Kubernetes cronjob] pg_isready only works interactively (pod user permissions maybe?) #361

Open
@seano-vs

Description

Summary

TL;DR: pg_isready only seems to work when I run it interactively inside the pod, not when the container runs it as part of the job. This started after I had to manually add the PGSSLMODE=require env variable to get past a /root/.postgresql/postgresql.crt: Permission denied error.
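
A quick way to check whether the certificate path is the real difference between the two contexts is to test its readability from each of them. The sketch below is only an illustration: `check_client_cert` is a hypothetical helper, not part of the image, and it assumes that `$HOME` matches the home directory libpq resolves for the current user. The default path `~/.postgresql/postgresql.crt` and the `PGSSLCERT` override are standard libpq behavior.

```shell
#!/bin/sh
# Hypothetical helper (not part of tiredofit/db-backup): report whether the
# client certificate libpq would try to load is readable by the current user.
check_client_cert() {
  # Assumption: libpq uses PGSSLCERT if set, else ~/.postgresql/postgresql.crt,
  # and $HOME is the home directory libpq resolves for this user.
  cert="${PGSSLCERT:-$HOME/.postgresql/postgresql.crt}"
  if [ -r "$cert" ]; then
    echo "readable: $cert"
  elif [ -e "$cert" ]; then
    echo "exists but unreadable: $cert"
  else
    # [ -e ] is also false when a parent directory cannot be searched.
    echo "not found or parent not searchable: $cert"
  fi
}
```

Calling `check_client_cert` once from a `kubectl exec` shell and once from the job's own command should show whether the two runs see the file differently.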

Steps to reproduce

What I did:

  • Created the k8s job
  • Followed the steps here
  • With only the env variables MODE=MANUAL, MANUAL_RUN_FOREVER=FALSE, CONTAINER_ENABLE_SCHEDULING, and CONTAINER_ENABLE_MONITORING set, the container wouldn't start the backup on its own (it was waiting on user input).
  • Adding /etc/services.available/10-db-backup/run to the "command" field of the container definition resulted in a "no such file or directory" error.
  • Taking inspiration from this comment, I set the command to ['/init', 'backup-now'], which worked.
  • At that point, with all the envs loaded, the connection to my Postgres server failed, with /root/.postgresql/postgresql.crt: Permission denied cited as the cause.
  • I added PGSSLMODE=require to the env list, and the error went away.
  • Now pg_isready never reports the server as ready. To debug, I copied the command from the debug logs and ran it interactively in the pod as pg_isready --host=$DB01_HOST --port=$DB01_PORT --dbname=$DB01_NAME --username=$DB01_USER, and it worked perfectly.

I suspect that this is a permissions issue with how the commands are being executed, but I'm not entirely sure.
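
To narrow down the permissions theory, the same context can be captured in both runs and compared. This is a minimal sketch under stated assumptions: `describe_pg_isready_status` and `dump_pg_context` are hypothetical names introduced here, not part of the image, and the variables inspected are only the ones mentioned in this report. The exit-status meanings are the ones documented for pg_isready.

```shell
#!/bin/sh
# Hypothetical debugging helpers; not part of tiredofit/db-backup.

# Translate a pg_isready exit status into its documented meaning.
describe_pg_isready_status() {
  case "$1" in
    0) echo "accepting connections" ;;
    1) echo "rejecting connections" ;;
    2) echo "no response" ;;
    3) echo "no attempt made (invalid parameters)" ;;
    *) echo "unexpected status $1" ;;
  esac
}

# Print the identity and connection-related environment of the caller,
# without echoing the values of the DB01_* variables.
dump_pg_context() {
  echo "uid=$(id -u) gid=$(id -g)"
  echo "HOME=${HOME:-<unset>}"
  echo "PGSSLMODE=${PGSSLMODE:-<unset>}"
  for name in DB01_HOST DB01_PORT DB01_NAME DB01_USER; do
    eval "value=\${$name:-}"
    if [ -n "$value" ]; then
      echo "$name is set"
    else
      echo "$name is EMPTY"
    fi
  done
}

# Example use, in both the interactive shell and the job's command:
#   dump_pg_context
#   pg_isready --host="$DB01_HOST" --port="$DB01_PORT" \
#              --dbname="$DB01_NAME" --username="$DB01_USER"
#   describe_pg_isready_status "$?"
```

A different uid, a different HOME, or an empty DB01_* variable in the non-interactive run would each point to a different cause, so comparing the two outputs side by side is the useful part.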

I have the following k8s config:

apiVersion: batch/v1
kind: CronJob
metadata:
 name: postgres-storage-backup
 namespace: mastodon
spec:
 schedule: "30 1 * * *"
 concurrencyPolicy: Forbid
 suspend: false
 successfulJobsHistoryLimit: 1
 failedJobsHistoryLimit: 1
 jobTemplate:
   spec:
     template:
       metadata:
         name: postgres-storage-backup
       spec:
         volumes:
           - name: postgres-completion
             configMap:
               name: postgres-completion
               defaultMode: 0500
         containers:
           - name: postgres-storage-backup
             image: tiredofit/db-backup:4.1.3
             imagePullPolicy: IfNotPresent
             command:
               - /init
               - backup-now
             volumeMounts:
               - name: postgres-completion
                 mountPath: "/script"
             env:
               - name: DEBUG_MODE
                 value: "TRUE"
               - name: PGSSLMODE
                 value: "require"
               - name: MODE
                 value: "MANUAL"
               - name: MANUAL_RUN_FOREVER
                 value: "FALSE"
               - name: CONTAINER_ENABLE_SCHEDULING
                 value: "FALSE"
               - name: CONTAINER_ENABLE_MONITORING
                 value: "FALSE"
               - name: DEFAULT_POST_SCRIPT
                 value: "/script/postgres.sh"
               - name: DEFAULT_BACKUP_LOCATION
                 value: 'S3'
               - name: DEFAULT_S3_BUCKET
                 valueFrom:
                   configMapKeyRef:
                     name: storage-backup
                     key: postgres_bucket
               - name: DEFAULT_S3_KEY_ID
                 valueFrom:
                   configMapKeyRef:
                     name: storage-backup
                     key: DEFAULT_S3_KEY_ID
               - name: DEFAULT_S3_KEY_SECRET
                 valueFrom:
                   configMapKeyRef:
                     name: storage-backup
                     key: DEFAULT_S3_KEY_SECRET
               - name: DEFAULT_S3_REGION
                 valueFrom:
                   configMapKeyRef:
                     name: storage-backup
                     key: DEFAULT_S3_REGION
               - name: DEFAULT_S3_HOST
                 valueFrom:
                   configMapKeyRef:
                     name: storage-backup
                     key: DEFAULT_S3_HOST
               - name: DB01_TYPE
                 value: "pgsql"
               - name: DB01_HOST
                 valueFrom:
                   configMapKeyRef:
                     name: mastodon-env-tf
                     key: DB_HOST
               - name: DB01_PORT
                 valueFrom:
                   configMapKeyRef:
                     name: mastodon-env-tf
                     key: DB_PORT
               - name: DB01_NAME 
                 valueFrom:
                   configMapKeyRef:
                     name: mastodon-env-tf
                     key: DB_NAME
               - name: DB01_USER
                 valueFrom:
                   configMapKeyRef:
                     name: mastodon-env-tf
                     key: DB_USER
               - name: DB01_PASS 
                 valueFrom:
                   configMapKeyRef:
                     name: mastodon-env-tf
                     key: DB_PASS
         restartPolicy: OnFailure

What is the expected correct behavior?

pg_isready detects that the server is up, and the backup then runs.

Relevant logs and/or screenshots

I've attached the debug logs with everything sensitive scrubbed: private-logs.txt

Environment

  • Image version / tag: tiredofit/db-backup:4.1.3
  • Host OS: k8s 1.30.2-do.0

Possible fixes

I've spent a fair amount of time debugging this and reached the point where it seemed best to open a bug and track my progress here.
