Skip to content

Move DB initialization scripts for postgres and redis into service files. #967

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

featheredtoast
Copy link
Member

@featheredtoast featheredtoast commented May 18, 2025

This resolves a race condition with unconfigured images attempting to bring up DBs for the first time. This does not affect fully bootstrapped images.

Currently, all jobs start at boot - this includes postgres.

Issue with the current is - postgres starts and adds the corresponding .s/.pid files to /var/run/postgres.

Simultaneously, the unicorn job gets started, checks to see if postgres is running (it is already at this point from boot), and runs install_postgres.

Inside the install_postgres script, we mount the shared postgres folder and remove .s/.pid files -- after postgres has already been started. In this case, we remove the (in-use) .s and .pid files.

Subsequent unicorn tasks fail, erroring out the service and forcing it into a restart loop. Since postgres never restarts, it never regenerates the .s/.pid files, and unicorn can never run successfully.

This proposal moves install_postgres into the postgres job file, eliminating the race condition. Since they are part of the same service, install_postgres will always run before starting postgres - it will no longer be able to remove valid .s and .pid files.

Redis has a similar race condition with the creation of its data folder. This isn't as disastrous as the redis service restarts until the folder exists from unicorn run, but it provides better reasoning about the running services.

Run create_db script if configured to directly from the postgres service.
Wait more smartly via polling pg_isready rather than a single long sleep.

Add early exit from unicorn boot scripts for more dependable service restarts - exiting early ensures all will restart.

@featheredtoast featheredtoast force-pushed the fix-cold-boot branch 2 times, most recently from 88d4492 to 0118862 Compare May 19, 2025 14:55
…les.

This resolves a race condition with unconfigured images attempting to bring up
DBs for the first time. This does not affect fully bootstrapped images.

Currently, all jobs start at boot - this includes postgres.

Issue with the current is - postgres starts and adds the corresponding .s/.pid
files to /var/run/postgres.

Simultaneously, the unicorn job gets started, checks to see if postgres is
running (it is already at this point from boot), and runs install_postgres.

Inside the install_postgres script, we mount the shared postgres folder and
remove .s/.pid files -- after postgres has already been started. In this case,
we remove the (in-use) .s and .pid files.

Subsequent unicorn tasks fail, erroring out the service and forcing it into
a restart loop. Since postgres never restarts, it never regenerates the .s/.pid
files, and unicorn can never run successfully.

This proposal moves install_postgres into the postgres job file, eliminating the
race condition. Since they are part of the same service, install_postgres will
always run before starting postgres - it will no longer be able to remove valid
.s and .pid files.

Redis has a similar race condition with the creation of its data folder. This
isn't as disastrous as the redis service restarts until the folder exists from
unicorn run, but it provides better reasoning about the running services.

Add early exit from unicorn boot scripts to properly retry migrate as well.

Use pg_isready to check if pg is ready directly in create_db.
Merge the ready check into create_db.

Run create_db in a subshell on postgres job start, rather than in unicorn script.

remove postgres-config call
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

1 participant