PgHaMigrations

We've documented our learned best practices for applying schema changes without downtime in the post PostgreSQL at Scale: Database Schema Changes Without Downtime on the PayPal Technology Blog. Many of the approaches we take and choices we've made are explained in much greater depth there than in this README.

Internally we apply those best practices to our Rails applications through this gem which updates ActiveRecord migrations to clearly delineate safe and unsafe DDL as well as provide safe alternatives where possible.

Some projects attempt to hide complexity by having code determine the intent and magically do the right series of operations. But we (and by extension this gem) take the approach that it's better to understand exactly what the database is doing so that (particularly long running) operations are not a surprise during your deploy cycle.

Provided functionality:

Migrations
Utilities
Rake Tasks

Installation

Add this line to your application's Gemfile:

gem 'pg_ha_migrations'

And then execute:

$ bundle

Or install it yourself as:

$ gem install pg_ha_migrations

Migration Safety

There are two major classes of concerns we try to handle in the API:

Database safety (e.g., long-held locks)
Application safety (e.g., dropping columns the app uses)

Migration Method Renaming

We rename migration methods with prefixes to explicitly denote their safety level:

safe_*: These methods check for both application and database safety concerns, prefer concurrent operations where available, set low lock timeouts where appropriate, and decompose operations into multiple safe steps.
unsafe_*: Using these methods is a signal that the DDL operation is not necessarily safe for a running application. They include basic safety features like safe lock acquisition and dependent object checking, but otherwise dispatch directly to the native ActiveRecord migration method.
raw_*: These methods are a direct dispatch to the native ActiveRecord migration method.

Calling the original migration methods without a prefix will raise an error.

The API is designed to be explicit yet remain flexible. There may be situations where invoking the unsafe_* method is preferred (or the only option available for definitionally unsafe operations).

While unsafe_* methods were historically (before 2.0) pure wrappers for invoking the native ActiveRecord migration method, there is a class of problems that we can't handle easily without breaking that design rule a bit. For example, dropping a column is unsafe from an application perspective, so we make the application safety concerns explicit by using an unsafe_ prefix. Using unsafe_remove_column calls out the need to audit the application to confirm the migration won't break the application. Because there are no safe alternatives we don't define a safe_remove_column analogue. However there are still conditions we'd like to assert before dropping a column. For example, dropping an unused column that's used in one or more indexes may be safe from an application perspective, but the cascading drop of the index won't use a CONCURRENT operation to drop the dependent indexes and is therefore unsafe from a database perspective.

For unsafe_* migration methods which support checks of this type you can bypass the checks by passing an :allow_dependent_objects key in the method's options hash containing an array of dependent object types you'd like to allow. These checks will run by default, but you can opt-out by setting config.check_for_dependent_objects = false in your configuration initializer.

Disallowed Migration Methods

We disallow the use of unsafe_change_table, as the equivalent operation can be composed with explicit safe_* / unsafe_* methods. If you must use change_table, it is still available as raw_change_table.

Migration Method Arguments

We believe the force: true option to ActiveRecord's create_table method is always unsafe because it's not possible to denote exactly how the current state will change. Therefore we disallow using force: true even when calling unsafe_create_table. This option is enabled by default, but you can opt-out by setting config.allow_force_create_table = true in your configuration initializer.

Rollback

Because we require that "Rollback strategies do not involve reverting the database schema to its previous version", PgHaMigrations does not support ActiveRecord's automatic migration rollback capability.

Instead we write all of our migrations with only an def up method like:

def up
  safe_add_column :table, :column
end

and never use def change. We believe that this is the only safe approach in production environments. For development environments we iterate by recreating the database from scratch every time we make a change.

Transactional DDL

Individual DDL statements in PostgreSQL are transactional by default (as are all Postgres statements). Concurrent index creation and removal are two exceptions: these utility commands manage their own transaction state (and each uses multiple transactions to achieve the desired concurrency).

We disable ActiveRecord's DDL transactions (which wrap the entire migration file in a transaction) by default for the following reasons:

Running multiple DDL statements inside a transaction acquires exclusive locks on all of the modified objects.
Acquired locks are held until the end of the transaction.
Multiple locks creates the possibility of deadlocks.
Increased exposure to long waits:
- Each newly acquired lock has its own timeout applied (so total lock time is additive).
- Safe lock acquisition (which is used in each migration method where locks will be acquired) can issue multiple lock attempts on lock timeouts (with sleep delays between attempts).

Because of the above issues attempting to re-enable transaction migrations forfeits many of the safety guarantees this library provides and may even break certain functionally. If you'd like to experiment with it anyway you can re-enable transactional migrations by adding self.disable_ddl_transaction = false to your migration class definition.

Usage

Unsupported ActiveRecord Features

The following functionality is currently unsupported:

Rollback methods in migrations
Generators
schema.rb

Compatibility Notes

While some features may work with other versions, this gem is currently tested against PostgreSQL 13+ and Partman 4+

Migration Methods

safe_create_table

Safely creates a new table.

safe_create_table :table do |t|
  t.type :column
end

safe_create_enum_type

Safely create a new enum without values.

safe_create_enum_type :enum

Or, safely create the enum with values.

safe_create_enum_type :enum, ["value1", "value2"]

safe_add_enum_value

Safely add a new enum value.

safe_add_enum_value :enum, "value"

unsafe_rename_enum_value

Unsafely change the value of an enum type entry.

unsafe_rename_enum_value(:enum, "old_value", "new_value")

Note: Changing an enum value does not issue any long-running scans or acquire locks on usages of the enum type. Therefore multiple queries within a transaction concurrent with the change may see both the old and new values. To highlight these potential pitfalls no safe_rename_enum_value equivalent exists. Before modifying an enum type entry you should verify that no concurrently executing queries will attempt to write the old value and that read queries understand the new value.

safe_add_column

Safely add a column.

safe_add_column :table, :column, :type

unsafe_add_column

Unsafely add a column, but do so with a lock that is safely acquired.

unsafe_add_column :table, :column, :type

safe_change_column_default

Safely change the default value for a column.

# Constant value:
safe_change_column_default :table, :column, "value"
safe_change_column_default :table, :column, DateTime.new(...)
# Functional expression evaluated at row insert time:
safe_change_column_default :table, :column, -> { "NOW()" }
# Functional expression evaluated at migration time:
safe_change_column_default :table, :column, -> { "'NOW()'" }

Note: On Postgres 11+ adding a column with a constant default value does not rewrite or scan the table (under a lock or otherwise). In that case a migration adding a column with a default should do so in a single operation rather than the two-step safe_add_column followed by safe_change_column_default. We enforce this best practice with the error PgHaMigrations::BestPracticeError, but if your prefer otherwise (or are running in a mixed Postgres version environment), you may opt out by setting config.prefer_single_step_column_addition_with_default = false in your configuration initializer.

safe_make_column_nullable

Safely make the column nullable.

safe_make_column_nullable :table, :column

safe_make_column_not_nullable

Safely make the column not nullable. This method uses a CHECK column IS NOT NULL constraint to validate no values are null before altering the column. If such a constraint exists already, it is re-used, if it does not, a temporary constraint is added. Whether or not the constraint already existed, the constraint will be validated, if necessary, and removed after the column is marked NOT NULL.

safe_make_column_not_nullable :table, :column

Note:

This method may perform a full table scan to validate that no NULL values exist in the column. While no exclusive lock is held for this scan, on large tables the scan may take a long time.

The method runs multiple DDL statements non-transactionally. Validating the constraint can fail. In such cases an INVALID constraint will be left on the table. Calling safe_make_column_not_nullable again is safe.

If you want to avoid a full table scan and have already added and validated a suitable CHECK constraint, consider using safe_make_column_not_nullable_from_check_constraint instead.

unsafe_make_column_not_nullable

Unsafely make a column not nullable.

unsafe_make_column_not_nullable :table, :column

safe_make_column_not_nullable_from_check_constraint

Variant of safe_make_column_not_nullable that safely makes a column NOT NULL using an existing validated CHECK constraint that enforces non-null values for the column. This method is expected to always be fast because it avoids a full table scan.

safe_make_column_not_nullable_from_check_constraint :table, :column, constraint_name: :constraint_name

constraint_name (required): The name of a validated CHECK constraint that enforces column IS NOT NULL.
drop_constraint: (optional, default: true): Whether to drop the constraint after making the column NOT NULL.

You should use safe_make_column_not_nullable when neither a CHECK constraint or a NOT NULL constraint exists already. You should use this method when you already have an equivalent CHECK constraint on the table.

This method will raise an error if the constraint does not exist, is not validated, or does not strictly enforce non-null values for the column.

Note: We do not attempt to catch all possible proofs of column IS NOT NULL by means of an existing constraint; only a constraint with the exact definition column IS NOT NULL will be recognized.

safe_add_index_on_empty_table

Safely add an index on a table with zero rows. This will raise an error if the table contains data.

safe_add_index_on_empty_table :table, :column

safe_add_concurrent_index

Add an index concurrently.

safe_add_concurrent_index :table, :column

Add a composite btree index.

safe_add_concurrent_index :table, [:column1, :column2], name: "index_name", using: :btree

safe_remove_concurrent_index

Safely remove an index. Migrations that contain this statement must also include disable_ddl_transaction!.

safe_remove_concurrent_index :table, :name => :index_name

safe_add_concurrent_partitioned_index

Add an index to a natively partitioned table concurrently, as described in the table partitioning docs:

To avoid long lock times, it is possible to use CREATE INDEX ON ONLY the partitioned table; such an index is marked invalid, and the partitions do not get the index applied automatically. The indexes on partitions can be created individually using CONCURRENTLY, and then attached to the index on the parent using ALTER INDEX .. ATTACH PARTITION. Once indexes for all partitions are attached to the parent index, the parent index is marked valid automatically.

# Assuming this table has partitions child1 and child2, the following indexes will be created:
#   - index_partitioned_table_on_column
#   - index_child1_on_column (attached to index_partitioned_table_on_column)
#   - index_child2_on_column (attached to index_partitioned_table_on_column)
safe_add_concurrent_partitioned_index :partitioned_table, :column

Add a composite index using the hash index type with custom name for the parent index when the parent table contains sub-partitions.

# Assuming this table has partitions child1 and child2, and child1 has sub-partitions sub1 and sub2,
# the following indexes will be created:
#   - custom_name_idx
#   - index_child1_on_column1_column2 (attached to custom_name_idx)
#   - index_sub1_on_column1_column2 (attached to index_child1_on_column1_column2)
#   - index_sub2_on_column1_column2 (attached to index_child1_on_column1_column2)
#   - index_child2_on_column1_column2 (attached to custom_name_idx)
safe_add_concurrent_partitioned_index :partitioned_table, [:column1, :column2], name: "custom_name_idx", using: :hash

Note: This method runs multiple DDL statements non-transactionally. Creating or attaching an index on a child table could fail. In such cases an exception will be raised, and an INVALID index will be left on the parent table.

safe_add_unvalidated_check_constraint

Safely add a CHECK constraint. The constraint will not be immediately validated on existing rows to avoid a full table scan while holding an exclusive lock. After adding the constraint, you'll need to use safe_validate_check_constraint to validate existing rows.

safe_add_unvalidated_check_constraint :table, "column LIKE 'example%'", name: :constraint_table_on_column_like_example

safe_validate_check_constraint

Safely validate (without acquiring an exclusive lock) existing rows for a newly added but as-yet unvalidated CHECK constraint.

safe_validate_check_constraint :table, name: :constraint_table_on_column_like_example

safe_rename_constraint

Safely rename any (not just CHECK) constraint.

safe_rename_constraint :table, from: :constraint_table_on_column_like_typo, to: :constraint_table_on_column_like_example

unsafe_remove_constraint

Drop any (not just CHECK) constraint.

unsafe_remove_constraint :table, name: :constraint_table_on_column_like_example

safe_create_partitioned_table

Safely create a new partitioned table using declaritive partitioning.

# list partitioned table using single column as partition key
safe_create_partitioned_table :table, type: :list, partition_key: :example_column do |t|
  t.text :example_column, null: false
end

# range partitioned table using multiple columns as partition key
safe_create_partitioned_table :table, type: :range, partition_key: [:example_column_a, :example_column_b] do |t|
  t.integer :example_column_a, null: false
  t.integer :example_column_b, null: false
end

# hash partitioned table using expression as partition key
safe_create_partitioned_table :table, :type: :hash, partition_key: ->{ "(example_column::date)" } do |t|
  t.datetime :example_column, null: false
end

The identifier column type is bigserial by default. This can be overridden, as you would in safe_create_table, by setting the id argument:

safe_create_partitioned_table :table, id: :serial, type: :range, partition_key: :example_column do |t|
  t.date :example_column, null: false
end

In PostgreSQL 11+, primary key constraints are supported on partitioned tables given the partition key is included. On supported versions, the primary key is inferred by default (see available options). This functionality can be overridden by setting the infer_primary_key argument.

# primary key will be (id, example_column)
safe_create_partitioned_table :table, type: :range, partition_key: :example_column do |t|
  t.date :example_column, null: false
end

# primary key will not be created
safe_create_partitioned_table :table, type: :range, partition_key: :example_column, infer_primary_key: false do |t|
  t.date :example_column, null: false
end

safe_partman_create_parent

Safely configure a partitioned table to be managed by pg_partman.

This method calls the create_parent partman function with some reasonable defaults and a subset of user-defined overrides.

The first (and only) positional argument maps to p_parent_table in the create_parent function.

The rest are keyword args with the following mappings:

partition_key -> p_control. Required: true
interval -> p_interval. Required: true
template_table -> p_template_table. Required: false. Partman will create a template table if not defined.
premake -> p_premake. Required: false. Partman defaults to 4.
start_partition -> p_start_partition. Required: false. Partman defaults to the current timestamp.

Note: We have chosen to require PostgreSQL 11+ and hardcode p_type to native (range in the case of Partman 5) for simplicity, as previous PostgreSQL versions are end-of-life.

Additionally, this method allows you to configure a subset of attributes on the record stored in the part_config table. These options are delegated to the unsafe_partman_update_config method to update the record:

infinite_time_partitions. Partman defaults this to false but we default to true
inherit_privileges. Partman defaults this to false but we default to true
retention. Partman defaults this to null
retention_keep_table. Partman defaults this to true

With only the required args:

safe_create_partitioned_table :table, type: :range, partition_key: :created_at do |t|
  t.timestamps null: false
end

safe_partman_create_parent :table, partition_key: :created_at, interval: "1 week"

With custom overrides:

safe_create_partitioned_table :table, type: :range, partition_key: :created_at do |t|
  t.timestamps null: false
  t.text :some_column
end

# Partman will reference the template table to create unique indexes on child tables
safe_create_table :table_template, id: false do |t|
  t.text :some_column, index: {unique: true}
end

safe_partman_create_parent :table,
  partition_key: :created_at,
  interval: "1 week",
  template_table: :table_template,
  premake: 10,
  start_partition: Time.current + 1.month,
  infinite_time_partitions: false,
  inherit_privileges: false,
  retention: "60 days",
  retention_keep_table: false

safe_partman_update_config

There are some partitioning options that cannot be set in the call to create_parent and are only available in the part_config table. As mentioned previously, you can specify these args in the call to safe_partman_create_parent which will be delegated to this method. Calling this method directly will be useful if you need to modify your partitioned table after the fact.

Allowed keyword args:

infinite_time_partitions
inherit_privileges
premake
retention
retention_keep_table

Note: If inherit_privileges will change then safe_partman_reapply_privileges will be automatically called to ensure permissions are propagated to existing child partitions.

safe_partman_update_config :table,
  infinite_time_partitions: false,
  inherit_privileges: false,
  premake: 10

unsafe_partman_update_config

We have chosen to flag the use of retention and retention_keep_table as an unsafe operation. While we recognize that these options are useful, changing these values fits in the same category as drop_table and rename_table, and is therefore unsafe from an application perspective. If you wish to change these options, you must use this method.

unsafe_partman_update_config :table,
  retention: "60 days",
  retention_keep_table: false

safe_partman_reapply_privileges

If your partitioned table is configured with inherit_privileges set to true, use this method after granting new roles / privileges on the parent table to ensure permissions are propagated to existing child partitions.

safe_partman_reapply_privileges :table

unsafe_partman_standardize_partition_naming

This method provides functionality to standardize existing Partman 4 tables such that naming is consistent with Partman 5. The logic follows the guidelines in the Partman upgrade docs.

Technically, only weekly and quarterly partitioned tables need to be standardized prior to the upgrade. However, Partman 5 changes the default datetime_string that is used for all intervals (YYYYMMDD and YYYYMMDD_HH24MISS). Compare that to the Partman 4 logic for datetime_string. So, this method supports standardization for all Partman 4 intervals.

Note: This method is safe from a database perspective, but is only safe from an application perspective if child tables are not directly referenced (child tables are renamed during this operation)

unsafe_partman_standardize_partition_naming :table

This method uses a default statement timeout of 1 second. If the target table has many partitions (hundreds of thousands), you may need to increase the statement timeout for the operation to succeed.

unsafe_partman_standardize_partition_naming :table, statement_timeout: 2

Utilities

safely_acquire_lock_for_table

Acquires a lock (in ACCESS EXCLUSIVE mode by default) on a table using the following algorithm:

Verify that no long-running queries are using the table.
- If long-running queries are currently using the table, sleep PgHaMigrations::LOCK_TIMEOUT_SECONDS and check again.
If no long-running queries are currently using the table, optimistically attempt to lock the table (with a timeout of PgHaMigrations::LOCK_TIMEOUT_SECONDS).
- If the lock is not acquired, sleep PgHaMigrations::LOCK_FAILURE_RETRY_DELAY_MULTLIPLIER * PgHaMigrations::LOCK_TIMEOUT_SECONDS, and start again at step 1.
If the lock is acquired, proceed to run the given block.

safely_acquire_lock_for_table(:table) do
  ...
end

Safely acquire a lock on a table in SHARE mode.

safely_acquire_lock_for_table(:table, mode: :share) do
  ...
end

Safely acquire a lock on multiple tables in EXCLUSIVE mode.

safely_acquire_lock_for_table(:table_a, :table_b, mode: :exclusive) do
  ...
end

Note: We enforce that only one set of tables can be locked at a time. Attempting to acquire a nested lock on a different set of tables will result in an error.

adjust_lock_timeout

Adjust lock timeout.

adjust_lock_timeout(seconds) do
  ...
end

adjust_statement_timeout

Adjust statement timeout.

adjust_statement_timeout(seconds) do
  ...
end

safe_set_maintenance_work_mem_gb

Set maintenance work mem.

safe_set_maintenance_work_mem_gb 1

ensure_small_table!

Ensure a table on disk is below the default threshold (10 megabytes). This will raise an error if the table is too large.

ensure_small_table! :table

Ensure a table on disk is below a custom threshold and is empty. This will raise an error if the table is too large and/or contains data.

ensure_small_table! :table, empty: true, threshold: 100.megabytes

Configuration

The gem can be configured in an initializer.

PgHaMigrations.configure do |config|
  # ...
end

Available options

disable_default_migration_methods: If true, the default implementations of DDL changes in ActiveRecord::Migration and the PostgreSQL adapter will be overridden by implementations that raise a PgHaMigrations::UnsafeMigrationError. Default: true
check_for_dependent_objects: If true, some unsafe_* migration methods will raise a PgHaMigrations::UnsafeMigrationError if any dependent objects exist. Default: true
prefer_single_step_column_addition_with_default: If true, raise an error when adding a column and separately setting a constant default value for that column in the same migration. Default: true
allow_force_create_table: If false, the force: true option to ActiveRecord's create_table method is disallowed. Default: false
infer_primary_key_on_partitioned_tables: If true, the primary key for partitioned tables will be inferred on PostgreSQL 11+ databases (identifier column + partition key columns). Default: true
partman_5_compatibility_mode: If true, safe_partman_create_parent will raise an error if the user provides an interval that is not supported by Partman 5. If the interval is supported, the method will ensure table name suffixes match the Partman 5 format (YYYYMMDD, YYYYMMDD_HTH24MISS). Default: false

Rake Tasks

Use this to check for blocking transactions before migrating.

$ bundle exec rake pg_ha_migrations:check_blocking_database_transactions

This rake task expects that you already have a connection open to your database. We suggest that you add another rake task to open the connection and then add that as a prerequisite for pg_ha_migrations:check_blocking_database_transactions.

namespace :db do
  desc "Establish a database connection"
  task :establish_connection do
    ActiveRecord::Base.establish_connection
  end
end

Rake::Task["pg_ha_migrations:check_blocking_database_transactions"].enhance ["db:establish_connection"]

Development

After checking out the repo, run bin/setup to install dependencies and start a postgres docker container. Then, run bundle exec rspec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment. This project uses Appraisal to test against multiple versions of ActiveRecord; you can run the tests against all supported version with bundle exec appraisal rspec.

Warning: If you rebuild the Docker container without using docker-compose build (or the --build flag), it will not respect the PGVERSION environment variable that you've set if image layers from a different version exist. The Dockerfile uses a build-time argument that's only evaluated during the initial build. To change the Postgres version, you should explicitly provide the build argument: docker-compose build --build-arg PGVERSION=15. Using bin/setup handles this for you.

Warning: The Postgres Dockerfile automatically creates an anonymous volume for the data directory. When changing the specified PGVERSION environment variable this volume must be reset using --renew-anon-volumes or booting Postgres will fail. Using bin/setup handles this for you.

Running tests will automatically create a test database in the locally running Postgres server. You can find the connection parameters in spec/spec_helper.rb, but setting the environment variables PGHOST, PGPORT, PGUSER, and PGPASSWORD will override the defaults.

To install this gem onto your local machine, run bundle exec rake install.

To release a new version, update the version number in version.rb, commit the change, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Note: If while releasing the gem you get the error Your rubygems.org credentials aren't set. Run `gem push` to set them. you can more simply run gem signin.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/braintreeps/pg_ha_migrations. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.

License

The gem is available as open source under the terms of the MIT License.

Code of Conduct

Everyone interacting in the PgHaMigrations project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.

Name		Name	Last commit message	Last commit date
Latest commit History 178 Commits
.github/workflows		.github/workflows
bin		bin
gemfiles		gemfiles
lib		lib
spec		spec
.gitignore		.gitignore
.pryrc		.pryrc
.rspec		.rspec
.ruby-version		.ruby-version
Appraisals		Appraisals
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
Dockerfile		Dockerfile
Gemfile		Gemfile
LICENSE.txt		LICENSE.txt
README.md		README.md
Rakefile		Rakefile
docker-compose.yml		docker-compose.yml
pg_ha_migrations.gemspec		pg_ha_migrations.gemspec

License

braintree/pg_ha_migrations

Folders and files

Latest commit

History

Repository files navigation

PgHaMigrations

Installation

Migration Safety

Migration Method Renaming

Disallowed Migration Methods

Migration Method Arguments

Rollback

Transactional DDL

Usage

Unsupported ActiveRecord Features

Compatibility Notes

Migration Methods

safe_create_table

safe_create_enum_type

safe_add_enum_value

unsafe_rename_enum_value

safe_add_column

unsafe_add_column

safe_change_column_default

safe_make_column_nullable

safe_make_column_not_nullable

unsafe_make_column_not_nullable

safe_make_column_not_nullable_from_check_constraint

safe_add_index_on_empty_table

safe_add_concurrent_index

safe_remove_concurrent_index

safe_add_concurrent_partitioned_index

safe_add_unvalidated_check_constraint

safe_validate_check_constraint

safe_rename_constraint

unsafe_remove_constraint

safe_create_partitioned_table

safe_partman_create_parent

safe_partman_update_config

unsafe_partman_update_config

safe_partman_reapply_privileges

unsafe_partman_standardize_partition_naming

Utilities

safely_acquire_lock_for_table

adjust_lock_timeout

adjust_statement_timeout

safe_set_maintenance_work_mem_gb

ensure_small_table!

Configuration

Available options

Rake Tasks

Development

Contributing

License

Code of Conduct

About

Topics

Resources

License

Code of conduct

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 19

Uh oh!

Languages

Packages