Description
This concept came about from a discussion between myself and @iamjpotts.
Motivation
Handling of constraint errors is a pretty typical task in any database application.
As a simple example, a platform with user registration will likely want to ensure that usernames are unique so they can unambiguously refer to one user. This is pretty easy to enforce in SQL with a UNIQUE
constraint, which usually has the benefit of automatically creating an index over the column as well:
CREATE TABLE users (
user_id UUID PRIMARY KEY,
-- Of course, you'd likely want this to be case-insensitive, or just always fold to lowercase.
username TEXT UNIQUE NOT NULL,
password_hash TEXT NOT NULL,
);
A naive implementation may perform a SELECT
query before attempting to insert a user record, to check if the username is already taken:
let username_taken: bool = sqlx::query_scalar("SELECT EXISTS(SELECT 1 FROM users WHERE username ILIKE ?)")
.bind(&username)
.fetch_one(&pool)
.await?;
if username_taken {
return Err("username taken");
}
let user_id: Uuid = sqlx::query_scalar("INSERT INTO users (username, password_hash) VALUES (?, ?) RETURNING user_id")
.bind(&username)
.bind(&password_hash)
.fetch_one(&pool)
.await?;
However, this takes two round-trips to the database, increasing response latency. The astute reader may have also noticed that the above code has a time-of-check to time-of-use (TOCTOU) bug because it does not perform the two queries in a transaction; another call to this function may race this one and create a user record with the same username, and so the insert may fail with an error which is likely not going to be surfaced to the user in a useful manner (e.g. a 500 Internal Server Error
instead of a 400 Bad Request
) since it wasn't anticipated by the developer.
You could fix the bug by performing the check and the insert in a transaction, but you can also eliminate a round-trip by just attempting the insert and using the unique constraint (which would have fired a violation error anyway in the above situation) to detect the collision. SQLx supports and encourages this use-case; however, this is not currently very ergonomic:
let user_id: Uuid = sqlx::query_scalar("INSERT INTO users (username, password_hash) VALUES (?, ?) RETURNING user_id")
.bind(&username)
.bind(&password_hash)
.fetch_one(&pool)
.await
.map_err(|e| match e {
sqlx::Error::DatabaseError(dbe) if dbe.constraint() == Some("users_username_key") => {
MyError::from("username taken")
},
other => other.into(),
})?;
In real Launchbadge projects, we've used an extension trait to make this more ergonomic, similar to what's demonstrated in realworld-axum-sqlx
: https://github.com/launchbadge/realworld-axum-sqlx/blob/main/src/http/error.rs#L176-L184
However, this still leads to matching on magic strings most of the time, and can lead to forgetting to cover newly added constraints.
Design
I propose to add what I'm calling a "checked derive", which takes an enum as input and checks at compile-time that it exhaustively represents the constraints for a given table:
#[derive(sqlx::ConstraintEnum)]
#[sqlx(table_name = "users")]
pub enum UsersConstraints {
/// The `UNIQUE` constraint of the `username` column was violated.
#[sqlx(constraint = "users_username_key")]
UsernameUnique,
}
Like any derive the primary function would be to generate trait impls. This would generate a FromStr
impl that converts the constraint name string to a given variant, as well as an impl of the ConstraintEnum
trait for type safety:
pub trait ConstraintEnum: FromStr {
/// Get the name of the constraint for the given enum variant.
fn name(&self) -> &'static str;
}
This impl, would then interact with an extension trait and intermediate error type:
pub trait ResultExt<T> {
/// If this error represents one of the constraints covered by the given `ConstraintEnum`, map it to another error.
fn on_constraint<C: ConstraintEnum, E>(self, on_constraint: FnOnce(C) -> E) -> Result<T, MaybeConstraintError<E>>;
}
// Implements at least: `std::error::Error`, `Debug`
pub enum MaybeConstraintError<E> {
ConstraintError(E),
Other(sqlx::Error),
}
// I'm *hoping* coherence allows at least some form of this, but I'm not currently certain.
impl<E1, E2> From<MaybeConstraintError<E1>> for E2 where E1: Into<E2>, sqlx::Error: Into<E2> {
// ...
}
impl<T> ResultExt for Result<T, sqlx::Error> {
fn on_constraint<C: ConstraintEnum, E>(self, on_constraint: FnOnce(C) -> E) -> Result<T, MaybeConstraintError<E>> {
// ..
}
}
This would then allow the above INSERT
to be modeled like so:
let user_id: Uuid = sqlx::query_scalar("INSERT INTO users (username, password_hash) VALUES (?, ?) RETURNING user_id")
.bind(&username)
.bind(&password_hash)
.fetch_one(&pool)
.await
// The above `From` impl should allow the `?` here to work, assuming `MyError` implements `From<sqlx::Error>`
.on_constraint(|c: UserConstraints| match c {
UserConstraints::UsernameUnique => MyError::new(StatusCode::BAD_REQUEST, "username taken"),
})?;
Compile-Time Checking
Similar to the query macros, if DATABASE_URL
is set and offline-mode is not enabled, the derive can check the constraints on the given table at compile-time (e.g. using INFORMATION_SCHEMA.TABLE_CONSTRAINTS
or the equivalent in pg_catalog
) and ensure that the enum exhaustively lists the constraints.
This can help ensure, for example, that a new constraint being added is covered by any code it might affect:
-- This is effectively equivalent to changing `username`'s type to `varchar(63)`,
-- however that raises a different error code than a constraint violation.
ALTER TABLE users ADD CONSTRAINT user_username_length CHECK (LENGTH(username) < 64));
-- Enforcing a minimum length is probably better done in your application code where you can change it more easily
-- without having to modify existing user records. However, checking that it's not empty is a good sanity check.
ALTER TABLE users ADD CONSTRAINT user_username_nonempty CHECK (LENGTH(username) > 0);
If the above were executed as a new migration, for example, the ConstraintEnum
derive would then emit a compiler error calling out the uncovered constraints and force the developer to add coverage for them.
For convenience, I would maybe add a subcommand to sqlx-cli
to automatically generate a ConstraintEnum
definition for a given table, which can use the same logic to look up the constraints so the user doesn't have to manually. This would especially be useful if adapting SQLx to an existing database.
I would make this functionality optional so that users who don't wish to use compile-time checks can still benefit from the API improvement. Perhaps as a checked-derive
feature that extends the derive
feature added in #3113 (cc @saiintbrisson).
Without DATABASE_URL
set and with the checked-derive
feature disabled, the ConstraintEnum
would work as a simple dumb derive. This would also make it possible to still use the query macros, but only check constraint enums, e.g., in CI or a pre-commit hook, by not enabling the feature in normal development (though that's likely to be flaky given Cargo's feature unification; a dependency could enable the feature non-optionally which would force it on everywhere). It should also obey SQLX_OFFLINE
.
Additionally, I would make it possible to explicitly mark constraints as ignored, either as a temporary workaround or to ignore constraints that are not expected to fire in normal operation:
#[derive(sqlx::ConstraintEnum)]
#[sqlx(table_name = "users")]
// For example, these should realistically be checked before even touching the database.
#[sqlx(ignore("user_username_length", "user_username_nonempty")]
pub enum UsersConstraints {
/// The `UNIQUE` constraint of the `username` column was violated.
#[sqlx(constraint = "users_username_key")]
UsernameUnique,
}
Or to mark a constraint enum as a non-exhaustive subset (only check that the declared constraints actually exist):
#[derive(sqlx::ConstraintEnum)]
#[sqlx(table_name = "users")]
#[sqlx(non_exhaustive)]
pub enum UsersConstraints {
/// The `UNIQUE` constraint of the `username` column was violated.
#[sqlx(constraint = "users_username_key")]
UsernameUnique,
}
Bikeshedding: this should be semantically distinct from #[non_exhaustive]
and should probably not use the same identifier to avoid confusion. But what should it be instead? check_declared_only
?
Alternatives
Automatically generated enum
Instead, SQLx could automatically generate the full enum:
sqlx::constraint_enum!(pub UserConstraints, "users");
However, this has a number of disadvantages:
- It would require either a compile-time connection to the database, or use the same offline data mechanism as the query macros.
- It would be impossible to use as a simple derive without bringing in at least some query macro scaffolding (re. feat: new derive feature flag #3113).
- The user cannot reason about the output of the macro without knowing what constraints exist on the given table.
- The query macros generate predictable output from their input (assuming output columns are explicitly declared)
and properly formatted query macro invocations should be perfectly readable and easily reasoned about on their own
without an IDE expanding the macro.
- The query macros generate predictable output from their input (assuming output columns are explicitly declared)
- Controlling the output of the macro would require a dedicated DSL for:
- Ignoring constraints you don't care about;
- Renaming variants for clarity;
- notice that the
UNIQUE
constraint onusers.username
was automatically namedusers_username_key
, which doesn't tell you a lot at a glance
- notice that the
- Adding documentation comments to explain what each constraint violation means;
- Adding other derives, such as
thiserror::Error
to turn theConstraintEnum
into an error type in its own right.- Supporting this use-case in this format would be more complex than just writing out the enum definition!
- Go-to-definition would likely not be very useful, and IDE support in general would be very spotty and results would be slow.