sqlite transaction creation is not error or drop-safe.

### I have found these related issues/pull requests

None

### Description

This is sort of two bugs in one. But the causes are very related (partially the same) and the simpler error-handling bug actually helps demonstrate the problem while being much simpler and more reliable to trigger. So I have decided to report them together. I am also reporting them from the perspective of sqlite because that is where we encountered them and the code that we have explored most thoroughly. However the root cause is in the database independent `Transaction` object and likely affects all databases, and if databases are not affected it is likely more by chance than by intention.

## Drop Safety

With very unfortunate drop-timing or an error during transaction setup code. `sqlx_sqlite::SqliteTransactionManager::begin` can leave a connection with an open transaction. If this connection is not specially cleaned up it can result in future calls that try to open a transaction such as `SqlitePool::begin` to fail with:

> attempted to call begin_with at non-zero transaction depth

For many people using sqlite this is a major issue. Because sqlite doesn't support concurrent writers it is common to use a single write connection. If this connection is "corrupted" by being left with an open transaction all attempts to use a write transaction in the application will fail. We were lucky in that we noticed fairly quickly, if this happened at night we likely would have had hours of downtime.

---

https://github.com/launchbadge/sqlx/blob/f7ef1ed1e99bd2fd6f29a81b103235517fcc2731/sqlx-sqlite/src/transaction.rs#L14-L25

In `SqliteTransactionManager::begin` with a custom statement it is possible that the future is dropped or an error is raised between when the transaction is started (at some point during the execution of line 16) and when line 19 completes (at which point it will be wrapped in a `Transaction` object that cleans it up). In this case the `Transaction` object would never be constructed and the transaction would never be rolled back.

https://github.com/launchbadge/sqlx/blob/f7ef1ed1e99bd2fd6f29a81b103235517fcc2731/sqlx-core/src/transaction.rs#L106-L111

It is also quite possible that this same error could occur without a custom statement, but I have not yet looked into the implementation of `ConnectionWorker::begin` to see if it has any suspension points after the transaction is started. Either way the code structure in `Transaction` is very error prone. If the future is `Drop`ed between the transaction being started (somewhere inside `DB::TransactionManager::begin`) and the `Transaction` object being constructed the connection will unexpectedly be left in an "in-transaction" state. This could very likely affect all database implementations as the main "bug" is in the database independent `Transaction` object. Our use case is just triggering it via the `SqliteTransactionManager` implementation.

I can think of two simple ways to resolve this:
1. Build the `Transaction` object immediately, then call `DB::TransactionManager::begin` on the connection inside of the transaction. The `impl Drop for Transaction` will then need to handle the case where the transaction was not actually started. (This could be possibly be done by changing `done` to be a try state where it starts "unknown" then switches to "open" after the transaction is started then moves to "closed" after rollback or commit.)
2. Create a new `StartingTransaction` object that has the same "close transaction if open" logic described above, but once `DB::TransactionManager::begin` move the connection from that object into the `Transaction` object (in a Drop and panic-safe way). This avoids `Transaction` needing to handle the non-transaction case which should be very exceptional and would be good to avoid extra database calls for (especially for networked databases).

## Error Safety

A bug with basically the same cause is error safety. Any error raised during transaction setup after the connection is put into in a transaction state (whether an actual underlying sqlite transaction or a `ConnectionWorker` `transaction_depth > 0` state) will not result in the connection being closed.

Some errors are handled, for example if the default or custom statement fails the internal `transaction_depth` counter is not incremented. Since this error also means a transaction was not started this leaves sqlx in a good "no transaction" state.

https://github.com/launchbadge/sqlx/blob/f7ef1ed1e99bd2fd6f29a81b103235517fcc2731/sqlx-sqlite/src/connection/worker.rs#L226-L231

However other errors are not properly handled. For example there is an explicit check to ensure that the provided custom statement creates a transaction.

https://github.com/launchbadge/sqlx/blob/f7ef1ed1e99bd2fd6f29a81b103235517fcc2731/sqlx-sqlite/src/transaction.rs#L18-L22

If this check fails it raises an error that avoids the `Transaction` object being created. This then means that the transaction would not be closed. In theory this isn't an error because there is no transaction. However for sqlite this is an issue because the `ConnectionWorker` still believes it is in one and will fail future attempts at opening transactions.

I can think of two improvements. Probably it is necessary to do both of them:
1. Move the "is in connection" check to `ConnectionWorker` so that it can avoid incrementing `transaction_depth` if the command succeeded but did not start a transaction. However this approach is fragile as it does not other possible errors which could occur after a transaction was opened.
2. Basically the same solution as the drop problem, build some object with a `Drop` that will clean up the transaction no matter why it is done, including any error states. In this case the `ConnectionWorker` may still need to be updated to track `transaction_depth` properly but it should also be robust to other errors.

## Fixing

I can contribute a patch if we decide on the approach. `DB::TransactionManager::get_transaction_depth` should provided the required information for determining if the transaction needs to be rolled back without needing database-specific logic. Additionally the `ConnectionWorker` will need to be re-thought to avoid losing sync when the statement doesn't actually open a transaction.

### Reproduction steps

Unfortunately I don't have a concrete way to reproduce the drop case. (It can probably be done by figuring out the exact number of times to poll the future, but I didn't feel like doing that as the reproducer would be very fragile) but I can show the error case which is a very similar error path.

```rust
#[sqlx::test]
async fn write_tx_recover(pool: SqlitePool) -> anyhow::Result<()> {
    let mut conn = pool.acquire().await?;

    dbg!(
        Transaction::begin(
            MaybePoolConnection::<Sqlite>::Connection(&mut conn),
            Some("SELECT 1".into()),
        )
        .await
    );

    dbg!(
        Transaction::begin(
            MaybePoolConnection::<Sqlite>::Connection(&mut conn),
            Some("THIS IS NEVER RUN".into()),
        )
        .await
    )?;

    Ok(())
}
```

Output:
```
---- db::sqlite::write_tx_recover stdout ----
[api-server/src/db/sqlite.rs:291:5] Transaction::begin(MaybePoolConnection::<Sqlite>::Connection(&mut conn),
Some("SELECT 1".into()),).await = Err(
    BeginFailed,
)
[api-server/src/db/sqlite.rs:299:5] Transaction::begin(MaybePoolConnection::<Sqlite>::Connection(&mut conn),
Some("adsjlfjadslfjaslfejl".into()),).await = Err(
    InvalidSavePointStatement,
)
Error: attempted to call begin_with at non-zero transaction depth
```

This is a bit of a strange reproducer because the `SELECT 1` doesn't actually start an sqlite transaction. However it exercises the same code and bug as the drop case. In this case the `ConnectionWorker` increments `transaction_depth`, but then an error is returned before the `Transaction` object is created. The transaction is never cleaned up (because that was the responsibility of the `Transaction` which was never constructed). The second call then fails because `ConnectionWorker` still believes that it is in a transaction. The second command is garbage because it is never run, `ConnectionWorker` returns an error before even getting to sqlite.

This could be modified to trigger the `Drop` bug. 
1. Update the commands to be `BEGIN IMMEDIATE`.
2. Instead of `await`-ing the first future poll it `N` times then drop it. Where `N` is enough times to start the transaction in sqlite (and/or `ConnectionWorker`) but not enough to complete the future (and build the `Transaction` object.

### SQLx version

0.8.5 (and code checked against current `main` branch)

### Enabled SQLx features

"sqlite", "chrono", "uuid", "runtime-tokio-rustls", "macros", "migrate"

### Database server and version

3.48.0 2025-01-14 11:05:00 d2fe6b05f38d9d7cd78c5d252e99ac59f1aea071d669830c1ffe4e8966e84010 (64-bit)

### Operating system

Linux

### Rust version

1.87.0

	async fn begin(conn: &mut SqliteConnection, statement: Option<SqlStr>) -> Result<(), Error> {
	let is_custom_statement = statement.is_some();
	conn.worker.begin(statement).await?;
	if is_custom_statement {
	// Check that custom statement actually put the connection into a transaction.
	let mut handle = conn.lock_handle().await?;
	if !handle.in_transaction() {
	return Err(Error::BeginFailed);
	}
	}
	Ok(())
	}

	DB::TransactionManager::begin(&mut conn, statement).await?;

	Ok(Self {
	connection: conn,
	open: true,
	})

	let res =
	conn.handle
	.exec(statement.as_str())
	.map(\|_\| {
	shared.transaction_depth.fetch_add(1, Ordering::Release);
	});

	// Check that custom statement actually put the connection into a transaction.
	let mut handle = conn.lock_handle().await?;
	if !handle.in_transaction() {
	return Err(Error::BeginFailed);
	}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sqlite transaction creation is not error or drop-safe. #3932

I have found these related issues/pull requests

Description

Drop Safety

Error Safety

Fixing

Reproduction steps

SQLx version

Enabled SQLx features

Database server and version

Operating system

Rust version

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

sqlite transaction creation is not error or drop-safe. #3932

Description

I have found these related issues/pull requests

Description

Drop Safety

Error Safety

Fixing

Reproduction steps

SQLx version

Enabled SQLx features

Database server and version

Operating system

Rust version

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions