Skip to content

Fix ptrmap data corruption with pre-initialize autovacuum database#3894

Closed
ddwalias wants to merge 5 commits into
tursodatabase:mainfrom
ddwalias:ptrmap-data-corruption
Closed

Fix ptrmap data corruption with pre-initialize autovacuum database#3894
ddwalias wants to merge 5 commits into
tursodatabase:mainfrom
ddwalias:ptrmap-data-corruption

Conversation

@ddwalias
Copy link
Copy Markdown
Contributor

@ddwalias ddwalias commented Nov 1, 2025

Reproduce:

  1. Use SQLite to initialize an auto-vacuum DB (to produce a live pointer-map page at page 2):

sqlite3 /tmp/av_corruption.db "PRAGMA auto_vacuum=FULL; VACUUM; IF NOT EXISTS __sim_autovacuum_seed__(id INTEGER PRIMARY KEY, value TEXT); INSERT OR IGNORE INTO __sim_autovacuum_seed__ VALUES (1,'seed');"

  1. Let Turso mutate it:

tursodb /tmp/av_corruption.db "CREATE TABLE t(a);" \ for i in $(seq 1 2000); do target/debug/tursodb /tmp/av_corruption.db "INSERT INTO t VALUES ($i);"; done

  1. Back in SQLite, run PRAGMA integrity_check; -> it reports Tree … Failed to read ptrmap key=….. However, Turso’s own PRAGMA integrity_check; still prints ok, so the engine doesn’t notice the damage.

Why the simulator misses it:

The deterministic runs always start from Turso-built databases, and never import a file produced by SQLite, so this path is never exercised.

Reference: #3895

@ddwalias
Copy link
Copy Markdown
Contributor Author

ddwalias commented Nov 1, 2025

The cause seem to be when Turso allocates leaf or overflow pages, it never writes the corresponding pointer‑map entries.

I attempted to fix this issue with this commit, but seems like I introduced a new bug in the process. For this PR, I will focus only on updating the simulator. The fix will be on another PR.

@penberg
Copy link
Copy Markdown
Collaborator

penberg commented Nov 2, 2025

Hey @ddwalias! Good catch! I am disabling autovacuum by default in #3899. Let's have other people review this PR too, but assuming the improvement is good, this is eligible for the data corruption challenge reward, although functionality is disabled now.

@ddwalias ddwalias force-pushed the ptrmap-data-corruption branch 7 times, most recently from ed4a35e to 036d820 Compare November 20, 2025 20:53
@ddwalias ddwalias changed the title simulator: catch ptrmap data corruption with pre-initialize autovacuum database Fix ptrmap data corruption with pre-initialize autovacuum database Nov 20, 2025
@ddwalias ddwalias force-pushed the ptrmap-data-corruption branch from 036d820 to 20c75be Compare November 20, 2025 21:30
@ddwalias
Copy link
Copy Markdown
Contributor Author

I managed to make the integration test passes, however, the simulator is still failing, likely from another issue

@ddwalias ddwalias force-pushed the ptrmap-data-corruption branch from 29bd984 to 20c75be Compare November 23, 2025 15:06
@penberg
Copy link
Copy Markdown
Collaborator

penberg commented Dec 24, 2025

This pull request has been marked as stale due to inactivity. It will be closed in 7 days if no further activity occurs.

@penberg penberg added the Stale label Dec 24, 2025
@penberg
Copy link
Copy Markdown
Collaborator

penberg commented Dec 31, 2025

This pull request has been closed due to inactivity. Please feel free to reopen it if you have further updates.

@penberg penberg closed this Dec 31, 2025
@LeMikaelF LeMikaelF reopened this Jan 1, 2026
Copy link
Copy Markdown

@turso-bot turso-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please review @jussisaurio

@LeMikaelF
Copy link
Copy Markdown
Collaborator

Similar to #3830

@LeMikaelF
Copy link
Copy Markdown
Collaborator

@ddwalias we've decided to award you the 800$ Turso Challenge reward, since you did patch the simulator to demonstrate
data corruption.

However, after careful consideration, we chose to go a different route for fixing this bug. Here is our preferred
solution:

  1. Patch Turso to write a 1-page db to disk when the autovacuum mode is
    changed (Write the DB file to disk when the header is modified #4459). This mimics SQLite's behaviour.
  2. Rely on the simulator's existing pragma auto_vacuum support to reproduce the bug.
  3. Open a DB in read-only mode if it's auto-vacuumed (Open database in readonly mode if it is an autovacuum database and the experimental autovacuum feature is not enabled #4457). This should
    have been the current behaviour, since the intent of the --experimental-autovacuum flag was to hide a feature that
    wasn't production ready.

You're more than welcome to help us implement these fixes.

As for the ptrmap fix in the b-tree, it's definitely something we want to implement, see the tracking
issue #2290. Unfortunately, it looks like there's some merge conflicts that
need to be addressed. Apologies for the delay in responding to this PR. You're welcome to open a separate PR with these
changes. If you do so, they should also be tested more extensively. I suggest using a fuzz test. There's already a few
in the code base that you could use as a model.

@LeMikaelF LeMikaelF closed this Jan 5, 2026
@glommer
Copy link
Copy Markdown
Contributor

glommer commented Jan 5, 2026

/tip $800 @ddwalias

@algora-pbc
Copy link
Copy Markdown

algora-pbc Bot commented Jan 5, 2026

Please visit Algora to complete your tip via Stripe.

@algora-pbc
Copy link
Copy Markdown

algora-pbc Bot commented Jan 5, 2026

🎉🎈 @ddwalias has been awarded $800 by Turso Database! 🎈🎊

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants