Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error: database disk image is malformed #143

Closed
linhttbk97 opened this issue Jan 15, 2024 · 10 comments
Closed

Error: database disk image is malformed #143

linhttbk97 opened this issue Jan 15, 2024 · 10 comments
Labels
bug Something isn't working

Comments

@linhttbk97
Copy link

I'm getting error database disk image is malformed when setting cache_size. When I removed cache_size config, the error is gone
Logs: executeQuery result for SELECT id FROM tyCOmBsBQlXDRsoxskoM WHERE _status =? is empty Error: database disk image is malformed at check (sqlite-api.js:1156:11) at Object.eval [as step] (sqlite-api.js:983:14)
VFS: IDBBatchAtomicVFS

here is my code for database initialization

    const module = await SQLESMFactory();
    const sqlite3: SQLiteAPI = SQLite.Factory(module);
    let db: number = -1
    const vfs = new IDBBatchAtomicVFS('starion-db')
    await vfs.isReady;
    await sqlite3.vfs_register(vfs, true);
    db = await sqlite3.open_v2(fileName);
    await sqlite3.exec(db, `PRAGMA cache_size=${32 * 1024};`)
    await sqlite3.exec(db, `PRAGMA page_size=8192;`)
    sqlite3.exec(db, `PRAGMA journal_mode=MEMORY;`)
    await sqlite3.exec(db, `PRAGMA temp_store=MEMORY;`);
@rhashimoto rhashimoto added the not actionable Not suited for Issues label Jan 15, 2024
@rhashimoto
Copy link
Owner

It is extremely unlikely that your database corruption has anything to do with setting the cache size, as that doesn't change anything in the database. I suspect something else previously caused the corruption, such as using journal_mode=MEMORY:

The MEMORY journaling mode stores the rollback journal in volatile RAM. This saves disk I/O but at the expense of database safety and integrity. If the application using SQLite crashes in the middle of a transaction when the MEMORY journaling mode is set, then the database file will very likely go corrupt.

I know that other GitHub projects use Issues as a help desk, but in this project it is meant only for possible bugs and green-lit features in project software. Some conversations not suited for Issues can instead go in Discussions, but please don't use that as a debugging service either.

@rhashimoto rhashimoto closed this as not planned Won't fix, can't repro, duplicate, stale Jan 15, 2024
@linhttbk97
Copy link
Author

I run the demo/file example and import my existing database file. Without PRAGMA cache_size=10000; database is okay. But when cache_size is set, database disk image is malformed. I try to set cache_size = 1000 then it's worked.

@rhashimoto
Copy link
Owner

rhashimoto commented Jan 16, 2024

Can't reproduce, either with demo/file or with the main demo using:

PRAGMA cache_size=10000;
CREATE TABLE IF NOT EXISTS t AS
WITH RECURSIVE
  cnt(x) AS (VALUES(1) UNION ALL SELECT x+1 FROM cnt WHERE x<1000000)
SELECT x FROM cnt;

PRAGMA integrity_check;

@linhttbk97
Copy link
Author

it's only happened when using import database

@AntonOfTheWoods
Copy link

Actually I'm 95% certain I had that too. As I was creating the DBs also (on the server in python), I stopped doing it at import and did it at creation time, thus fixing my problem, but I definitely had several issues trying to do pragma stuff after importing.

@rhashimoto
Copy link
Owner

rhashimoto commented Jan 17, 2024

@AntonOfTheWoods You saw this after resolving the problems with importing that we determined to cause database file corruption? I might believe that changing the cache size could change how and when an already corrupt database is reported, but I don't see how it would produce corruption.

I have already tried messing with the cache size for the verification phase of the demo/file importing example, and was not able to trigger any errors. Without a reproducible test case or a plausible theory, this doesn't meet the actionable bar for me.

@linhttbk97
Copy link
Author

@rhashimoto give my sample database file a shot with cache_size 10000 demo.db

@AntonOfTheWoods
Copy link

@AntonOfTheWoods You saw this after resolving the problems with importing that we determined to cause database file corruption? I might believe that changing the cache size could change how and when an already corrupt database is reported, but I don't see how it would produce corruption.

I had corruption errors at various points, but had already decided to try and do as little as possible client side. Because I didn't need to do it client side and had already spent 10s of hours trying to get things working, I didn't follow up once I had something working. Again, I'm 95% sure I did see this at some stage though.

Obviously anyone who wants to get help needs to provide a reliable test case, in the form the project supports. I was working with large databases and it took hours to get anything together, given I was using a complicated typescript setup and I had no idea whether that was relevant. The project doesn't seem to currently provide a set of supported, high-level interfaces/abstractions, it's up to us to put something together from the many examples. That's absolutely fine but can make putting together a repro case a reasonable investment in terms of time.

And please know that I am extremely grateful for the wonderful work you put in on this project!

@rhashimoto rhashimoto added bug Something isn't working and removed not actionable Not suited for Issues labels Jan 18, 2024
@rhashimoto
Copy link
Owner

My apologies @linhttbk97. I can reproduce the error with your database by injecting await sqlite3.exec(db, 'PRAGMA cache_size=10000;'); into the file demo verifier above the integrity_check.

After failing, the database can be exported and matches the original so it doesn't seem to be an import error (or if it is it must be something that the export process "fixes"). It happens down to cache_size=3732 for me and the database size is 4847 pages, so failures are seen both above and below the database size. I have no idea how this could happen. Will investigate further.

@rhashimoto
Copy link
Owner

The problem is reported as database corruption, but the persistent data in IndexedDB is not damaged. The issue is that certain IDBBatchAtomicVFS.xRead() calls are not returning data. The reason is that xRead() is asynchronous and uses the Asyncify magic to unwind and rewind the stack. Unfortunately, that magic can trigger growing the WebAssembly memory space, and when this happens it invalidates the buffers that xRead() is supposed to fill.

This doesn't happen all the time; it depends on the WebAssembly memory state. Setting the SQLite cache size high with this particular database apparently creates the conditions for the bug to manifest.

I'll look into an internal fix, but a workaround is to pre-grow the WebAssembly memory space so Asyncify doesn't ever need to. You can do that using the SQLite module (the awaited result of the function imported from the Emscripten .mjs). For example this will allocate and deallocate enough space for 10000 pages of cache plus a little extra temporary space:

module._free(module._malloc(10000 * 4096 + 65536));

Inserting this into the demo/file verifier before invoking any SQL runs without an error.

@rhashimoto rhashimoto reopened this Jan 18, 2024
rhashimoto added a commit that referenced this issue Jan 18, 2024
Fix #143. Handle detached buffers in IDBBatchAtomicVFS.
rhashimoto added a commit that referenced this issue Jan 19, 2024
Fix #143 for Safari (no ArrayBuffer.prototype.detached)
stevensJourney added a commit to powersync-ja/wa-sqlite that referenced this issue Feb 1, 2024
* Work around stack overflow in debug build.

* Handle 64-bit arguments with Emscripten legalization.

* Export all SQLite3 API functions, plus Emscripten utilities.

* Update issue templates

* Update follow-redirects per Dependabot.

* Fix rhashimoto#143. Handle detached buffers in IDBBatchAtomicVFS.

* Bump package version.

* Fix rhashimoto#143 for Safari (no ArrayBuffer.prototype.detached).

* Bump package version.

* updated powersync-sqlite-core to v0.1.6

* updated from upstream

* publish packages on workflow dispatch

---------

Co-authored-by: Roy Hashimoto <roy@shoestringresearch.com>
Co-authored-by: Roy Hashimoto <156154+rhashimoto@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants