IDBMirrorVFS: Fix database corruption caused by unfilled blocks #259

Merged 2 commits on Apr 11, 2025

Conversation

Spxg
Contributor

@Spxg Spxg commented Apr 5, 2025

See #258 (comment)

Steps to reproduce:

  1. Open https://rhashimoto.github.io/wa-sqlite/demo/?build=asyncify&config=IDBMirrorVFS
  2. Enter the following SQL statements:
PRAGMA page_size=65536;

CREATE TABLE IF NOT EXISTS large_blobs  (
    id INTEGER PRIMARY KEY,
    data BLOB
);

INSERT INTO large_blobs(data) VALUES (randomblob(100 * 1024 * 1024));
INSERT INTO large_blobs(data) VALUES (randomblob(100 * 1024 * 1024));
INSERT INTO large_blobs(data) VALUES (randomblob(100 * 1024 * 1024));
INSERT INTO large_blobs(data) VALUES (randomblob(100 * 1024 * 1024));
INSERT INTO large_blobs(data) VALUES (randomblob(100 * 1024 * 1024));
INSERT INTO large_blobs(data) VALUES (randomblob(100 * 1024 * 1024));
INSERT INTO large_blobs(data) VALUES (randomblob(100 * 1024 * 1024));
INSERT INTO large_blobs(data) VALUES (randomblob(100 * 1024 * 1024));
INSERT INTO large_blobs(data) VALUES (randomblob(100 * 1024 * 1024));
INSERT INTO large_blobs(data) VALUES (randomblob(100 * 1024 * 1024));
  3. After the run completes, refresh the page and run the statements again
  4. A "database disk image is malformed" error occurs

@gemini-code-assist gemini-code-assist bot left a comment

Hello @Spxg, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request addresses a database corruption issue in the IDBMirrorVFS, as described in issue #258. The problem arises when SQLite writes are not sequential, leading to unfilled blocks in the database file. This PR introduces a fix that fills these unwritten blocks with empty data to prevent the 'database disk image is malformed' error. The change modifies src/examples/IDBMirrorVFS.js to include a loop that iterates through the unwritten blocks and fills them with a new Uint8Array of the appropriate blockSize before writing the actual data.

Highlights

  • Bug Fix: Addresses a database corruption issue in IDBMirrorVFS caused by non-sequential writes.
  • Block Filling: Implements a mechanism to fill unwritten blocks with empty data to maintain database integrity.
  • Code Modification: Modifies src/examples/IDBMirrorVFS.js to include a loop that fills in the unwritten blocks.

Changelog

  • src/examples/IDBMirrorVFS.js
    • Introduces a loop to fill unwritten blocks with empty data before writing new data.
    • Calculates fileEnd and blockSize to determine the range of blocks to fill.
    • Updates file.txActive.fileSize to reflect the actual file size after filling the blocks.
    • Ensures that SQLite writes are handled correctly even when they are not sequential.
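
The gap-filling idea summarized above can be sketched roughly as follows. This is a simplified illustration, not the code in src/examples/IDBMirrorVFS.js: the names createTx and writeBlock are hypothetical, and the sketch assumes every write is exactly one block long and block-aligned.

```javascript
// Hypothetical, simplified model of a transaction's pending writes.
// `blocks` maps a byte offset to that block's contents; `fileSize` is
// the file length implied by the writes seen so far.
function createTx(fileSize = 0) {
  return { blocks: new Map(), fileSize };
}

// Record one block-sized write. Assumes `offset` and `tx.fileSize`
// are multiples of `data.byteLength`.
function writeBlock(tx, data, offset) {
  const blockSize = data.byteLength;

  // If the write lands past the current end of file, zero-fill every
  // block in between so that no offset below the new end is missing.
  for (let fill = tx.fileSize; fill < offset; fill += blockSize) {
    tx.blocks.set(fill, new Uint8Array(blockSize));
  }

  // Store a copy so later changes to the caller's buffer do not leak in.
  tx.blocks.set(offset, data.slice());
  tx.fileSize = Math.max(tx.fileSize, offset + blockSize);
}
```

In this model, writing a 4-byte block at offset 8 into an empty file leaves blocks at offsets 0, 4 and 8, with the first two zero-filled, so a later read of any offset below the file size finds a block.
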

A disk image astray,
Data's lost in the fray,
Fill the blocks tight,
Make the image right,
No more errors today.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request addresses a critical database corruption issue in IDBMirrorVFS caused by unfilled blocks. The provided code changes introduce a mechanism to fill in these unwritten blocks, preventing the "database disk image is malformed" error. The solution appears well-structured and addresses the problem effectively.

Summary of Findings

  • Unnecessary variable declaration: The blockSize variable is assigned the value of pData.byteLength and immediately used in the loop condition and block setting. It could be inlined directly into the loop and block setting for better readability.
  • Potential performance concern: The loop fills in unwritten blocks with new Uint8Array instances. For very large databases or frequent writes, this could potentially impact performance due to memory allocation. Consider reusing a single buffer or exploring alternative strategies if performance becomes an issue.
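
One way to read the buffer-reuse suggestion is sketched below. The helpers getZeroBlock and fillGap are hypothetical and are not part of the PR. The sketch is only safe under the assumption that stored blocks are treated as read-only, since every filled offset ends up pointing at the same array.

```javascript
// Cache of shared all-zero blocks, keyed by block size (hypothetical).
const zeroBlocks = new Map();

function getZeroBlock(blockSize) {
  let block = zeroBlocks.get(blockSize);
  if (!block) {
    block = new Uint8Array(blockSize); // zero-initialized by default
    zeroBlocks.set(blockSize, block);
  }
  return block;
}

// Point every missing block in [fileEnd, offset) at the shared zero
// block instead of allocating a fresh array per block.
// Returns the number of blocks filled.
function fillGap(blocks, fileEnd, offset, blockSize) {
  const zero = getZeroBlock(blockSize);
  let filled = 0;
  for (let fill = fileEnd; fill < offset; fill += blockSize) {
    blocks.set(fill, zero);
    filled++;
  }
  return filled;
}
```

The trade-off is aliasing: if any code mutates a stored block in place rather than replacing the map entry, all filled blocks change together, so allocating per block remains the simpler and safer default.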

Merge Readiness

The code changes effectively address the database corruption issue. While there are minor suggestions for improvement, the core functionality is sound. I recommend addressing the suggestions before merging to enhance code readability and potentially improve performance. I am unable to approve this pull request, and recommend that others review and approve this code before merging.

@Spxg Spxg marked this pull request as draft April 5, 2025 04:57
@Spxg Spxg marked this pull request as ready for review April 5, 2025 05:21
@rhashimoto
Owner

rhashimoto commented Apr 9, 2025

CI is failing but not because of the PR. Downloading some source code from sqlite.org is failing, probably because of their move to a new web server. I have reported it and hope they can restore it soon.

Update: Acknowledged as a web server problem and expected to be fixed in the next day or so. I plan to merge this once CI passes.

@sgbeal

sgbeal commented Apr 9, 2025

Update: Acknowledged as a web server problem and expected to be fixed in the next day or so. I plan to merge this once CI passes.

Part of the server migration entailed using a redirect to forward all of the www. names to their non-www.-prefixed counterparts. We got some reports that automation based on curl was failing for those because curl does not, by default, follow redirects. If the CI is using curl, either adding the -L flag to the call or removing www. from the URL "should" resolve it. A separate issue was discovered today that the servers weren't responding to ipv6-only traffic, and that was resolved a few hours ago.

@rhashimoto
Owner

If the CI is using curl, either adding the -L flag to the call or removing www. from the URL "should" resolve it.

@sgbeal I don't think so. Did you see Dr. Hipp's response?

I am using a www prefix but nothing is returned with or without it (you can try with your browser):

$ curl -v https://sqlite.org/contrib/download/extension-functions.c?get=25
* Host sqlite.org:443 was resolved.
* IPv6: (none)
* IPv4: 194.195.208.62
*   Trying 194.195.208.62:443...
* Connected to sqlite.org (194.195.208.62) port 443
* ALPN: curl offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* (304) (IN), TLS handshake, Server hello (2):
* (304) (IN), TLS handshake, Unknown (8):
* (304) (IN), TLS handshake, Certificate (11):
* (304) (IN), TLS handshake, CERT verify (15):
* (304) (IN), TLS handshake, Finished (20):
* (304) (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256 / [blank] / UNDEF
* ALPN: server did not agree on a protocol. Uses default.
* Server certificate:
*  subject: CN=a1.sqlite.org
*  start date: Apr  8 01:47:39 2025 GMT
*  expire date: Jul  7 01:47:38 2025 GMT
*  subjectAltName: host "sqlite.org" matched cert's "sqlite.org"
*  issuer: C=US; O=Let's Encrypt; CN=E5
*  SSL certificate verify ok.
* using HTTP/1.x
> GET /contrib/download/extension-functions.c?get=25 HTTP/1.1
> Host: sqlite.org
> User-Agent: curl/8.7.1
> Accept: */*
> 
* Request completely sent off
< HTTP/1.1 200 OK
< Connection: keep-alive
< Date: Wed, 09 Apr 2025 23:24:00 GMT
< Content-length: 0
< 
* Connection #0 to host sqlite.org left intact
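
The transcript above shows a 200 OK response with an empty body, which a status-only check would accept. A build script could guard against that case along these lines. This is a sketch only: assertUsableDownload is a hypothetical helper and is not part of the wa-sqlite build.

```javascript
// Hypothetical guard for a completed download.
// `status` is the HTTP status code and `body` is a Uint8Array of the
// response bytes. Throws if the download cannot be used.
function assertUsableDownload(url, status, body) {
  if (status < 200 || status >= 300) {
    throw new Error(`${url}: unexpected HTTP status ${status}`);
  }
  // A 2xx status with no bytes, as in the transcript above, is also a failure.
  if (body.byteLength === 0) {
    throw new Error(`${url}: empty response body`);
  }
  return body;
}
```

Failing fast on an empty body makes this kind of server-side problem show up as a clear download error instead of a confusing compile failure later in the build.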

@rhashimoto rhashimoto merged commit c087b71 into rhashimoto:master Apr 11, 2025
1 of 5 checks passed
4 participants