Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Syncstorage inital sqlite support #1520

Draft
wants to merge 56 commits into
base: master
Choose a base branch
from

Conversation

Eragonfr
Copy link
Contributor

@Eragonfr Eragonfr commented Feb 11, 2024

Description

Re-implementation of #935 on the newer codebase.
Thanks for @mqus for the first implementation.

Issue(s)

Partially fixes #498

Todo

  • Code Cleanup
  • Commit history cleanup (I'm not really working in a clean way)
  • Add Sqlite clippy checks to CircleCI
  • docker container (Thanks @bcspragu for the help)
  • e2e tests image (Thanks @bcspragu for the help)
  • pass the e2e tests

Current state is that it runs on my computer, but it's probably not clean at all, there's still a lot of duplicate code, and some SQL queries are still in MySQL-flavored SQL but could be used by the SQLite backend.

@Eragonfr
Copy link
Contributor Author

To pass the checks I have the choice between upgrading diesel or ignoring the issue with libsqlite-sys (which tbh is probably a non-issue because we should not be using SQLite printf anyway).

@Eragonfr Eragonfr force-pushed the syncstorage-sqlite branch 2 times, most recently from 1047197 to 9d83956 Compare March 10, 2024 21:36
@Eragonfr Eragonfr force-pushed the syncstorage-sqlite branch from 6927869 to 62774e3 Compare April 6, 2024 18:54
@Eragonfr Eragonfr force-pushed the syncstorage-sqlite branch from 62774e3 to 5ea3b39 Compare May 1, 2024 09:52
@Eragonfr
Copy link
Contributor Author

Eragonfr commented May 9, 2024

I choose to ignore the libsqlite3-sys CVE because it's not impacting this project. That should let me run the other tests in the CI.

clippy_sqlite:
# Matches what's run in circleci
cargo clippy --workspace --all-targets --no-default-features --features=syncstorage-db/sqlite,tokenserver-db/sqlite -- -D warnings

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if you saw #1513, but one of the things we're trying to do is to pull more of tokenserver into syncstorage-rs.

(Weird history lesson: Tokenserver was supposed to support more than just sync, but that never really happened. Having it distinct turned out to be a headache for a lot of reasons, so we're folding as much of Tokenserver as we can into the main syncstorage)

As a side note, this also means that the tokenserver storage stuff can probably live along-side the syncstorage data fairly easily. Possibly as just a bunch of other tables.

It's a bit more complicated for us since we have yet to migrate tokenserver over to spanner, but that's on our internal roadmap. When we get to doing that, however, is unknown.

We also have a few crate refactoring tickets open. I'll update those to indicate that we should really put the data handlers for both syncstorage and tokenserver into their own distinct crate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not saw that PR.

You want to remove tokenserver completely. Is that right ?

I don't understand what it means for my PR.
Should I change how I handle tokenserver ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe that it impacts much of your PR, since you're only focused on the Storage part. There may be a few opportunities for folk running their own "Stand Alone" version to have just one database running instead of two, and depending on how we decide to do this for Spanner, there may be some additional functions we introduce to the database layer. We'll try to isolate and highlight those so that we don't cause more work.

syncserver-db-common/src/test.rs Outdated Show resolved Hide resolved
.cargo/audit.toml Outdated Show resolved Hide resolved
@bcspragu
Copy link

bcspragu commented Aug 8, 2024

I'm really jazzed about this PR, thanks for all the work on it! I noticed this in the PR desc:

  • docker container (Help needed)

If this is a question of building a Docker container from this codebase that bundles all the SQLite deps and nothing more, I'm happy to take on that work.

@Eragonfr
Copy link
Contributor Author

Eragonfr commented Aug 9, 2024

I'm really jazzed about this PR, thanks for all the work on it! I noticed this in the PR desc:

  • docker container (Help needed)

If this is a question of building a Docker container from this codebase that bundles all the SQLite deps and nothing more, I'm happy to take on that work.

Something like that yes. There might be some docker-compose configuration files to make too.
I gave multiple tries to Docker but I never seems to get a "standard" docker environment. I always get something with some broken networking, nuked firewall or something else. At this point I've given up on that technology. (docker should release VMs that I can start and get a "standard" env with networking and stuff)

That would leave me to clean up my code

@bcspragu
Copy link

bcspragu commented Aug 9, 2024

Sounds good, I'll give it a shot!

@bcspragu
Copy link

bcspragu commented Aug 9, 2024

Okay, I think I've got something approximately working in Eragonfr/syncstorage-rs@syncstorage-sqlite...bcspragu:syncstorage-rs:syncstorage-sqlite

E.g.

docker build -t mozsync-sqlite --build-arg DATABASE_BACKEND=sqlite .

# The --userns line is a rootless Podman thing, can be ignored mostly
docker run --rm -it \
    --userns=keep-id:uid=10001,gid=10001 \
    -v $PWD/temp-data-do-not-submit:/data \
    -e "SYNC_HOST=0.0.0.0" \
    -e "SYNC_MASTER_SECRET=secret0" \
    -e "SYNC_SYNCSTORAGE__DATABASE_URL=sqlite:///data/syncdb.sqlite" \
    -e "SYNC_TOKENSERVER__DATABASE_URL=sqlite:///data/tokenserverdb.sqlite" \
    -e "SYNC_TOKENSERVER__RUN_MIGRATIONS=true" \
    mozsync-sqlite

Seems to start up successfully, but I'm not brave enough to connect it to my real Firefox sync yet

@Eragonfr
Copy link
Contributor Author

Thanks @bcspragu !

@bcspragu
Copy link

bcspragu commented Aug 12, 2024

For sure! Also, looking at the current ci/circleci: build-and-test failure, that's a result of missing --features flags here:

command: cargo build

Not sure what the team would prefer here, but there's two options:

  1. If you're just looking for a sanity check, test against MySQL (e.g --features=syncstorage-db/mysql,tokenserver-db/mysql)
  2. Run separate builds for each DB backend
  3. [Actually, this is probably what we want] Fix the default flags to use MySQL correctly

Unrelated, one thing I forgot to mention is that regardless of which features are enabled, MySQL client libs are required on the host (e.g. there are places where it isn't gated behind the feature flag, it gets linked in regardless). I started adding the relevant #cfg(feature(...)) in places where I noticed it, but again, not sure what the team would prefer there.

@Eragonfr
Copy link
Contributor Author

I thought of dropping the default flags because the user should choose their storage backend instead of relying on a default one.


We want to have the proper #cfg(feature(...)) when it's needed without breaking the spanner backend. (It uses MySQL for the token server database)

@Eragonfr
Copy link
Contributor Author

Eragonfr commented Sep 8, 2024

The ci fails on sqlite e2e tests. I think it is because the /data/ folder is missing, which means that the database can't be created and that makes the ci to fail.
I need some help to figure out how to either create this folder or to not need it in the container.

The checks are unrelated and are probably failing on the master branch too.

@Eragonfr
Copy link
Contributor Author

Eragonfr commented Sep 22, 2024

🤦‍♀️ I'm dumb the e2e tests can't work because they require a connection to the database from the python script and syncserver at the same time. Which is not possible with sqlite.
There was no need for my last height commits.
But the e2e tests are not usable with sqlite.

@Eragonfr Eragonfr force-pushed the syncstorage-sqlite branch 3 times, most recently from 7ca1578 to 1c08e86 Compare September 22, 2024 23:08
@JulianWgs
Copy link

Hi @Eragonfr , thanks for your hard work! I am trying to get this to run locally, but I can't execute this query against the token database, because the database file is empty. Am I doing something wrong? Could you try running the server with a missing token database and check if the server correctly sets it up? Thanks again! I am really looking forward to this!

@Eragonfr
Copy link
Contributor Author

Eragonfr commented Oct 13, 2024

@JulianWgs I also have the issue. I think that I broke the auto migration when doing some refactor.
The tokenserver part is giving me some trouble to get it to use SQLite queries, it doesn't pass its unit tests and there is quite a lot of its queries that doesn't run.

EDIT: Can't reproduce that bug anymore… wtf

@Eragonfr Eragonfr force-pushed the syncstorage-sqlite branch 4 times, most recently from ff5be44 to a8dda23 Compare October 19, 2024 23:16
@Eragonfr
Copy link
Contributor Author

Eragonfr commented Dec 17, 2024

If anyone has an idea on how to get the e2e tests to run, I'll take it.
What is needed is for syncstorage to close the sqlite database, the test script can then open the database.
The test script check the values in the database then closes the file and query syncstorage for the next test.
Syncstorage can then re-open the database for the transaction.
Rince and repeat.

But I can't figure out how to force the Rust SQLite driver (diesel with r2d2) and the Python driver (sqlalchemy) to close the database after each transaction.

One option would be to stop syncstorage after each transaction and restart it after the check.
But it wouldn't fix the issue with the tests keeping the database open.
And it adds a new issue that stopping and launching the server is quite heavy and slow. (And I don't think that it would play nicely with the test containers)

That's why there is currently no new progress. I just don't understand enough of both database driver to force them to do something that they are not designed to do. (I mean closing your connection is horrible in term of efficiency, a good driver should never do that after each transaction)

@Eragonfr
Copy link
Contributor Author

Eragonfr commented Dec 18, 2024

I tried, it only caused the MySQL e2e test to fail c8f21bd

@Eragonfr
Copy link
Contributor Author

Eragonfr commented Jan 2, 2025

At this point ffd1bd9 it's certain that it's python that holds the lock on the database but I really don't know how I can force it to close the file

@Eragonfr
Copy link
Contributor Author

Eragonfr commented Jan 3, 2025

I found a way for python to not pool anything and not keep the lock on the file.
But now it's syncstorage that hold the lock for too long

One idea for syncstorage is that I have to implement a fake DbPool for SQLite e2e tests. This pool would open a new connection when a connection is requested from it. And close the connection as it is returned.
That should (hopefully) fix the issue with r2rd and diesel that the default diesel pool implementation doesn't guarantee that the connection is closed when released (as usually you want to reuse the connection later)

EDIT: It seems far easier for me to add PostgreSQL as a backend and pass e2e tests immediately than to fix the mess with SQLite
EDIT2: Also it seems that I'll have to squash some commits to clean the history from all of my tests. I found how to run the e2e tests locally, I won't have to commit and push for the test to run now.

@Eragonfr Eragonfr force-pushed the syncstorage-sqlite branch from a2a033f to 45f5c82 Compare January 3, 2025 18:27
@Eragonfr Eragonfr force-pushed the syncstorage-sqlite branch from 45f5c82 to ed0e727 Compare January 3, 2025 19:05
@kyz
Copy link

kyz commented Jan 14, 2025

But I can't figure out how to force the Rust SQLite driver (diesel with r2d2) and the Python driver (sqlalchemy) to close the database after each transaction.

I'm just a passer-by offering my opinion, and I haven't tested this, but hopefully it helps.

SQLite supports a busy handler and WAL mode to allow multiple writers and readers, across multiple processes, to use the same database without blocking each other. It's still only allowing one write transaction at a time, but processes automatically coordinate with each other through a shared memory file as to who has the lock.

Here is a good article on StackOverflow on enabling WAL with Diesel / r2d2, including some sample code for inspiration.

I imagine you would only want to enable this for tests where you want this concurrent access.

I haven't used SQLAlchemy, but the Python tests should use the same SQLite pragmas as Rust to access the database file, e.g. conn.execute("PRAGMA journal_mode = WAL; PRAGMA synchronous = NORMAL; PRAGMA busy_timeout = 10000;")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support sqlite/postgresql backend
6 participants