-
-
Notifications
You must be signed in to change notification settings - Fork 700
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Datasette with many and large databases > Memory use #1880
Comments
Follow on question - is all memory use @simonw - for both datasette and SQLlite confined to the "query time" itself i.e. the memory use is relevant only to a particular transaction or query - and then subsequently released? |
I think you may have misunderstood this feature. This is talking about the They're not a copy of the data itself - just a list of table names, column names and database names. You can see what that database looks like by signing in as root - running
For the example instance that looks like this: The two most interesting tables in there are these ones: As you can see, it's just the table schema itself and the columns that make up the tables. Even if you have hundreds of databases connected each with hundreds of tables this should still only add up to a few MB of RAM. |
The So I want to be able to paginate (and search) those. But to paginate them it's useful to have them in a database table itself, since then I can paginate using SQL. My plan for |
I appreciate your response @simonw - thanks! I'll clarify what we need further - let's imagine we have 2000 SQLLite databases (for 2000 tenants), but we only want to run one datasette instance for each of those tenants to query/use datasette against their own database only. This means the "connection" between datasette and the SQLLite database would be dynamic, based on the tenantID that's required on an incoming request. Is there any specific config or other considerations in this use case, to minimize memory use on a single, efficient VM and serve queries to all these tenants? cc @MuAdham |
The above is from the docs ^. There's two problems here - the number of datasette "instances" in a single server/VM and the size of the database itself. We want the opposite of in-memory, including what happens on SQLlite - documented in https://www.sqlite.org/inmemorydb.html
From the context in #1150 - does it mean datasette is memory-bound to the size of the dataset - which might be a deal-breaker for many large-scale use cases?
In an extreme case - let's say a single server had 100 SQLlite databases, which would enable 100 "instances" of datasette to run, one per client (e.g. in a SaaS multi-tenant environment). How could we achieve all these goals:
Any ideas appreciated - we're looking to use this in a SaaS type of setting - many instances, single server.
@simonw great work on datasette, in general! Possibly related to #1480 but we don't want use any kind of serverless infra - this is a long-running VM/server.
The text was updated successfully, but these errors were encountered: