-
Notifications
You must be signed in to change notification settings - Fork 328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Let virtual table implementations know which SQL string is being executed #87
Conversation
This commit adds a new method named xPrepareSql to sqlite3_module and an associated opcode VPrepareSql. That opcode is emitted immediately before a VFilter opcode when querying virtual tables. xPrepareSql takes two arguments: a pointer to the virtual table's cursor and, missing from existing sqlite3_module methods, the SQL string being executed by the application. Prior to the introduction of this method, a virtual table that mirrored a remote table had no way to tell which columns of the remote server were requested by the application. Therefore, such an implementation would typically `SELECT *` from the remote table and let callbacks such as xColumn() and xNext() decide which data to send back to the application. The problem with this approach is that queries that SELECT just a subset of the remote columns trigger the transfer of data over the network even though they won't be used. xPrepareSql gives an opportunity for the virtual table implementation to inspect the query string and selectively choose which columns from the remote server to pull and cache locally. Because this change introduces a new method to sqlite3_module, new modules wishing to implement a callback for xPrepareSql must declare a module version (iVersion) 4 or above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Left a few minor comments.
src/sqlite.h.in
Outdated
@@ -7076,6 +7076,8 @@ struct sqlite3_module { | |||
/* The methods above are in versions 1 and 2 of the sqlite_module object. | |||
** Those below are for version 3 and greater. */ | |||
int (*xShadowName)(const char*); | |||
/** The methods below are for version 4 and greater. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's going to get ambiguous if/when SQLite adds some other field and bump the current version to 4, and we'd like to keep libSQL compatible for as long as possible. There's no perfect solution here, but in an (unmerged yet) pull request I used version "700", to indicate that it's a libSQL extension, and it makes the collision less likely to happen in the next few years. Perhaps the same trick could be applied here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. HDF5 adopts a similar approach when dealing with I/O filters: a certain range of numbers is reserved for filters contributed by the community. I will follow your suggestion and will check for iVersion >= 700
.
src/vdbe.c
Outdated
} | ||
} | ||
|
||
if( rc ) goto abort_due_to_error; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this line looks redundant - it's already checked in line 8101, and line 8100 is the only one in this block that assigns to rc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, indeed. Thanks for spotting that!
src/vdbe.c
Outdated
if( pModule->iVersion>=4 ){ | ||
if( pModule->xPrepareSql && p->zSql ){ | ||
rc = pModule->xPrepareSql(pVCur, p->zSql); | ||
if( rc ) goto abort_due_to_error; // TODO set and test error msg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you elaborate on the TODO? As long as the implementation of xPrepareSql assigns a proper error code, it will be properly handled and printed in the code that follows abort_due_to_error
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was a reminder to test the error path, but I ended up leaving it behind by mistake. I will remove it.
src/vdbe.c
Outdated
|
||
/* Invoke the xPrepareSql method */ | ||
if( pModule->iVersion>=4 ){ | ||
if( pModule->xPrepareSql && p->zSql ){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question - in which cases is p->zSql
null? I wonder if the responsibility for checking and handling NULL shouldn't be pushed to the implementation of xPrepareSql
- perhaps somebody would like to do something specific even for cursors that don't have the SQL text?
This is a genuine question, as I haven't implemented any virtual SQLite tables in my life (: if it doesn't make sense, then feel free to leave the code as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added that as a sanity check. A NULL value for p->zsql
would be possible when handling the Init
or Trace
opcodes, but that is not expected when processing a valid SQL statement.
@lucasvr just to prevent this PR from going stale - would you like us to take over and apply the fixes or would you prefer working on v2 yourself? |
Apologies for the late response! Let me work on v2 - I will send you an updated PR in the next few hours. Also, I appreciate your review and comments -- thanks for that! |
This commit also removes a comment that was left over on the previous changeset.
Following @psarna's advice, use iVersion >= 700 for community-contributed methods.
This name change is motivated to prevent misunderstandings: the callback is not meant to output a prepared SQL statement. Rather, the callback is being informed of the SQL statement prepared by SQLite.
@psarna I've added 2 commits on top of the original one that address your comments. I have also introduced a name change: the method is now called |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Looks good to me. @penberg could you take a cursory look just to have a second pair of eyes on it? And then we can go ahead and merge
For reference, the current SQLite vtab implementation does already give information about how the table is being queried, in the Specifically, the It's not the friendliest API, but I find dealing with those fields much easier than parsing raw SQL. It also doesn't handle |
@asg017 |
Enhance documentation by describing the new xPreparedSql callback for virtual tables. Refs tursodatabase#87
Enhance documentation by describing the new xPreparedSql callback for virtual tables. Refs #87
Enhance documentation by describing the new xPreparedSql callback for virtual tables. Refs tursodatabase#87
Fix data row serialization for the PostgreSQL wire protocol: - Send column values as part of a data row instead a data row per column. - Fix column name and type information in row description. Refs #87
87: box parser r=psarna a=MarinPostma This pr puts the parser on the heap. It's quite chunky, and caused stackoverflows in the past. I'm working to get a nicer patch to upstream lemon-rs, but it requires more work for somilar effects. I also remove the dependency on my patch Co-authored-by: ad hoc <postma.marin@protonmail.com>
SQLite provides an interface to expose different data sources (such as files or other database systems) as virtual tables. There are several callbacks one must implement to enable e.g., connection handling and pagination, but one key attribute is missing on all callbacks: the query string being executed by the application.
The query string is especially relevant when virtual tables are used to mirror data from external databases hosted on different machines. A virtual table implementation that knows about the SQL string is able to reduce the amount of data transferred over the network when an application is only concerned about a subset of the columns.
This patch adds a new opcode
VPrepareSql
that's called immediately beforeVFilter
when dispatching queries to a virtual table. The new opcode triggers the execution of the newxPrepareSql
method which provides the SQL text to the virtual table implementation.I am open to suggestions regarding the implementation approach and the new method's name. I'm looking forward to reading your feedback. Thanks!