-
Notifications
You must be signed in to change notification settings - Fork 672
feat(api): implement upsert()
using MERGE INTO
#11624
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
5a27b34
to
21994eb
Compare
@cpcloud Requesting review for initial feedback while I try to improve backend support; I assume I could always mark a lot of them |
ibis/backends/oracle/__init__.py
Outdated
# TODO(deepyaman): Fix Oracle MERGE query generation in SQLGlot. | ||
if "MERGE" in query: | ||
assert "AS " in query | ||
query = query.replace("AS ", "") | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is obviously unacceptable, but will wait on tobymao/sqlglot#5910 to try and figure out what's the best way to work arounds this (unless somebody else has ideas).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed here tobymao/sqlglot#5911, will release soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@georgesittas Tested locally with SQLGlot 27.16.3, and removing this hack works perfectly. Thank you!!
(So impressed by the 0-issue inbox on SQLGlot BTW.)
ibis/backends/sql/__init__.py
Outdated
expressions=[ | ||
sg.column( | ||
col, | ||
table=target_alias if self.name == "oracle" else None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These if
statements can before backend-specific variables, something like https://github.com/ibis-project/ibis/blob/main/ibis/backends/flink/__init__.py#L65-L66? I wanted to try and see what the prevailing pattern is before finalizing the default.
46b0e4a
to
8e47b79
Compare
query = query.sql(self.dialect) | ||
|
||
if "MERGE" in query: | ||
query = f"{query};" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm honestly not sure how to better do this; I wasn't able to figure out how to add a semicolon to an expression in SQLGlot.
2e1403d
to
b40103e
Compare
25061ad
to
d052b43
Compare
0ea427e
to
9ae5426
Compare
ea32ee2
to
d2b5922
Compare
d2b5922
to
8b0d6fe
Compare
@cpcloud I'm temporarily cherry-picked the changes from #11636 and the updates to With that, all of the functionality to implement |
Description of changes
Implement
Backend.upsert()
usingsqlglot.expressions.merge()
under the hood. Upsert support is very important, especially for data engineering use cases.Starting with a most basic implementation, including only supporting one join column. I think this could be expanded to support a list without much effort.
MERGE INTO
support is limited. DuckDB only added support forMERGE
statements earlier today in 1.4.0, and many other backends don't support it. However, it seems like the more standard/correct approach for supporting upserts, and it doesn't require merge keys defined ahead of time on tables.Backends that work:
(currently using a hack to work around "AS" getting added to MERGE statement);
onto the end ofeverystatement 😅)Should work, need help to test:
Backends that don't work:
MERGE
statement is only supported for Iceberg tables.")MERGE INTO
is transactional and is supported only for Apache Iceberg tables in Athena engine version 3.")Issues closed
Backend.upsert
#5391