Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support run multiple sql statements from file in DBApiHook #16979

Closed
flolas opened this issue Jul 13, 2021 · 8 comments
Closed

Support run multiple sql statements from file in DBApiHook #16979

flolas opened this issue Jul 13, 2021 · 8 comments
Assignees
Labels
kind:feature Feature Requests

Comments

@flolas
Copy link
Contributor

flolas commented Jul 13, 2021

Description

Support run multiple sql statements from file in JdbcOperator DBApiHook, when the provided file has multiple statements (we can check it with https://github.com/andialbrecht/sqlparse)

Use case / motivation

Running single sql file with multiple sentences with JdbcOperator or any other DBApi operator (uses run method from DbApiHook, maybe we can implement here or in JdbcOperator):

Are you willing to submit a PR?
yes

@flolas flolas added the kind:feature Feature Requests label Jul 13, 2021
@potiuk
Copy link
Member

potiuk commented Jul 14, 2021

One comment - this should be definitely done on DBAPI hook, not in JdbcOperator. This way all DB operators will be able to use it. Assigning you to it :)

@potiuk potiuk changed the title Support run multiple sql statements from file in JdbcOperator Support run multiple sql statements from file in DBApiHook Jul 14, 2021
@grahovam
Copy link

I am already using the PostgresOperator and MySqlHook to execute SQL queries that contain multiple queries seperated by semicolons. And both the operator and the hook use DbApiHook for doing so. Am I missing something here? Thank you for any clarification.

@eladkal
Copy link
Contributor

eladkal commented Jul 14, 2021

I am already using the PostgresOperator and MySqlHook to execute SQL queries that contain multiple queries seperated by semicolons. And both the operator and the hook use DbApiHook for doing so. Am I missing something here? Thank you for any clarification.

Assuming you have a SQL script:
Select 1; Select 2; Both Postgres and MySQL are able to execute this script. However you will not be able to execute this script in Snowflake, Presto nor Trino. This is why for these databases we first parse the script and break the statements one by one:

def _strip_sql(sql: str) -> str:
return sql.strip().rstrip(';')

split_statements_tuple = split_statements(StringIO(sql))
sql = [sql_string for sql_string, _ in split_statements_tuple if sql_string]

After that we can execute the statements one by one.

I think though there is a catch here. Executing Update table a ... ; Update table b...; may not the same as executing Update table a ... ; and then executing Update table b...; when we break the statements there is a question of when the commit(s) gets executed.

@grahovam
Copy link

So the idea here is to not break the statements (and execute one by one) inside TrinoHook nor SnowflakeHook anymore and therefore break the statements (and execute one by one) inside DBApiHook?

@potiuk
Copy link
Member

potiuk commented Jul 19, 2021

Yeah. Sounds like the third time is the charm

@potiuk
Copy link
Member

potiuk commented Jul 26, 2021

Hey @flolas - how is it going ? I saw a few people interested in it :)

@flolas
Copy link
Contributor Author

flolas commented Jul 26, 2021

@potiuk Hi! I will work this week on the PR.

@eladkal
Copy link
Contributor

eladkal commented Aug 27, 2022

Should be covered fully by #23971

@eladkal eladkal closed this as completed Aug 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:feature Feature Requests
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants