Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix SQL split string to include ;-less statements #25713

Merged
merged 1 commit into from
Aug 15, 2022

Conversation

potiuk
Copy link
Member

@potiuk potiuk commented Aug 14, 2022

There was a bug in an incoming change to common-sql provider
introduced in #23971 where ;-less statements were removed
when "split_statements" flag was used. Since this flag is used
by default in Databricks statement, it introduced backwards
incompatible change.


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@potiuk
Copy link
Member Author

potiuk commented Aug 14, 2022

cc: @kazanzhy @alexott

@potiuk potiuk force-pushed the fix-split-sql-string-in-common-sql branch from e77b396 to 2c9c098 Compare August 14, 2022 16:24
@potiuk potiuk requested a review from uranusjr August 14, 2022 16:25
@potiuk potiuk force-pushed the fix-split-sql-string-in-common-sql branch from 2c9c098 to b73e910 Compare August 14, 2022 16:28
@potiuk
Copy link
Member Author

potiuk commented Aug 14, 2022

@kazanzhy @alexott - I chose a little different approach than proposed in #25640 (comment) - after adding unit tests. I generaly filter out all empty statements during the split - that will even allow comment-only statements for example

Copy link
Contributor

@alexott alexott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@kazanzhy
Copy link
Contributor

kazanzhy commented Aug 14, 2022

Sorry for being offline. I tried to work on #25259 finally.

Initially, I took this code from the drill provider where it was like:

sql = sqlparse.split(sqlparse.format(self.sql, strip_comments=True))
no_term_sql = [s[:-1] for s in sql if s[-1] == ';']

which is the like current [s.rstrip(';') for s in splits if s.endswith(';')]

I tested it with different queries and even with comments and I see that I really missed the query like 'SELECT * FROM table;;;;;' and query without ;

There was a bug in an incoming change to common-sql provider
introduced in apache#23971 where `;-less` statements were removed
when "split_statements" flag was used. Since this flag is used
by default in Databricks statement, it introduced backwards
incompatible change.
@potiuk potiuk force-pushed the fix-split-sql-string-in-common-sql branch from b73e910 to 1224477 Compare August 14, 2022 19:44
@potiuk
Copy link
Member Author

potiuk commented Aug 15, 2022

I think this one should be ready to review :).

@potiuk potiuk merged commit 7a19651 into apache:main Aug 15, 2022
@potiuk potiuk deleted the fix-split-sql-string-in-common-sql branch August 15, 2022 09:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants