Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for SQLAlchemy 1.4 #391

Merged
merged 19 commits into from
Jun 1, 2022
Merged

Add support for SQLAlchemy 1.4 #391

merged 19 commits into from
Jun 1, 2022

Conversation

amotl
Copy link
Member

@amotl amotl commented Nov 16, 2020

Summary of the changes / Why this is an improvement

SQLAlchemy 1.4.0 is about to be released. SQLAlchemy 1.4.37 has been released. People are eagerly waiting for corresponding support.

@amotl
Copy link
Member Author

amotl commented Nov 16, 2020

SQLAlchemy 1.4 dropped support for Python 3.5, see sqlalchemy/sqlalchemy@d0b5ce2. However, Python 2.7 is still supported.

@amotl amotl force-pushed the amo/sqlalchemy14 branch 3 times, most recently from f598944 to 8ba3b99 Compare November 16, 2020 14:38
@amotl
Copy link
Member Author

amotl commented Nov 16, 2020

Observations

Tests fail with:

      File "/home/runner/work/crate-python/crate-python/eggs/SQLAlchemy-1.4.0b1-py3.8-linux-x86_64.egg/sqlalchemy/sql/compiler.py", line 508, in process
        return obj._compiler_dispatch(self, **kwargs)
      File "<string>", line 9, in _compiler_dispatch
      File "/home/runner/work/crate-python/crate-python/src/crate/client/sqlalchemy/compiler.py", line 201, in visit_insert
        crud_params = crud._get_crud_params(self, insert_stmt, **kw)
    TypeError: _get_crud_params() missing 1 required positional argument: 'compile_state'

-- https://github.com/crate/crate-python/pull/391/checks?check_run_id=1406879590#step:5:139

Root cause

The signature of _get_crud_params() has been updated through sqlalchemy/sqlalchemy@851fb8f and sqlalchemy/sqlalchemy@693938d, which decouples compiler state from DML objects. This is probably related to

Background

Version 1.4 is taking on a different focus than other SQLAlchemy releases in that it is in many ways attempting to serve as a potential migration point for a more dramatic series of API changes currently planned for release 2.0 of SQLAlchemy.

-- What’s New in SQLAlchemy 1.4?

@amotl amotl force-pushed the amo/sqlalchemy14 branch 3 times, most recently from 78f15ec to 0b1d8cb Compare November 16, 2020 21:12
@amotl
Copy link
Member Author

amotl commented Nov 16, 2020

Dear @chaudum and @m-kharbat,

I've tried to apply some minor updates from SQLAlchemy 1.4 to our visit_insert() and visit_update() methods.

def visit_insert(self, insert_stmt, asfrom=False, **kw):
"""
used to compile <sql.expression.Insert> expressions.
this function wraps insert_from_select statements inside
parentheses to be conform with earlier versions of CreateDB.
"""

def visit_update(self, update_stmt, **kw):
"""
used to compile <sql.expression.Update> expressions
Parts are taken from the SQLCompiler base class.
"""

However, after I didn't have quick success when making a few adjustments and after reading the comments above, I wanted to check back with you about the further plan. Regarding the conform with earlier versions of CreateDB comment: Maybe we don't need any amendments coming from these functions anymore and drop support for CrateDB 0.x releases?

Apart from that, I also found that some bits within these methods are apparently related to still support SQLAlchemy 1.0 and 1.1. Maybe we also want to revise this and drop respective support?

With kind regards,
Andreas.

Copy link
Member Author

@amotl amotl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

39d0101, after some other minor adjustments, magically made the test suite succeed, by adjusting the parameter mangling to/from crate_before_execute.

Other than the inline comments below, some other items could/should be addressed before finalizing this patch. Please let me know about your suggestions and if you can spot other shortcomings.

  1. Special attention should be given to the new SQL Compilation Caching feature [1,2]. The CrateDialect should supports_statement_cache = True, see [3]. Also, there is a new cache_ok attribute on custom types [4], which should probably be applied to all CrateDB SQLAlchemy types listed in crate.client.sqlalchemy.types. [5] might also be related here.

    Currently, we observe all those warnings referenced at SA's identify what constructs are blocking caching from being enabled documentation:

    SAWarning: Dialect crate:crate-python will not make use of SQL compilation caching as it does not set the supports_statement_cache attribute to True. This can have significant performance implications including some performance degradations in comparison to prior SQLAlchemy versions. Dialect maintainers should seek to set this attribute to True after appropriate development and testing for SQLAlchemy 1.4 caching support. Alternatively, this attribute may be set to False which will disable this warning. (Background on this error at: https://sqlalche.me/e/14/cprf)

    SAWarning: UserDefinedType {_Craty(),_ObjectArray(),Geopoint(),Geoshape()} will not produce a cache key because the cache_ok attribute is not set to True. This can have significant performance implications including some performance degradations in comparison to prior SQLAlchemy versions. Set this attribute to True if this type object's state is safe to use in a cache key, or False to disable this warning. (Background on this error at: https://sqlalche.me/e/14/cprf)

    SAWarning: {Class Any,Class Match} will not make use of SQL compilation caching as it does not set the inherit_cache attribute to True. This can have significant performance implications including some performance degradations in comparison to prior SQLAlchemy versions. Set this attribute to True if this object can make use of the cache key generated by the superclass. Alternatively, this attribute may be set to False which will disable this warning. (Background on this error at: https://sqlalche.me/e/14/cprf)

  2. Go through the list of significant changes [6] thoroughly and summarize/compile relevant details regarding CrateDB into the changelog file.

  3. Test Asynchronous I/O with asyncpg, see [7].

    engine = create_async_engine(
      "postgresql+asyncpg://scott:tiger@localhost/test"
    )
    

[1] https://docs.sqlalchemy.org/en/14/core/connections.html#sql-compilation-caching
[2] https://docs.sqlalchemy.org/en/14/faq/performance.html#faq-new-caching
[3] https://docs.sqlalchemy.org/en/14/core/connections.html#caching-for-third-party-dialects
[4] https://docs.sqlalchemy.org/en/14/core/custom_types.html
[5] https://docs.sqlalchemy.org/en/14/errors.html#caching-caveats
[6] https://docs.sqlalchemy.org/en/14/changelog/migration_14.html
[7] https://docs.sqlalchemy.org/en/14/orm/extensions/asyncio.html

CHANGES.txt Outdated Show resolved Hide resolved
@@ -74,7 +79,24 @@ def rewrite_update(clauseelement, multiparams, params):
def crate_before_execute(conn, clauseelement, multiparams, params):
Copy link
Member Author

@amotl amotl May 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How to properly rewrite parameters for update statements

On the topic of using events for rewriting update statements, @zzzeek suggested at sqlalchemy/sqlalchemy#5915 (comment):

For the before_execute() thing, if you want to rewrite paramaters, a much better place for a dialect to do that is in dialect.do_execute() and dialect.do_executemany() (will work in 1.3 also). See for example psycopg2 do_executemany() that does a bunch of rewriting. https://github.com/sqlalchemy/sqlalchemy/blob/master/lib/sqlalchemy/dialects/postgresql/psycopg2.py#L905

Also at sqlalchemy/sqlalchemy#5915 (comment):

You should not have to use any events, that is, no DialectEvents or ConnectionEvents at all; the code you are writing is a dialect. You would implement simple methods do_execute() and do_executemany() as methods of your class CrateDialect right here: https://github.com/crate/crate-python/blob/master/src/crate/client/sqlalchemy/dialect.py#L160 ; these override the default methods that are on DefaultDialect. dialects shouldnt use events for these kinds of things, the dialect implements the actual way that these things happen without any events riding on them.

Thank you again!

Copy link
Member Author

@amotl amotl May 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't followed Mike's suggestion yet, after I was able to make the patch work with 39d0101. We will probably have to decide whether to drop support for SQLAlchemy 1.1 and 1.2? What are your opinions about this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At sqlalchemy/sqlalchemy#5915 (reply in thread), I shared further thoughts and asked for guidance how to improve that specific topic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't followed Mike's suggestion yet, after I was able to make the patch work with 39d0101. We will probably have to decide whether to drop support for SQLAlchemy 1.1 and 1.2? What are your opinions about this?

As Mike already pointed out, there are 1) not much users using these old version 2) if the do, they probably use an older version of our driver as well.
So +1 for dropping the support of SQLAlchemy < 1.3.

Copy link
Member Author

@amotl amotl May 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All right, let's do it. I would prefer to bring in corresponding cleanups and improvements on behalf of the next iteration, if you don't have any objections. Otherwise, please let me know.

Copy link
Member Author

@amotl amotl May 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#424 implemented the update to drop support for corresponding EOL software components, i.e. SQLAlchemy 1.1 and 1.2.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the original topic in this thread, the migration to hook into ExecutionContext.pre_exec() for rewriting UPDATE statements instead of using engine events, see #425.

src/crate/client/sqlalchemy/dialect.py Show resolved Hide resolved
src/crate/client/sqlalchemy/monkey.py Outdated Show resolved Hide resolved
@crate crate deleted a comment from lgtm-com bot May 25, 2022
Comment on lines 614 to 642
# The rewriting logic in `rewrite_update` and `visit_update` needs
# adjustments here in order to prevent `sqlalchemy.exc.CompileError:
# Unconsumed column names: characters_name, data['nested']`
"""
if parameters and stmt_parameter_tuples:
check = (
set(parameters)
.intersection(_column_as_key(k) for k, v in stmt_parameter_tuples)
.difference(check_columns)
)
if check:
raise exc.CompileError(
"Unconsumed column names: %s"
% (", ".join("%s" % (c,) for c in check))
)
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment does not really help me in understanding the issue like why are there unconsumed column names.
Is it still valid? If so can you please comment more about the why?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for asking. It looks like through the special rewriting logic of the CrateDB driver, this sanity check by SQLAlchemy croaks. I don't actually know the "why", but I tried to improve the inline comment with 57d5941 and also added a backlog item to investigate further, see also #425.

Comment on lines +571 to +572
# special logic that only occurs for multi-table UPDATE
# statements
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the code speaks for itself here, I don't think the comment adds much value

Copy link
Member Author

@amotl amotl Jun 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is part of the code which was vendored from upstream SA14. I didn't want to introduce any deviances to code which was copied over verbatim.

amotl added 5 commits June 1, 2022 12:03
This patch follows the documentation at [1] and adds the attributes
`supports_statement_cache`, `cache_ok` and `inherit_cache` to the
corresponding classes of `CrateDialect`, `UserDefinedType` and
`ColumnElement`.

The test suite thoroughly accepted this change of setting all attributes
to `True`, effectively fully supporting SQL compilation caching.

[1] https://docs.sqlalchemy.org/en/14/faq/performance.html
Beforehand, after observing errors like::

  RuntimeError: number of values in row (1) differ from number of column processors (0)

we decided to take the `resultproxy` out of the equation in order to
work on the real gist of SQLAlchemy 1.4 compatibility.

When looking at this again, it turned out that the root cause have been
inaccuracies when setting up fake cursors within the test cases.
SQLAlchemy <1.4 might have been more forgiving on that, but >=1.4 isn't.
SADeprecationWarning: The argument signature for the
`ConnectionEvents.before_execute` event listener has changed as of
version 1.4, and conversion for the old argument signature will be
removed in a future release. The new signature is
`def before_execute(conn, clauseelement, multiparams, params, execution_options)`.
@amotl amotl requested a review from seut June 1, 2022 10:35
@crate crate deleted a comment from lgtm-com bot Jun 1, 2022
@crate crate deleted a comment from lgtm-com bot Jun 1, 2022
@lgtm-com
Copy link

lgtm-com bot commented Jun 1, 2022

This pull request introduces 1 alert when merging 8a817fe into 67ec8e3 - view on LGTM.com

new alerts:

  • 1 for Module is imported with 'import' and 'import from'

@crate crate deleted a comment from lgtm-com bot Jun 1, 2022
Copy link
Member

@seut seut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Maybe wait for @mfussenegger to have a final look at it?


# A module is imported with the "import" and "import from" statements.
# https://lgtm.com/rules/1818040193/
- exclude: py/import-and-import-from
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this work? #391 (comment)

Copy link
Member Author

@amotl amotl Jun 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it doesn't ;[. At least, it does not work while not being merged. I hope that the additional rule will be recognized after the merge. However, it is weird because the other rule about py/not-named-self apparently seems to work well.

@lgtm-com
Copy link

lgtm-com bot commented Jun 1, 2022

This pull request introduces 1 alert when merging 00d016e into 67ec8e3 - view on LGTM.com

new alerts:

  • 1 for Module is imported with 'import' and 'import from'

Copy link
Member

@mfussenegger mfussenegger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The low coverage + duplication of the visit_update_14 still doesn't feel right to me, but I guess it's good enough to move forward

CHANGES.txt Outdated
Comment on lines 40 to 49
SQLAlchemy 1.4 became stricter on some details. Where it was ok to use a
textual column expression in *plain text* beforehand, a
`SQLAlchemy literal_column`_ type should be used now. This specifically
applies to `CrateDB system columns`_ like ``_score``.

For example, when a query might have looked like this beforehand::

session.query(Character.name, '_score')

it should now be written like::

session.query(Character.name, sa.literal_column('_score'))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

```suggestion
SQLAlchemy 1.4 requires to wrap system columns like ``_score`` in a `SQLAlchemy literal_column`_ type. Before it was possible to use a query like this::

     session.query(Character.name, '_score')

It must now be written like::

     session.query(Character.name, sa.literal_column('_score'))

The text before used should but it's not an optional thing, users must change this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

187a11d improves the wording, following your suggestion. Thank you again!

@amotl
Copy link
Member Author

amotl commented Jun 1, 2022

The low coverage + duplication of the visit_update_14 still doesn't feel right to me, but I guess it's good enough to move forward.

Thank you, I added your suggestion to crate/sqlalchemy-cratedb#74.

@lgtm-com
Copy link

lgtm-com bot commented Jun 1, 2022

This pull request introduces 1 alert when merging aad1458 into 67ec8e3 - view on LGTM.com

new alerts:

  • 1 for Module is imported with 'import' and 'import from'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants