Skip to content

In MsSqlHook, SQLAlchemy engine scheme is overriden by the change in #40669 #42664

@emredjan

Description

@emredjan

Apache Airflow Provider(s)

common-sql

Versions of Apache Airflow Providers

apache-airflow-providers-common-sql==1.16.0
apache-airflow-providers-microsoft-mssql==3.9.0
apache-airflow-providers-odbc==4.7.0

Apache Airflow version

2.10.2

Operating System

RHEL 8.10

Deployment

Virtualenv installation

Deployment details

No response

What happened

We're using mssql+pyodbc connections in airflow to connect to our MS SQL Server. We're creating a hook using MsSqlHook and using .get_sqlalchemy_engine() to get an sql alchemy engine object to pass along our database tasks.
Prior to apache-airflow-providers-common-sql 1.15.0, this was working as expected (last known working version is 1.14.1, please see below for a working example)
With a change introduced in 1.15.0 (#40669), .get_sqlalchemy_engine() passes the creator argument with the value self.get_conn to the .create_engine() function, which overrides the engine creation with the default connection scheme, which uses pymssql, even though we specifically define the scheme as pyodbc in the connection, and in the hook.

Resulting engine object becomes a weird combination of two, with a mssql+pyodbc scheme in the connection uri, but tries to connect to using the pymssql dialect internally, which results in fatal errors:

image

As if it's trying to make a pymssql connection, but with a pyodbc URL.

What you think should happen instead

The connection should work, as it does prior to the change.

image

If we comment out / delete the line engine_kwargs["creator"] = self.get_conn in ‎airflow/providers/common/sql/hooks/sql.py (f6c7388#diff-6e1b2f961cb951d05d66d2d814ef5f6d8f8bf8f43c40fb5d40e27a031fed8dd7R246), connections works as expected.

How to reproduce

This is the connection that's used (values changed for privacy)

Basically use any MSSQL connection with a mssql+pyodbc scheme

{
    "conn_type": "mssql",
    "host": "<redacted>",
    "login": "user",
    "password": "pass",
    "schema": "master",
    "port": 1433,
    "extra": {
        "driver": "ODBC Driver 18 for SQL Server",
        "encrypt": "yes",
        "sqlalchemy_scheme": "mssql+pyodbc"
    }
}

Try to connect through a hook and sqlalchemy engine:

from airflow.providers.microsoft.mssql.hooks.mssql import MsSqlHook
hook = MsSqlHook(mssql_conn_id='conn1')  # conn1: the id for above connection
#hook = MsSqlHook(mssql_conn_id='conn1', sqlalchemy_scheme='mssql+pyodbc') # optional, same result
engine = hook.get_sqlalchemy_engine()
engine.connect()

This fails on > 1.15.0
Works on <= 1.14.1

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions