Skip to content

Conversation

@feluelle
Copy link
Member

@feluelle feluelle commented Sep 28, 2022

In PostgresHook the "schema" field is only being called like that to make it compatible with the underlying DbApiHook which uses the schema for the sql alchemy connector. The postgres connector library however does not allow setting a schema in the connection instead a database can be set. To clarify that, we change all references in the PostgresHook code and documentation.

related: #26734


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@feluelle
Copy link
Member Author

feluelle commented Sep 28, 2022

Added 1da4ee7

To clarify, the tests were testing code which is not meant to be true for every DbApiHook as it does not set the self.schema. There is even a note already in the init covering this:

# We should not make schema available in deriving hooks for backwards compatibility
# If a hook deriving from DBApiHook has a need to access schema, then it should retrieve it
# from kwargs and store it on its own. We do not run "pop" here as we want to give the
# Hook deriving from the DBApiHook to still have access to the field in it's constructor
self.__schema = schema

@jedcunningham jedcunningham added the full tests needed We need to run full set of tests for this PR to merge label Sep 28, 2022
@feluelle
Copy link
Member Author

Thanks @jedcunningham, I will also fix the other issues.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought: Should we add schema as property and raise deprecation warning for folks relying on PostgresHook().schema? Or we rather give them AttributeError as it is wrong anyways?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion on this. @ashb @potiuk maybe?

Copy link
Member

@ashb ashb Sep 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can even make it deprecated really:

This is in DbAPIHook:

    def get_conn(self):
        """Returns a connection object"""
        db = self.get_connection(getattr(self, self.conn_name_attr))
        return self.connector.connect(host=db.host, port=db.port, username=db.login, schema=db.schema)

So we need to add a

    @property
    def schema(self):
        return self.schema

I think. I can't see how we can do that without breaking anything which uses dbapi hook, such as SQLTableCheckOperator

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

db.schema there refers to the connection schema - not the hook schema attribute.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right yes.

In which case I think it's still worth having a deprecation warning property.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, will add it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in 4a7ae31

@feluelle feluelle requested a review from eladkal as a code owner October 11, 2022 08:09
@potiuk
Copy link
Member

potiuk commented Oct 22, 2022

Conflicts need fixing :(

feluelle and others added 11 commits October 24, 2022 07:50
In PostgresHook the "schema" field is only being called like that to make it compatible with the underlying DbApiHook which uses the schema for the sql alchemy connector. The postgres connector library however does not allow setting a schema in the connection instead a database can be set. To clarify that, we change all references in the PostgresHook code and documentation.
- use deprecation warning instead of log.error
- clarify docs about database & schema
"schema" is not a public nor internal member of DbApiHook hence tests should not access it
@feluelle feluelle force-pushed the fix/postgres-schema-database-inconsistency branch from acd0bb6 to 465166b Compare October 24, 2022 05:51
@feluelle
Copy link
Member Author

@potiuk Done. 😊

@potiuk potiuk merged commit 39caf1d into apache:main Oct 31, 2022
pankajastro added a commit to astronomer/astro-sdk that referenced this pull request Nov 21, 2022
# Description

## What is the current behavior?

Currently, the CI is failing for postgres related tests, because the
postgres database is set to the table schema.

**More details:**

In
313f6da#diff-9fcbc9dc6fc68bf0a0d62a485508779f5b1f6d749a197e0f1ab1758983d1eb1e
we added `kwargs.update({"database"` but PostgresHook did not support
`database` kwarg at that time. This is why it basically did not do
anything. Now after apache/airflow#26744 got
released in the latest provider, it is being picked up and it shows that
we wrongly set `{"database": self.table.metadata.schema}`

## What is the new behavior?

We correctly set the database to the table's database.

## Does this introduce a breaking change?

No.

### Checklist
- [ ] Created tests which fail without the change (if possible)
- [ ] Extended the README / documentation, if necessary
sunank200 pushed a commit to astronomer/astro-sdk that referenced this pull request Nov 23, 2022
# Description

## What is the current behavior?

Currently, the CI is failing for postgres related tests, because the
postgres database is set to the table schema.

**More details:**

In
313f6da#diff-9fcbc9dc6fc68bf0a0d62a485508779f5b1f6d749a197e0f1ab1758983d1eb1e
we added `kwargs.update({"database"` but PostgresHook did not support
`database` kwarg at that time. This is why it basically did not do
anything. Now after apache/airflow#26744 got
released in the latest provider, it is being picked up and it shows that
we wrongly set `{"database": self.table.metadata.schema}`

## What is the new behavior?

We correctly set the database to the table's database.

## Does this introduce a breaking change?

No.

### Checklist
- [ ] Created tests which fail without the change (if possible)
- [ ] Extended the README / documentation, if necessary

(cherry picked from commit b452f15)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:providers full tests needed We need to run full set of tests for this PR to merge kind:documentation provider:common-sql

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants