Streamline & simplify __eq__ methods in models Dag and BaseOperator #13449
The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.
- Use getattr() instead of __dict__, as __dict__ doesn't return correct values for properties.
- Avoid unnecessary condition checks (the removed condition checks are covered by _comps).
18e6c10 to 3c1260e
@@ -361,8 +361,7 @@ def __repr__(self):
         return f"<DAG: {self.dag_id}>"

     def __eq__(self, other):
-        if type(self) == type(other) and self.dag_id == other.dag_id:
+        if type(self) == type(other):
From what I can remember, it's an optimization so that we can check the most common case much faster.
That optimisation may not be needed, for two reasons:
- The all() function short-circuits: as soon as it encounters a False element it returns False immediately, rather than traversing every element of the iterable (reference).
- For the dag model, dag_id is the first element in _comps; for BaseOperator, task_id is the first element in _comps.
So after the change I make here, there should be no impact on performance. If anything, it is a very minor improvement: dag_id is no longer compared twice in the dag model's __eq__, and similarly for task_id in BaseOperator.
Kindly let me know if this makes sense to you.
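The short-circuit argument above can be checked with a small sketch. The class `Tracked`, its `_comps` tuple, and the `reads` log are hypothetical names made up for illustration; they are not Airflow code. The sketch assumes only that `__eq__` walks `_comps` in order inside `all()`, with the identifier field listed first.

```python
class Tracked:
    """Hypothetical stand-in for a model that compares itself via _comps."""

    # The identifier comes first, mirroring the ordering described above.
    _comps = ("task_id", "owner", "retries")

    def __init__(self, task_id, owner, retries):
        self.task_id = task_id
        self.owner = owner
        self.retries = retries
        self.reads = []  # records every field looked up during ==

    def _get(self, name):
        self.reads.append(name)
        return getattr(self, name, None)

    def __eq__(self, other):
        if type(self) is not type(other):
            return False
        # Generator expression: all() stops at the first False result.
        return all(self._get(c) == getattr(other, c, None) for c in self._comps)


a = Tracked("extract", "alice", 1)
b = Tracked("load", "alice", 1)     # differs in the first _comps field
c = Tracked("extract", "alice", 1)  # identical in every _comps field

print(a == b, a.reads)  # False ['task_id']
a.reads.clear()
print(a == c, a.reads)  # True ['task_id', 'owner', 'retries']
```

When the identifiers differ, only the first field is ever read, so a separate identifier check before `all()` would not save any work in this sketch.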
SGTM.
The PR most likely needs to run the full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly, please rebase it to the latest master at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.
Change-1: Use getattr() instead of __dict__, as __dict__ doesn't return correct values for properties (sample code reproducing the issue is given below). This was fixed for the dag model, but was not fixed for the baseoperator model.
Change-2: Avoid unnecessary condition checks (the removed condition checks are covered by _comps).
Sample Code Reproducing Issue in Change-1
Output:
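The author's sample and its output are not shown here. As a stand-in, the following is a minimal sketch of the kind of mismatch Change-1 describes, not the original reproduction. The class `Node`, its `group_id` property, and both helper methods are hypothetical and do not come from Airflow. It assumes one of the names in `_comps` is a property rather than a plain instance attribute.

```python
class Node:
    """Hypothetical class for illustration; not Airflow's BaseOperator."""

    _comps = ("task_id", "group_id")

    def __init__(self, task_id, group):
        self.task_id = task_id
        self._group = group

    @property
    def group_id(self):
        # Computed on access; never stored in the instance __dict__.
        return self._group["id"]

    def eq_via_dict(self, other):
        # Looks only at instance attributes, so properties are invisible.
        return all(self.__dict__.get(c) == other.__dict__.get(c) for c in self._comps)

    def eq_via_getattr(self, other):
        # Goes through normal attribute lookup, so properties are evaluated.
        return all(getattr(self, c, None) == getattr(other, c, None) for c in self._comps)


x = Node("t1", {"id": "g1"})
y = Node("t1", {"id": "g2"})

print("group_id" in x.__dict__)  # False
print(x.eq_via_dict(y))          # True: group_id is None on both sides
print(x.eq_via_getattr(y))       # False: the property values differ
```

In this sketch the `__dict__` lookup silently compares `None` with `None` for the property, so two objects that differ are reported as equal, while `getattr()` sees the real values.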