-
Notifications
You must be signed in to change notification settings - Fork 906
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consistent node execution order by sorting node with Sequentialrunner
#1604
Consistent node execution order by sorting node with Sequentialrunner
#1604
Conversation
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice fix! I actually think the test here isn't doing what you think and needs to change, but also I'm not sure there's a good way to write a test for this at all. So might just be best overall to remove it.
RELEASE.md
Outdated
@@ -25,9 +25,11 @@ | |||
* Reduced number of log lines by changing the logging level from `INFO` to `DEBUG` for low priority messages. | |||
* Kedro's framework-side logging configuration no longer performs file-based logging. Hence superfluous `info.log`/`errors.log` files are no longer created in your project root, and running Kedro on read-only file systems such as Databricks Repos is now possible. | |||
* The `root` logger is now set to the Python default level of `WARNING` rather than `INFO`. Kedro's logger is still set to emit `INFO` level messages. | |||
* Kedro pipeline will have consistent execution order given the same set of nodes when using with `SequentialRunner`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Kedro pipeline will have consistent execution order given the same set of nodes when using with `SequentialRunner`. | |
* `SequentialRunner` now consistently runs nodes in the same order across multiple runs. |
I still don't think this is a very clear explanation though. Maybe what you have is better 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe @MerelTheisenQB has a better idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if this is any better to be honest 😅 "Added sorting of nodes for the SequentialRunner
to facilitate consistent execution order across multiple runs. "
tests/pipeline/test_pipeline.py
Outdated
Pipeline that have multiple possible execution orders should have consistent | ||
solutions | ||
Possible Solutions: | ||
1. A -> B -> C -> D -> E -> F | ||
2. B -> A -> C -> D -> E -> F | ||
3 ... Any permutation as long as F is executed last. | ||
|
||
Although we are not sure which permutation it is, but it should always output | ||
the same permutation. | ||
|
||
A-- \ | ||
B--- \ | ||
C---- F | ||
D--- / | ||
E-- / | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nice 👍
tests/pipeline/test_pipeline.py
Outdated
|
||
# Repeat 10 times so we can be sure it is not purely by chance | ||
for _ in range(10): | ||
mock_hash.return_value = random.randint(1, 1e20) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is doing what you want it to do. This is currently fixing the hash of every Node
instance in this pipeline to be the same. We don't want hash(node1)
to be the same as hash(node2)
. What we should have is:
n1 = node(constant_output, None, "A")
n2 = node(constant_output, None, "B")
n3 = node(constant_output, None, "C")
n4 = node(constant_output, None, "D")
n5 = node(constant_output, None, "E")
n6 = node(multiconcat, ["A", "B", "C", "D", "E"], "F")
# You don't have to nest these, you can put them all in one with block
# But actually for Python < 3.10 it's super ugly formatting still: https://stackoverflow.com/questions/3024925/create-a-with-block-on-several-context-managers
for _ in range(10):
with mock.patch.object(n1, "__hash__", random.randint(1, 1e20)):
with mock.patch.object(n2, "__hash__", random.randint(1, 1e20)):
with mock.patch.object(n3, "__hash__", random.randint(1, 1e20)):
with mock.patch.object(n4, "__hash__", random.randint(1, 1e20)):
with mock.patch.object(n5, "__hash__", random.randint(1, 1e20)):
with mock.patch.object(n6, "__hash__", random.randint(1, 1e20)):
inverted_fork_dags = Pipeline([n1, n2, n3, n4, n5, n6])
# use inverted_fork_dags.nodes
... but this is still not a great test because the current code in main
doesn't fail it 😬
After spending a looooong time playing around with this, I think it might just not be worth writing a test for it at all all... So long as it works as it should in manual testing then I think we're fine.
Happy to explain more about what I discovered while playing around with the testing here if you want to hear. It's certainly a tricky one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's chat tomorrow!
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, and even though it would be ideal to have a test for this I recognise it isn't trivial to test this well. I'd be happy to merge this without a specific test if we're satisfied with the results from the manual test 👍
RELEASE.md
Outdated
@@ -25,9 +25,11 @@ | |||
* Reduced number of log lines by changing the logging level from `INFO` to `DEBUG` for low priority messages. | |||
* Kedro's framework-side logging configuration no longer performs file-based logging. Hence superfluous `info.log`/`errors.log` files are no longer created in your project root, and running Kedro on read-only file systems such as Databricks Repos is now possible. | |||
* The `root` logger is now set to the Python default level of `WARNING` rather than `INFO`. Kedro's logger is still set to emit `INFO` level messages. | |||
* Kedro pipeline will have consistent execution order given the same set of nodes when using with `SequentialRunner`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if this is any better to be honest 😅 "Added sorting of nodes for the SequentialRunner
to facilitate consistent execution order across multiple runs. "
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok Chan nok.lam.chan@quantumblack.com
Description
#1350
Development notes
The trickiest part of the PR is actually the test. It's not straightforward to reproduce the random behaviour. If you just create 10 pipelines in the same python session. It is actually deterministic, which is different from doing 10
kedro run
.After some investigation, I believe this is the whole story
Node
has a custom__hash__
function, which actually dictated howset
order them.__hash__
is implemented. Therefore whenThe way to test this properly is likely just to do some mocking (I still need to think about how to do this)
_topologically_sorted
is the main function that gets changed, it changes the order of the arbitrary nodes to a deterministic order. The node dependencies are still respected. It only has effects when nodes don't have dependencies on each other. The new implementation will give a deterministic order to these nodes, so the run's results will be consistent.How this is tested
In the end, I go with manual testing, as turns out it is very tricky to test this non-deterministic behavior due to
hash
.For reference, this is the test that I have written originally to test out this behavior, which should help understanding what this PR is trying to do.
Checklist
RELEASE.md
file