Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for flows with intersecting branches #182

Open
miloszbednarzak opened this issue Apr 14, 2020 · 5 comments
Open

Support for flows with intersecting branches #182

miloszbednarzak opened this issue Apr 14, 2020 · 5 comments

Comments

@miloszbednarzak
Copy link

I'd like to implement flow in which its branches intersects with each other like:
test-graph

Code:
from metaflow import FlowSpec, step


class TestFlow(FlowSpec):
    @step
    def start(self):
        self.next(self.a, self.b)

    @step
    def a(self):

        self.next(self.a1, self.a2)

    @step
    def a1(self):

        self.next(self.a1b1)

    @step
    def a2(self):

        self.next(self.a2b2)

    @step
    def b(self):

        self.next(self.b1, self.b2)

    @step
    def b1(self):

        self.next(self.a1b1)

    @step
    def b2(self,):

        self.next(self.a2b2)

    @step
    def a1b1(self, inputs):
        self.merge_artifacts(inputs)
        self.next(self.join)

    @step
    def a2b2(self, inputs):
        self.merge_artifacts(inputs)
        self.next(self.join)

    @step
    def join(self, inputs):
        self.merge_artifacts(inputs)
        self.next(self.end)

    @step
    def end(self):
        pass


if __name__ == "__main__":
    TestFlow()

It throws error:

Step a1b1 joins steps from unrelated splits. Ensure that there is a matching join for every split.

I know I can reimplement this like:
test-graph2

Code:
from metaflow import FlowSpec, step


class TestFlow(FlowSpec):
    @step
    def start(self):
        self.next(self.a, self.b)

    @step
    def a(self):

        self.next(self.a1, self.a2)

    @step
    def a1(self):

        self.next(self.a12)

    @step
    def a2(self):

        self.next(self.a12)

    @step
    def b(self):

        self.next(self.b1, self.b2)

    @step
    def b1(self):

        self.next(self.b12)

    @step
    def b2(self,):

        self.next(self.b12)

    @step
    def a12(self, inputs):
        self.merge_artifacts(inputs)
        self.next(self.as_bs)

    @step
    def b12(self, inputs):
        self.merge_artifacts(inputs)
        self.next(self.as_bs)

    @step
    def as_bs(self, inputs):
        self.merge_artifacts(inputs)
        self.next(self.end)

    @step
    def end(self):
        pass


if __name__ == "__main__":
    TestFlow()

In my case looking from a perspective of visual graph second implementation looks cleaner, but from implementation perspective it brings unnecessary additional steps.

Are there any plans to add support for this kind of Flows?
Thanks!

@savingoyal
Copy link
Collaborator

@miloszbednarzak I am curious what would be the use case for the first graph? The reason we lean for graphs of later nature is that it's easier to reason about their execution characteristics.

@miloszbednarzak
Copy link
Author

miloszbednarzak commented Apr 21, 2020

@savingoyal Let's say that functions in steps a1 and b1 outputs variables of kind one, and a2,b2 accordingly. I wanted in step a1b1 and a2b2 to group all data coming from input branches by their kinds.
So if the first option was possible I could do something like that:

# in a1b1 step
ones_data = [input.one_kind for input in inputs]
# in a2b2 step
twos_data = [input.two_kind for input in inputs]

In second, legitimate variant I need to group them in as_bs step like this:

ones_data = [self.a1_data, self.b1_data]
twos_data = [self.a2_data, self.b2_data]

In this toy example there is not much difference, but my actual graph is much more complex and I'd like to group data by kind not by using their artefact variable name in grouping operation, but by pointing output to the step which will group all inputs automatically.

@savingoyal
Copy link
Collaborator

Makes sense. We are discussing a number of enhancements to the flow structure, including introducing the notion of sub workflows, but the timing is TBD.

@valayDave
Copy link
Collaborator

@savingoyal : Are you guys planning to add dynamic DAG support ?

@savingoyal
Copy link
Collaborator

@valayDave The exact implementation, as well as UX for sub-workflows, is TBD.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants