
pyiron_workflow-independent dict containing workflow info #576

Open
@samwaseda


Note: I will work on this myself. You can just tell me what you think @liamhuber

Currently, we have graph_as_dict, which returns a dict containing all the information about the workflow. However, if we want semantikon to work with any workflow manager, the most natural approach to me is to ask every workflow manager to provide a dictionary, so that semantikon does not have to depend on the workflow manager itself. In addition, that would be consistent with the effort that @jan-janssen is making here.

I was initially thinking we should extend @jan-janssen's universal representation of workflows by adding the optional args required by semantikon, and potentially also by allowing nodes_dict to contain sub-nodes so that it can represent Macro nodes as well. However, I started to realize that this might not be the right way, because:

  • There's no consensus on how macro nodes or metadata should be defined
  • (Maybe more importantly) @jan-janssen's universal representation specifies the numerical logic of execution and in this regard it is conceptually complete

So my first suggestion is to separate semantikon-dict from the universal representation dict, in order to represent the workflow including the scientific logic.

Coming to the actual content, semantikon would need:

  • IO-values
  • Edges
  • Unique node labels
  • IO labels
  • Underlying python function
  • (Optionally) type hints

The type hints are optional because semantikon can parse them from the function itself, but relying on that could be a bit dangerous, since the parsing might fail to assign the types to the right IO labels. We need the original Python function because, since this PR, the function metadata is attached to f._semantikon_metadata, which I would not ask the workflow manager to provide.
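To make the list above a bit more concrete, here is a minimal sketch of what such a dict could look like for a single node. The schema (the keys "nodes", "edges", "inputs", "outputs", "value", "type_hint") and the helper names describe_node and read_metadata are made up for illustration; nothing here is an existing semantikon or pyiron_workflow API. Only f._semantikon_metadata is taken from the description above.

```python
from typing import get_type_hints


def add_one(a: int) -> int:
    result = a + 1
    return result


def describe_node(label, function, inputs, outputs):
    """Build a plain-dict description of one node (hypothetical schema)."""
    hints = get_type_hints(function)
    return {
        "label": label,
        "function": function,
        "inputs": {
            # Input labels match argument names, so the hints map cleanly
            name: {"value": value, "type_hint": hints.get(name)}
            for name, value in inputs.items()
        },
        # Output labels ("result") do not appear in the annotations, which
        # only know "return"; this is where parsing could go wrong
        "outputs": {name: {"value": value} for name, value in outputs.items()},
    }


def read_metadata(node):
    """What the consumer could do: look the metadata up on the function."""
    return getattr(node["function"], "_semantikon_metadata", None)


node = describe_node("add_one", add_one, {"a": 1}, {"result": 2})
workflow_dict = {"nodes": {node["label"]: node}, "edges": []}
```

Since the function object travels with the dict, the workflow manager never has to know about the metadata; read_metadata simply returns None for functions that were never decorated.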

Concerning the Macro nodes, I was testing the following code:

from pyiron_workflow import Workflow


@Workflow.wrap.as_function_node
def add_one(a: int):
    result = a + 1
    return result

@Workflow.wrap.as_function_node
def add_two(b: int):
    result = b + 2
    return result

# Macro node: wraps add_one and add_two into a single node
@Workflow.wrap.as_macro_node
def add_three(macro, c: int):
    macro.one = add_one(a=c)
    macro.two = add_two(b=macro.one)
    result = macro.two
    return result

# Use the macro next to an ordinary function node
wf = Workflow("my_wf")
wf.three = add_three(c=1)
wf.four = add_one(a=wf.three)
wf.run()

Then I realized that wf.graph_as_dict does not explicitly provide the connection between wf.three.inputs.c and wf.three.one.inputs.a, nor the one between wf.three.two.outputs.result and wf.three.outputs.result. They may not be edges in the sense of workflow edges, but they still have to be included in the dict somehow.
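As a rough sketch of one way this could look (assuming a nested layout where each macro carries its own "nodes" and "edges", which is a hypothetical schema and not what graph_as_dict returns), the connections between the macro IO and its children could be stored as ordinary entries in the macro's own edge list:

```python
# Hypothetical nested representation of the workflow above
workflow = {
    "label": "my_wf",
    "nodes": {
        "three": {
            "nodes": {"one": {}, "two": {}},
            "edges": [
                # macro input -> child input
                ("inputs.c", "one.inputs.a"),
                # ordinary edge between two children
                ("one.outputs.result", "two.inputs.b"),
                # child output -> macro output
                ("two.outputs.result", "outputs.result"),
            ],
        },
        "four": {},
    },
    "edges": [("three.outputs.result", "four.inputs.a")],
}


def flatten_edges(graph, prefix=""):
    """Collect every edge in the nested dict, using fully qualified paths."""
    edges = [(prefix + src, prefix + dst) for src, dst in graph.get("edges", [])]
    for label, child in graph.get("nodes", {}).items():
        edges.extend(flatten_edges(child, prefix=f"{prefix}{label}."))
    return edges


all_edges = flatten_edges(workflow)
```

With this layout, flatten_edges yields ("three.inputs.c", "three.one.inputs.a") and ("three.two.outputs.result", "three.outputs.result") next to the regular edges, so a consumer can follow the data through the macro without special-casing it.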

Well, thanks for reading, but actually I'm probably the one who profited from it the most as it helped me figure out what I actually want XD

Labels: enhancement (New feature or request)
