Description
Note: I will work on this myself. You can just tell me what you think @liamhuber
Currently, we have graph_as_dict, which returns a dict containing all the information about the workflow. However, to make semantikon work with any workflow manager, the most natural approach to me is to ask every workflow manager to provide a dictionary, so that semantikon does not have to depend on the workflow manager. In addition, that would be consistent with the effort that @jan-janssen is making here.
I was initially thinking we should extend @jan-janssen's universal representation of workflows by adding the optional args required by semantikon, and potentially also allowing nodes_dict to contain sub-nodes in order to represent Macro nodes as well. However, I came to realize that this might not be the right way, because:
- There's no consensus on how macro nodes or metadata should be defined
- (Maybe more importantly) @jan-janssen's universal representation specifies the numerical logic of execution and in this regard it is conceptually complete
So my first suggestion is to keep the semantikon-dict separate from the universal representation dict, so that the semantikon-dict can represent the workflow including the scientific logic.
Coming to the actual content, semantikon would need:
- IO-values
- Edges
- Unique node labels
- IO labels
- Underlying python function
- (Optionally) type hints
The type hints are optional because semantikon can parse them itself, but that could be a bit dangerous because it might fail to assign the types to the IO labels. We need the original Python function because, since this PR, the function metadata is attached to f._semantikon_metadata, which I would not ask the workflow manager to provide.
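To make the list above a bit more concrete, here is a minimal sketch of what a single node entry could look like. The key names (`function`, `inputs`, `outputs`, `value`, `type_hint`) and the helper `node_to_dict` are made up for illustration and are not part of semantikon or any workflow manager. It also assumes a single output, so the return annotation can be mapped directly to the one output label.

```python
from typing import Any, Callable, get_type_hints


def add_one(a: int) -> int:
    result = a + 1
    return result


def node_to_dict(
    func: Callable, input_values: dict[str, Any], output_values: dict[str, Any]
) -> dict:
    """Hypothetical helper: build one node entry from a plain Python function."""
    # Type hints are parsed here; a workflow manager could also pass them explicitly
    hints = get_type_hints(func)
    return {
        "function": func,
        "inputs": {
            label: {"value": value, "type_hint": hints.get(label)}
            for label, value in input_values.items()
        },
        # Assumption: exactly one output, so "return" maps to that label
        "outputs": {
            label: {"value": value, "type_hint": hints.get("return")}
            for label, value in output_values.items()
        },
    }


node = node_to_dict(add_one, {"a": 1}, {"result": 2})

# On the semantikon side, the metadata would be read from the function itself,
# so the workflow manager does not have to know about it
metadata = getattr(node["function"], "_semantikon_metadata", None)
```

With more than one output, the mapping from the return annotation to the output labels is no longer obvious, which is the case where explicitly provided type hints would be safer.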
Concerning the Macro nodes, I was testing the following code:
    from pyiron_workflow import Workflow


    @Workflow.wrap.as_function_node
    def add_one(a: int):
        result = a + 1
        return result


    @Workflow.wrap.as_function_node
    def add_two(b: int):
        result = b + 2
        return result


    # The macro chains the two function nodes: c -> add_one -> add_two -> result
    @Workflow.wrap.as_macro_node
    def add_three(macro, c: int):
        macro.one = add_one(a=c)
        macro.two = add_two(b=macro.one)
        result = macro.two
        return result


    wf = Workflow("my_wf")
    wf.three = add_three(c=1)
    wf.four = add_one(a=wf.three)
    wf.run()
Then I realized that wf.graph_as_dict explicitly does not provide the connection between wf.three.inputs.c and wf.three.one.inputs.a, or between wf.three.two.outputs.result and wf.three.outputs.result. They may not be edges in the sense of workflow edges, but they still have to be included in the dict somehow.
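One way this could look, purely as a sketch: each macro carries its own `edges` list, in which a path without a child label (e.g. `inputs.c`) refers to the macro's own IO. The dict layout and the helper `flatten_edges` below are hypothetical and do not reflect what graph_as_dict currently returns.

```python
# Hypothetical nested representation of the example workflow above
workflow_dict = {
    "label": "my_wf",
    "nodes": {
        "three": {
            "inputs": ["c"],
            "outputs": ["result"],
            "nodes": {
                "one": {"inputs": ["a"], "outputs": ["result"]},
                "two": {"inputs": ["b"], "outputs": ["result"]},
            },
            "edges": [
                ("inputs.c", "one.inputs.a"),  # macro input -> child input
                ("one.outputs.result", "two.inputs.b"),  # ordinary edge
                ("two.outputs.result", "outputs.result"),  # child output -> macro output
            ],
        },
        "four": {"inputs": ["a"], "outputs": ["result"]},
    },
    "edges": [("three.outputs.result", "four.inputs.a")],
}


def flatten_edges(graph: dict, prefix: str = "") -> list[tuple[str, str]]:
    """Collect all edges recursively, with paths relative to the top-level workflow."""
    edges = [(prefix + src, prefix + dst) for src, dst in graph.get("edges", [])]
    for label, node in graph.get("nodes", {}).items():
        edges.extend(flatten_edges(node, prefix=f"{prefix}{label}."))
    return edges


all_edges = flatten_edges(workflow_dict)
```

Here `all_edges` contains both `("three.inputs.c", "three.one.inputs.a")` and `("three.two.outputs.result", "three.outputs.result")`, i.e. exactly the two connections that are missing today, next to the ordinary edges.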
Well, thanks for reading, but actually I'm probably the one who profited from it the most as it helped me figure out what I actually want XD