Motivation
Some features we would like to add as nodes are inherently non-deterministic. This includes relatively simple things like "what's the current time?", "how many times has the user run this chain?", and "give me a random number." These examples, while very simple, are very useful, so it would be nice if we could add them.
The problem with non-deterministic nodes is that they break chaiNNer. Currently, pretty much everything assumes that nodes are deterministic, and even more systems will make that assumption in the future. Here are some things that non-deterministic nodes break right now or will break in the future:
- The current chain executor assumes that all nodes are deterministic and uses this assumption to cache nodes.
- Trivial optimizations (such as iterator outlining) aren't correct anymore.
- Type broadcasts aren't correct, because a static type system requires determinism to propagate types.
- Live execution (planned feature) relies on caching intermediate results from previous runs, and non-deterministic nodes can't be cached.
Description
Find a way to support non-deterministic nodes without breaking chaiNNer. This includes answering the following questions:
- What are the semantics of the outputs of non-deterministic nodes?
  This is mostly about what down-stream nodes see. E.g. let `R` be a non-deterministic node that generates a random number, and let `R` be connected to 2 nodes `A` and `B` like so:

      R ─> A
      └──> B

  Do `A` and `B` see the same random number? If so, why would this change if `B` is inside an iterator?
- How do these semantics interact with caching and optimization?
- Are there non-deterministic nodes that are acceptable to cache, e.g. for live execution? If so, what property do they have that makes it acceptable?
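
The first question can be illustrated with a small sketch of two candidate semantics for a random-number node feeding two consumers. This is not a proposal for how chaiNNer should behave; `run_shared` and `run_per_consumer` are hypothetical helpers, and a seeded `random.Random` is used only so the example is reproducible.

```python
import random
from typing import Callable, List


def run_shared(r: Callable[[], float], consumers: int) -> List[float]:
    """Evaluate R once per run; every consumer sees that single value."""
    value = r()
    return [value for _ in range(consumers)]


def run_per_consumer(r: Callable[[], float], consumers: int) -> List[float]:
    """Re-evaluate R for each consumer, so each one sees its own value."""
    return [r() for _ in range(consumers)]


rng = random.Random(0)  # seeded stand-in for the non-deterministic node R

# Semantics 1: A and B observe the same random number.
shared_a, shared_b = run_shared(rng.random, 2)

# Semantics 2: A and B observe different random numbers.
fresh_a, fresh_b = run_per_consumer(rng.random, 2)
```

Under the first semantics, `R` behaves like a deterministic node within a single run, which keeps per-run caching valid but leaves open what happens when `B` is evaluated many times inside an iterator. Under the second, the output depends on how often and in what order the executor evaluates `R`, so optimizations that duplicate or merge evaluations would change results.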