General Guidance on Style, Performance, Sanity
- Avoid `global` variables at all costs. All functions should be scoped to only their data inputs and the variables defined within. Leaking data outside of functions makes the code harder to debug, compose, and refactor.
- Use Python's gradual typing system to add type signatures to all functions, arguments, and returns. This will help a lot with modularization and your mental clarity while coding. See https://mypy-lang.org/ for how to use it!
- Unless returning a closure, don't nest functions inside of functions.
- Use runtime memory sparingly. Loading tons of results into a list is not ideal because it ultimately limits how many runs we can make on the model. Since Monte Carlo relies on the law of large numbers, we want to be able to support large numbers of runs. Let's take advantage of Python's generators and file buffers on the source and sink parts of the model, so data streams through instead of accumulating in memory.
- Any distribution requiring a calculation should be computed only once, either ahead of time or at the beginning of the modeling cycle. I believe in the current implementation, the distribution of prices is being computed again on every run.
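As a rough illustration of the points above, here is a minimal sketch with typed, self-contained functions, a distribution that is computed once, and a generator that streams runs instead of collecting them in a list. The function names (`precompute_prices`, `simulate_runs`, `running_mean`) and the uniform price range are placeholders for illustration, not part of the current model.

```python
import random
from typing import Iterable, Iterator, Sequence


def precompute_prices(seed: int, n: int) -> tuple[float, ...]:
    """Build a (hypothetical) price distribution once, up front."""
    rng = random.Random(seed)
    return tuple(rng.uniform(50.0, 150.0) for _ in range(n))


def simulate_runs(prices: Sequence[float], runs: int, seed: int) -> Iterator[float]:
    """Lazily yield one sampled price per run; no list of results is built."""
    rng = random.Random(seed)
    for _ in range(runs):
        yield rng.choice(prices)


def running_mean(values: Iterable[float]) -> float:
    """Consume a stream in constant memory and return its mean (0.0 if empty)."""
    total, count = 0.0, 0
    for value in values:
        total += value
        count += 1
    return total / count if count else 0.0


# The distribution is computed once and passed in, not rebuilt on every run.
prices = precompute_prices(seed=1, n=1_000)
mean_price = running_mean(simulate_runs(prices, runs=100_000, seed=2))
```

Because every function only touches its arguments, each piece can be tested or swapped out on its own, and memory use stays flat no matter how large `runs` gets.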
On Folds
Fold is a powerful concept from the functional programming domain. OCaml lists are linked lists, so fold is defined recursively and the implementations differ for the left and right sides. You can ignore the idea of "left" here, because Python lists are arrays rather than linked lists. In Python the same thing can be achieved with `functools.reduce`, as long as you always specify the optional argument for the starting piece of data. In OCaml the type signature of fold_left is defined as follows.
fold_left : ('a -> 'b -> 'a) -> 'a -> 'b list -> 'a
In general, folds are defined for "list-like" structures or streams that have 0 or many elements. If we break down the signature, fold_left is a higher order function that takes 3 arguments:
- A folding function `'a -> 'b -> 'a`, which combines something of type `'a` with another thing of type `'b` to create a new `'a`. This function defines each "step" of the fold: as it encounters each new `'b`, it combines it with the previously computed `'a` to produce the next `'a`.
- An initial element of type `'a`. We need this to account for the case where the `'b list` is empty. We also need it to begin the fold procedure on the first element, since no prior state has been computed. If we imagine that `'a = 'b = int` and we want to use a fold to sum a list, the initial element is 0. Whatever the target type of the fold is, the initial element has this type.
- The list of `'b` to fold. Again, this can be a stream or other iterable structure too, like a generator in Python.
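The three arguments map directly onto `functools.reduce` in Python. The sketch below wraps it in a small typed helper named `fold` (a name chosen here for illustration) with the arguments in the same order as the OCaml signature.

```python
from functools import reduce
from typing import Callable, Iterable, TypeVar

A = TypeVar("A")  # accumulator / target type ('a)
B = TypeVar("B")  # element type ('b)


def fold(step: Callable[[A, B], A], initial: A, items: Iterable[B]) -> A:
    """fold_left with OCaml's argument order, built on functools.reduce."""
    return reduce(step, items, initial)


def add(acc: int, x: int) -> int:
    """Folding function for the case 'a = 'b = int."""
    return acc + x


def count_chars(acc: int, s: str) -> int:
    """Folding function where 'a = int and 'b = str."""
    return acc + len(s)


total = fold(add, 0, [1, 2, 3, 4])                       # 10
empty_total = fold(add, 0, [])                           # 0: the initial element
squares_total = fold(add, 0, (n * n for n in range(4)))  # 14, folding a generator
char_count = fold(count_chars, 0, ["ab", "cde"])         # 5
```

Always passing the initial element is what makes the empty case well defined; without it, `reduce` raises a `TypeError` on an empty iterable.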
Folds in Protocol Modeling
Smart contracts in the abstract are state machines. A contract is initialized with a starting state, and a number of instructions cause its state to change. So if you imagine a stream of instructions being the thing to fold, and the contract's state as the target type (with its empty state as the initial element), we can construct a modeling engine for a smart contract as a specific case of fold.
state = { ... } // the internal data state of the protocol
instruction = A | B | C // enumeration of possible state changing methods
model: (state -> instruction -> state) -> state -> instruction list -> state
We can further evolve this approach if we want to model each step in the fold as a "day", by changing the instruction type to be a piece of data that may contain multiple instructions based on probabilities. Alternatively, a day can simply be a stream of instructions itself that gets concatenated to the previous day's instructions.
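To make the signature concrete, here is a toy Python version of `model` using the second option, where each day is its own stream and the days are concatenated. The `State` fields and the effect of each instruction are invented purely for illustration and do not reflect the real protocol.

```python
from dataclasses import dataclass, replace
from enum import Enum, auto
from functools import reduce
from itertools import chain
from typing import Callable, Iterable, Iterator


class Instruction(Enum):
    A = auto()
    B = auto()
    C = auto()


@dataclass(frozen=True)
class State:
    counter: int = 0  # hypothetical piece of protocol state
    steps: int = 0    # number of instructions applied so far


def step(state: State, instruction: Instruction) -> State:
    """state -> instruction -> state, returning a new State each time."""
    if instruction is Instruction.A:
        delta = 1
    elif instruction is Instruction.B:
        delta = -1
    else:
        delta = 10
    return replace(state, counter=state.counter + delta, steps=state.steps + 1)


def model(
    step_fn: Callable[[State, Instruction], State],
    initial: State,
    instructions: Iterable[Instruction],
) -> State:
    """The modeling engine is just a fold over the instruction stream."""
    return reduce(step_fn, instructions, initial)


def day(instructions: list[Instruction]) -> Iterator[Instruction]:
    """One day's worth of instructions as a lazy stream."""
    yield from instructions


days = [
    day([Instruction.A, Instruction.A]),
    day([Instruction.B]),
    day([Instruction.C]),
]
# Concatenate the per-day streams into one stream and fold over it.
final_state = model(step, State(), chain.from_iterable(days))
```

Here `final_state` ends up as `State(counter=11, steps=4)`, and an empty instruction stream simply returns the initial state.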
Recommendations for Model Refactor
- Separate distributions of events into their own module, and compute their values only once if they are data based. Otherwise, pass the existing configuration as an argument to a function that creates the distribution.
- Define a data structure encapsulating the model's state, a constructor for the initial state, and methods that can update the state. Try to keep things as immutable as possible, copying the state with new data instead of mutating it.
- Define the instructions we want to be able to process (e.g. MintHyUSD, MintLevSOL, UpdateNAV) as an enum. Define handlers that take the necessary inputs (e.g. NAV updates require the new collateral price, minting actions require the amount of collateral) and output a new state. Each handler of an instruction should be a function of the form `(inputs..., state) -> state`.
- Redefine the main loop of the model as a fold. Define a folding function that can be passed in. This function can pattern match on each instruction and call the corresponding handler. Python supports enums (the `enum` module) and structural pattern matching (`match`, since Python 3.10), so that will help a lot.
- Determine how to create a stream of instructions "per day", using the distributions, and emit them as separate `yield` lines in a generator. Take a look into how Python's generators work; they are lazy functions that continually create outputs. I believe you will be able to create an array of generators (one per day) and combine them all into one big generator to feed into the fold.
- We'll need to figure out a way to emit the output of each step in the fold to a file so we're not storing it in memory. It would be cool to have a tabulation of every state update in a file for analysis.
- Separate charting into its own module that is not connected to the model. The charts should just expect to read data and make charts, not be part of the modeling.