-
Notifications
You must be signed in to change notification settings - Fork 184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(bot-strategy-runner): add generalised bot runner #2851
Conversation
Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
…col/protocol into chrismaree/bot-stratergy-runner
Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
@@ -0,0 +1,96 @@ | |||
# @uma/bot-strategy-runner |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks really nasty like this in preview. take a look at the file here for something a bit easier to read.
@@ -0,0 +1,14 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sample config that will pull directly from the global whitelist.
Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
@UMAprotocol/eng this package does not yet have any unit tests. Thus far I've been manually testing this against all main net bots. |
Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
.eslintrc.js
Outdated
"@typescript-eslint/no-var-requires": 0, | ||
"@typescript-eslint/no-empty-function": 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to remove these rules globally or is it possible to just add an inline comment to ignore the rule where we violate it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried adding the inline rule for a few hours but could not get it to work as expected unfortunately. I agree with you that this is not ideal to have within the global rules. I will try again with the inline stuff and report back.
"https://raw.githubusercontent.com/UMAprotocol/protocol/master/packages/affiliates/payouts/devmining-status.json" | ||
], | ||
"globalAddressBlacklist": [ | ||
"0x516f595978D87B67401DaB7AfD8555c3d28a3af4", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are these addresses? Would be helpful as an example to explain whether they are EMP's or something else
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These ones are either expired or we dont have price feeds for them. There are some contracts we run dev mining on that we dont run feeds on. that's what these addresses are for.
How do you recommend I add comments? json does not support comments.
import { runBot } from "./BotEntryWrapper"; | ||
|
||
// Defaults for bot execution. If the user does not define these then these settings will be used. | ||
const defaultPollingDelay = 120; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not 5 mins?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can change it to that to be more consistant with the hub! I set it to this as the default polling delay within the bots is 2 mins.
|
||
// Defaults for bot execution. If the user does not define these then these settings will be used. | ||
const defaultPollingDelay = 120; | ||
const defaultStrategyTimeout = 60; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this timeout be 2 mins? I feel like 60 mins is too short, while anything over 2 mins is definitely an error. Empirically speaking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ye, this was mainly to match to the bots implementation. I'm hoary to make this 2 miss and then set the polling delay in to 5 mins. will be easier to reason about I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated to use 5 mins for the polling delay and 2 mins for timeout to be more conservative and consistant with the hub.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since these scripts are going to run bots in prod, similar to those in serverless-orchestration
, should we have unit tests? I can see us wanting to test any of the scripts in bot-strategy-runner/src
for example.
Co-authored-by: Matt Rice <matthewcrice32@gmail.com>
Co-authored-by: nicholaspai <9457025+nicholaspai@users.noreply.github.com>
sure. I can add unit tests for it but perhaps we can leave it for a separate PR as this is already pretty bloated and a lot of code. I dont think we will roll this into production just yet so it'll be ok to have untested code in master for a few days. |
Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is really cool, most of the complexity here is just wrangling the configs and logging, actually running the bots is really simple, which is great.
this is not a blocker, but i guess my main concern here is error handling. It seems like there's a pattern of avoiding real errors. There is probably a good reason for why you do this though, maybe its just carry over from previous design or for the sake of clean logging, but I think things would be easier to debug if we embraced throwing often. Have a top level error handler which handles formatting the error into a log, rather than doing this deep in the code or requiring logger to understand context of the log.
} | ||
|
||
// Takes in two objects of any depth and merges them. In the case of a collision in keys the second object will take priority. | ||
export function mergeConfig(...args: any) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fyi lodash has several utilities for this, these look most closely to what you want:
this is tricky code and probably should have a unit test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ye, this is a better solution! thanks for pointing me towards them. I'll replace this custom method with the lodash one.
} | ||
|
||
// Takes in an object of any structure and returns the exact same object with all strings converted to check sum format. | ||
function replaceAddressCase(object: any) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is cool
allBotsConfigs, | ||
(botConfig: any) => | ||
Promise.all([ | ||
_updateProgressBar(botConfig), // As the promise progresses through bots update the progress bar. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fancy
const responseJson = await Promise.all(responses.map((response: any) => response.json())); | ||
responseJson.forEach((contractAddressesResponse: any) => { | ||
if (contractAddressesResponse.empWhitelist) whitelist = [...whitelist, ...contractAddressesResponse.empWhitelist]; | ||
else console.log("Global Whitelist file does not have the `empWhitelist` key or is malformed"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not throw error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ye, you right. will throw an error here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well actually there was a reason for this. it is possible that the global whitelist is malformed but everything else is still fine. in that case the way it is structure here the bot will continue to work as expected, minus the ones from the whitelist.
However, this is pretty difficult to reason about so I think we should ideally throw an error like you prescribed. have updated accordingly.
try { | ||
// eslint-disable-next-line prefer-const | ||
let { timestamp, level, error, ...args } = info; | ||
if (error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a way the logger could not care about the data its logging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so the reason it needs to do this is that the error object is of type Error
and this does not log nicely at all. If you want to store this within a json object you cant easily read it or consume it within a nested structure (such as within a GCP log). This code here basically strips out all nasty punctuation and enables this to be treated as a simple string.
I'm curious how you'd prefer this to be consumed. very open to any alternative pattern!
// Returns a promise that is resolved after `seconds` delay. Used to limit how long a spoke can run for. | ||
const _rejectAfterDelay = (seconds: number | undefined, executionConfig: any) => | ||
new Promise((resolve, _) => { | ||
setTimeout(resolve, (seconds ? seconds : defaultStrategyTimeout) * 1000, { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be rejecting based on the function name, but i understand why you would rather resolve. we are seeing the logging system leaking into completely unrelated parts of the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, this is a bit confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ye, this is a great point and something I spent a few hours trying to find a better pattern on but ended up settling on this.
The main issue is if the promise rejects (or an error is thrown within the promise) then you cant use bluebird promises. I tried out using node async and p-limit which all worked pretty well but their error handling was really poor. Node async did not have any pattern that was workable with this (that I could find). p-limit was better as you could use promise.allSettled
which can handle errors via a rejected
status but the code was more convoluted and even harder to read. We do something like this in the serverless hub. you can check it out here
if (result.status == "rejected") { |
The pattern I ended up going for was to ensure that promises never error or reject. For example, the bot-wrapper has a try-catch block wrapping any code that could error. Then, append a standard error
prop to the response from the method and check for this after the fact. This standardizes things over all promises within this context.
My question is: how would you recommend dealing with rejected promises? what pattern would you follow if you want to handle concurrency like with a bluebird promise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I hear you. Promise.allSettled not being available in bluebird is annoying and makes this inconvenient. I think your current implementation is reasonable!
const target: any = {}; // create a new object | ||
|
||
// deep merge the object into the target object | ||
const merger = (obj: any) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: why not call this method deepMerge?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed this method and am using lodash after Davids comment :)
// Takes in an object of any structure and returns the exact same object with all strings converted to check sum format. | ||
function replaceAddressCase(object: any) { | ||
const stringifiedObject = JSON.stringify(object); | ||
const replacedStringifiedObject = stringifiedObject.replace(/0x[a-fA-F0-9]{40}/g, toChecksumAddress); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this will cause any problems, but is it possible that this will mess with byte strings? I'm assuming that even if it does, it won't break anything. Just wanted to mention it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ye, it could but I think that this will only modify strings of length 40 starting with 0x. bytestrings tend to be longer at 64 chars (bytes32) with a 0x at the front.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it will touch any strings of length greater than 40 hex characters (since the first 40 hex characters would satisfy the regex). Regardless, I'm not too worried about checksumming random byte strings > 40 characters -- just might look a little weird in the logs.
My only point was to raise the issue in case you thought this could break something (like toChecksum erroring on non-addresses or common interpreters breaking if we paste in bytes with varied casings).
// Execute all bots in a bluebird map with limited concurrency. Note that none of the `runBot` calls ever throw errors. | ||
// Rather, they indicate an error via an `error` key within the response. The promise.race between the `runBot` and | ||
// _rejectAfterDelay bounds how long a given strategy can run for. | ||
const executionResults = await (bluebird as any).map( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you have to do this cast? Do the bluebird types not contain the map function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok I was able to address this by changing the import type. now imported as import bluebird from "bluebird";
and it works as expected without the typecast :)
// _rejectAfterDelay bounds how long a given strategy can run for. | ||
const executionResults = await (bluebird as any).map( | ||
allBotsConfigs, | ||
(botConfig: any) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we have well-defined types for the bot configs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup! good point. I have imported them and used them in this mapping.
// Construct bot configs for all enabled bots. This includes all bot specific overrides, whitelists, blacklists and any | ||
// extra settings defined in the config. See the readme for full details on what is possible with this config. | ||
const allBotsConfigs = await buildBotConfigs(globalWhiteList, strategyRunnerConfig); | ||
console.log("allBotsConfigs", allBotsConfigs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this console.log supposed to be here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nope. have remove it!
// Returns a promise that is resolved after `seconds` delay. Used to limit how long a spoke can run for. | ||
const _rejectAfterDelay = (seconds: number | undefined, executionConfig: any) => | ||
new Promise((resolve, _) => { | ||
setTimeout(resolve, (seconds ? seconds : defaultStrategyTimeout) * 1000, { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, this is a bit confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Left a few minor comments.
One high-level question: what do you think we could do to bring down the number of any
declarations in the code? Just convert all the other packages to ts or are they inevitable?
Co-authored-by: Matt Rice <matthewcrice32@gmail.com>
…col/protocol into chrismaree/bot-stratergy-runner
Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
Thanks for your review! I have addressed all your comments/questions and nits. regarding your @mrice32 what's your opinion on testing in this package? |
Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
@chrismaree is the question whether it's okay to not have tests for it? Or how we should structure tests? As far as how you would structure a test, I think we could do limited testing with a sample config to prove:
However, I'm not super opinionated. What are your thoughts? |
It was mainly about the structure of the tests. I agree with your breakdown. I will leave this for a separate PR though as there is a fair amount of code already in this PR! I also think we should have a test that validates configs are built correctly in the right structure. |
Motivation
As the number of bots within the ecosystem grows so we need more scalable techniques for running bots. This PR adds a generalised bot runner that can pull addresses from a whitelist (such as the dev mining white list) and use this to execute a series of bots.
Summary
The design implemented in this PR is meant to accommodate community members running bots at home or on a VPS as well as UMA's serverless bot running infrastructure.
Details
This implementation contains a few key components:
ConfigBuilder.ts
constructs configs to execute different kinds of bots (liquidator, disputer and monitor currently supported). Contains some helper functions for building white lists and fetching remote config files.LogCaptureTransport.ts
a custom Winston transport to capture logs produced by the strategy runner.BotEntryWrapper.ts
but runner that wraps the bots entry point, enabling generalized bot execution.index.ts
file containing all the logic to pull all the parts together and all concurrency logic.For details on how to run the strategy runner (and other descriptions of the configs) please see the README.md file contained in this PR.
See this gif for what the progress bar looks like 👯
Testing