Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(bot-strategy-runner): add generalised bot runner #2851

Merged
merged 21 commits into from
Apr 21, 2021

Conversation

chrismaree
Copy link
Member

@chrismaree chrismaree commented Apr 12, 2021

Motivation

As the number of bots within the ecosystem grows so we need more scalable techniques for running bots. This PR adds a generalised bot runner that can pull addresses from a whitelist (such as the dev mining white list) and use this to execute a series of bots.

Summary

The design implemented in this PR is meant to accommodate community members running bots at home or on a VPS as well as UMA's serverless bot running infrastructure.

Details

This implementation contains a few key components:

  1. ConfigBuilder.ts constructs configs to execute different kinds of bots (liquidator, disputer and monitor currently supported). Contains some helper functions for building white lists and fetching remote config files.
  2. LogCaptureTransport.ts a custom Winston transport to capture logs produced by the strategy runner.
  3. BotEntryWrapper.ts but runner that wraps the bots entry point, enabling generalized bot execution.
  4. index.ts file containing all the logic to pull all the parts together and all concurrency logic.

For details on how to run the strategy runner (and other descriptions of the configs) please see the README.md file contained in this PR.

See this gif for what the progress bar looks like 👯

Testing

  • Ran end-to-end test, running the code as in production
  • New unit tests created
  • Existing tests adequate, no new tests required
  • All existing tests pass
  • Untested

Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
…col/protocol into chrismaree/bot-stratergy-runner
Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
@@ -0,0 +1,96 @@
# @uma/bot-strategy-runner
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks really nasty like this in preview. take a look at the file here for something a bit easier to read.

@@ -0,0 +1,14 @@
{
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sample config that will pull directly from the global whitelist.

Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
@chrismaree chrismaree marked this pull request as ready for review April 13, 2021 15:13
@chrismaree chrismaree requested review from mrice32 and nicholaspai and removed request for mrice32 April 13, 2021 15:13
@chrismaree
Copy link
Member Author

@UMAprotocol/eng this package does not yet have any unit tests. Thus far I've been manually testing this against all main net bots.

@chrismaree chrismaree requested review from mrice32 and daywiss April 14, 2021 08:36
.eslintrc.js Outdated
Comment on lines 30 to 31
"@typescript-eslint/no-var-requires": 0,
"@typescript-eslint/no-empty-function": 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to remove these rules globally or is it possible to just add an inline comment to ignore the rule where we violate it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried adding the inline rule for a few hours but could not get it to work as expected unfortunately. I agree with you that this is not ideal to have within the global rules. I will try again with the inline stuff and report back.

packages/bot-strategy-runner/README.md Outdated Show resolved Hide resolved
"https://raw.githubusercontent.com/UMAprotocol/protocol/master/packages/affiliates/payouts/devmining-status.json"
],
"globalAddressBlacklist": [
"0x516f595978D87B67401DaB7AfD8555c3d28a3af4",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are these addresses? Would be helpful as an example to explain whether they are EMP's or something else

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These ones are either expired or we dont have price feeds for them. There are some contracts we run dev mining on that we dont run feeds on. that's what these addresses are for.

How do you recommend I add comments? json does not support comments.

import { runBot } from "./BotEntryWrapper";

// Defaults for bot execution. If the user does not define these then these settings will be used.
const defaultPollingDelay = 120;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not 5 mins?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can change it to that to be more consistant with the hub! I set it to this as the default polling delay within the bots is 2 mins.


// Defaults for bot execution. If the user does not define these then these settings will be used.
const defaultPollingDelay = 120;
const defaultStrategyTimeout = 60;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this timeout be 2 mins? I feel like 60 mins is too short, while anything over 2 mins is definitely an error. Empirically speaking.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ye, this was mainly to match to the bots implementation. I'm hoary to make this 2 miss and then set the polling delay in to 5 mins. will be easier to reason about I think

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated to use 5 mins for the polling delay and 2 mins for timeout to be more conservative and consistant with the hub.

Copy link
Member

@nicholaspai nicholaspai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these scripts are going to run bots in prod, similar to those in serverless-orchestration, should we have unit tests? I can see us wanting to test any of the scripts in bot-strategy-runner/src for example.

chrismaree and others added 2 commits April 15, 2021 10:15
Co-authored-by: Matt Rice <matthewcrice32@gmail.com>
Co-authored-by: nicholaspai <9457025+nicholaspai@users.noreply.github.com>
@chrismaree
Copy link
Member Author

Since these scripts are going to run bots in prod, similar to those in serverless-orchestration, should we have unit tests? I can see us wanting to test any of the scripts in bot-strategy-runner/src for example.

sure. I can add unit tests for it but perhaps we can leave it for a separate PR as this is already pretty bloated and a lot of code. I dont think we will roll this into production just yet so it'll be ok to have untested code in master for a few days.

Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
Copy link
Contributor

@daywiss daywiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is really cool, most of the complexity here is just wrangling the configs and logging, actually running the bots is really simple, which is great.

this is not a blocker, but i guess my main concern here is error handling. It seems like there's a pattern of avoiding real errors. There is probably a good reason for why you do this though, maybe its just carry over from previous design or for the sake of clean logging, but I think things would be easier to debug if we embraced throwing often. Have a top level error handler which handles formatting the error into a log, rather than doing this deep in the code or requiring logger to understand context of the log.

}

// Takes in two objects of any depth and merges them. In the case of a collision in keys the second object will take priority.
export function mergeConfig(...args: any) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fyi lodash has several utilities for this, these look most closely to what you want:

this is tricky code and probably should have a unit test

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ye, this is a better solution! thanks for pointing me towards them. I'll replace this custom method with the lodash one.

}

// Takes in an object of any structure and returns the exact same object with all strings converted to check sum format.
function replaceAddressCase(object: any) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is cool

allBotsConfigs,
(botConfig: any) =>
Promise.all([
_updateProgressBar(botConfig), // As the promise progresses through bots update the progress bar.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fancy

const responseJson = await Promise.all(responses.map((response: any) => response.json()));
responseJson.forEach((contractAddressesResponse: any) => {
if (contractAddressesResponse.empWhitelist) whitelist = [...whitelist, ...contractAddressesResponse.empWhitelist];
else console.log("Global Whitelist file does not have the `empWhitelist` key or is malformed");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not throw error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ye, you right. will throw an error here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well actually there was a reason for this. it is possible that the global whitelist is malformed but everything else is still fine. in that case the way it is structure here the bot will continue to work as expected, minus the ones from the whitelist.

However, this is pretty difficult to reason about so I think we should ideally throw an error like you prescribed. have updated accordingly.

try {
// eslint-disable-next-line prefer-const
let { timestamp, level, error, ...args } = info;
if (error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a way the logger could not care about the data its logging?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so the reason it needs to do this is that the error object is of type Error and this does not log nicely at all. If you want to store this within a json object you cant easily read it or consume it within a nested structure (such as within a GCP log). This code here basically strips out all nasty punctuation and enables this to be treated as a simple string.

I'm curious how you'd prefer this to be consumed. very open to any alternative pattern!

// Returns a promise that is resolved after `seconds` delay. Used to limit how long a spoke can run for.
const _rejectAfterDelay = (seconds: number | undefined, executionConfig: any) =>
new Promise((resolve, _) => {
setTimeout(resolve, (seconds ? seconds : defaultStrategyTimeout) * 1000, {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be rejecting based on the function name, but i understand why you would rather resolve. we are seeing the logging system leaking into completely unrelated parts of the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, this is a bit confusing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ye, this is a great point and something I spent a few hours trying to find a better pattern on but ended up settling on this.

The main issue is if the promise rejects (or an error is thrown within the promise) then you cant use bluebird promises. I tried out using node async and p-limit which all worked pretty well but their error handling was really poor. Node async did not have any pattern that was workable with this (that I could find). p-limit was better as you could use promise.allSettled which can handle errors via a rejected status but the code was more convoluted and even harder to read. We do something like this in the serverless hub. you can check it out here

if (result.status == "rejected") {

The pattern I ended up going for was to ensure that promises never error or reject. For example, the bot-wrapper has a try-catch block wrapping any code that could error. Then, append a standard error prop to the response from the method and check for this after the fact. This standardizes things over all promises within this context.

My question is: how would you recommend dealing with rejected promises? what pattern would you follow if you want to handle concurrency like with a bluebird promise?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I hear you. Promise.allSettled not being available in bluebird is annoying and makes this inconvenient. I think your current implementation is reasonable!

const target: any = {}; // create a new object

// deep merge the object into the target object
const merger = (obj: any) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why not call this method deepMerge?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed this method and am using lodash after Davids comment :)

// Takes in an object of any structure and returns the exact same object with all strings converted to check sum format.
function replaceAddressCase(object: any) {
const stringifiedObject = JSON.stringify(object);
const replacedStringifiedObject = stringifiedObject.replace(/0x[a-fA-F0-9]{40}/g, toChecksumAddress);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this will cause any problems, but is it possible that this will mess with byte strings? I'm assuming that even if it does, it won't break anything. Just wanted to mention it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ye, it could but I think that this will only modify strings of length 40 starting with 0x. bytestrings tend to be longer at 64 chars (bytes32) with a 0x at the front.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will touch any strings of length greater than 40 hex characters (since the first 40 hex characters would satisfy the regex). Regardless, I'm not too worried about checksumming random byte strings > 40 characters -- just might look a little weird in the logs.

My only point was to raise the issue in case you thought this could break something (like toChecksum erroring on non-addresses or common interpreters breaking if we paste in bytes with varied casings).

packages/bot-strategy-runner/src/index.ts Outdated Show resolved Hide resolved
// Execute all bots in a bluebird map with limited concurrency. Note that none of the `runBot` calls ever throw errors.
// Rather, they indicate an error via an `error` key within the response. The promise.race between the `runBot` and
// _rejectAfterDelay bounds how long a given strategy can run for.
const executionResults = await (bluebird as any).map(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you have to do this cast? Do the bluebird types not contain the map function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

otherwise I get this error:
image

I have installed "@types/bluebird" so I'm not sure why this is the case. will do some more playing

Copy link
Member Author

@chrismaree chrismaree Apr 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok I was able to address this by changing the import type. now imported as import bluebird from "bluebird"; and it works as expected without the typecast :)

// _rejectAfterDelay bounds how long a given strategy can run for.
const executionResults = await (bluebird as any).map(
allBotsConfigs,
(botConfig: any) =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we have well-defined types for the bot configs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup! good point. I have imported them and used them in this mapping.

// Construct bot configs for all enabled bots. This includes all bot specific overrides, whitelists, blacklists and any
// extra settings defined in the config. See the readme for full details on what is possible with this config.
const allBotsConfigs = await buildBotConfigs(globalWhiteList, strategyRunnerConfig);
console.log("allBotsConfigs", allBotsConfigs);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this console.log supposed to be here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope. have remove it!

// Returns a promise that is resolved after `seconds` delay. Used to limit how long a spoke can run for.
const _rejectAfterDelay = (seconds: number | undefined, executionConfig: any) =>
new Promise((resolve, _) => {
setTimeout(resolve, (seconds ? seconds : defaultStrategyTimeout) * 1000, {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, this is a bit confusing.

Copy link
Member

@mrice32 mrice32 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Left a few minor comments.

One high-level question: what do you think we could do to bring down the number of any declarations in the code? Just convert all the other packages to ts or are they inevitable?

chrismaree and others added 5 commits April 16, 2021 10:10
Co-authored-by: Matt Rice <matthewcrice32@gmail.com>
Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
…col/protocol into chrismaree/bot-stratergy-runner
Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
@chrismaree
Copy link
Member Author

Looks great! Left a few minor comments.

One high-level question: what do you think we could do to bring down the number of any declarations in the code? Just convert all the other packages to ts or are they inevitable?

Thanks for your review! I have addressed all your comments/questions and nits.

regarding your any comment you right that the main thing is just converted things in ts and declaring more interfaces.

@mrice32 what's your opinion on testing in this package?

Signed-off-by: Christopher Maree <christopher.maree@gmail.com>
@mrice32
Copy link
Member

mrice32 commented Apr 20, 2021

Looks great! Left a few minor comments.
One high-level question: what do you think we could do to bring down the number of any declarations in the code? Just convert all the other packages to ts or are they inevitable?

Thanks for your review! I have addressed all your comments/questions and nits.

regarding your any comment you right that the main thing is just converted things in ts and declaring more interfaces.

@mrice32 what's your opinion on testing in this package?

@chrismaree is the question whether it's okay to not have tests for it? Or how we should structure tests?

As far as how you would structure a test, I think we could do limited testing with a sample config to prove:

  1. That the runner is reading the config correctly.
  2. That the runner acts as expected on error.

However, I'm not super opinionated. What are your thoughts?

@chrismaree
Copy link
Member Author

Looks great! Left a few minor comments.
One high-level question: what do you think we could do to bring down the number of any declarations in the code? Just convert all the other packages to ts or are they inevitable?

Thanks for your review! I have addressed all your comments/questions and nits.
regarding your any comment you right that the main thing is just converted things in ts and declaring more interfaces.
@mrice32 what's your opinion on testing in this package?

@chrismaree is the question whether it's okay to not have tests for it? Or how we should structure tests?

As far as how you would structure a test, I think we could do limited testing with a sample config to prove:

  1. That the runner is reading the config correctly.
  2. That the runner acts as expected on error.

However, I'm not super opinionated. What are your thoughts?

It was mainly about the structure of the tests. I agree with your breakdown. I will leave this for a separate PR though as there is a fair amount of code already in this PR! I also think we should have a test that validates configs are built correctly in the right structure.

@chrismaree chrismaree merged commit a748107 into master Apr 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants