
Add op-deployer design #71

Merged · 3 commits · Sep 16, 2024
Conversation

mslipper (Contributor) commented Sep 3, 2024

Adds a design doc for `op-deployer`, a standalone chain deployment tool.

To fix this, we will:

- Remove the L1 genesis variables from the deploy config. Deploying a new L1 should be a separate process from
Contributor

@protolambda split up the deploy config into themes already, but did it in a backwards-compatible way so that there are no breaking changes to the JSON. Would it be useful to have a migration tool, one that takes a legacy deploy config as input and outputs a new split-up deploy config?

Contributor Author

I can certainly do this, but IMO we should just migrate everything all at once to the new format.

Contributor

I did, but realistically the deploy config still has tech debt, in the naming of attributes and in other ways. I would center op-deployer around OPSM configs.

- Remove legacy config variables that are no longer used, like `l1UseClique` or `deploymentWaitConfirmations`.
- Split the deploy config into multiple stanzas, so that irrelevant config variables can be ignored. For example,
the DA configuration must be specified right now for the config to validate, even when it's not used by a given chain.
- Remove booleans that are used to enable/disable features. This data belongs inside of the deployment intent, not
Contributor

Ah this is interesting - note that this will require a decent refactor to the way the smart contracts work. It sounds like we will need to modularize some functionality so that it can be composed together and have different entrypoints rather than 1 entrypoint with a config with branching logic

Contributor Author

> modularize some functionality so that it can be composed together and have different entrypoints rather than 1 entrypoint with a config with branching logic

Yes 100%. I'm happy to help with this as part of this project.

Contributor

I like the idea of making the config / intent modular, rather than putting feature-booleans in a huge config

## Goals

- Provide a canonical way to deploy chains.
- Enable deploying multiple L2s to a single L1 as part of a single logical Superchain.
Contributor

What do you think about "Deploying multiple superchains to a single L1" as another goal? For example the devnet and testnet both have superchains on L1 Sepolia, so this should help with spinning up devnets. It also helps validate the architecture is sufficiently modular

Contributor Author

The tooling should support that use case. I imagine you'd do that by defining another deployment intent, and running that through the tooling. Is there a use case for defining multiple superchains and multiple L2s in the same deployment intent?

Contributor

I'm not clear on what exactly a deployment intent is and what execution of it does. Can it be scoped to a single step of the pipeline? What part of the intent file maps to a pipeline step, or is it one intents file per step?

Either way, yes a separate invocation with a different deployment intent to support this seems ok to me

Contributor Author

A deployment intent is a declarative way of describing how your deployment will look. The tooling then looks at the intent and decides what changes it needs to make on L1/L2 to make the real deployment match the intent. It's like Terraform, but for blockchains. I can clarify this more in the doc.

To your questions specifically:

> Can it be scoped to a single step of the pipeline?

No, the deployer would decide which steps it needs to run based on the changes it sees to the deployment intent.

> What part of the intent file maps to a pipeline step, or is it one intents file per step?

There is one intent per file, defined in the `[intent]` stanza.
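Only the opening of the `[intent]` stanza appears in this thread. Purely as an illustrative sketch (every field below other than `l1ChainID` is hypothetical, not op-deployer's actual schema), a single-file intent describing multiple L2s might look like:

```toml
[intent]
l1ChainID = 11_155_111

# Hypothetical: each L2 in the deployment gets its own entry, so one
# intent file can describe a whole Superchain declaratively.
[[intent.chains]]
l2ChainID = 42_069

[[intent.chains]]
l2ChainID = 42_070
```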


Will op-deployer support incremental deployments (i.e., only deploy the minimal set of components that were modified as a result of an intent mutation)?

Contributor Author

Correct, it will.

Comment on lines +24 to +26
- **opc** deploys new chains to our cloud infrastructure.
- The **compose devnet** uses a bunch of custom Python tooling to deploy a chain locally using `docker-compose`.
- The **Kurtosis devnet** uses a combination of custom deployer scripts and the fault proof's `getting-started.sh` to
Contributor

Can we add links to these tools for people (like me) that aren't familiar with all of them?

Contributor Author

Sure! Will update, though opc is in a private repo of ours so it won't be very useful for non-Labs folks.


```toml
[intent]
l1ChainID = 11_155_111
```

Contributor

Don't want to get too into the weeds of input file structure here, but we may want to include RPC URL too, since sometimes chain ID 11_155_111 might be L1 sepolia, but other times it might be a local fork

Contributor Author

I think we should separate the RPC URL, since that's more the environment that the deployer uses than an inherent property of the deployment itself.

ecosystem/op-deployer.md (outdated; resolved)
Comment on lines +62 to +63
deploy. `op-deployer` will then diff the intent against the current state of the deployment, and run the pipeline to
ensure that the real deployment matches the intent.
Contributor

> diff the intent against the current state of the deployment

This part is throwing me off a bit, when in the pipeline is a diff occurring and what exactly are we diffing against?

Contributor Author

It's diffing its internal state against what's actually deployed. For example, the L1 deployer step would check to see if the contracts are actually deployed to L1 before executing. If they aren't, it'll perform the deployment.
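A minimal sketch of that check-then-act behavior, with hypothetical types and names (op-deployer's real pipeline API may look quite different): each stage inspects the deployed state and only executes when reality diverges from the intent, which is also what makes re-runs and incremental deployments cheap.

```go
package main

import "fmt"

// DeployedState is a hypothetical view of what already exists on L1.
type DeployedState struct {
	ContractsDeployed bool
}

// Stage is one hypothetical step of the deployment pipeline: Run is only
// invoked when NeedsRun reports that reality diverges from the intent.
type Stage struct {
	Name     string
	NeedsRun func(DeployedState) bool
	Run      func(*DeployedState)
}

// Execute walks the pipeline, skipping stages whose work is already done,
// and returns the names of the stages that actually ran.
func Execute(stages []Stage, st *DeployedState) []string {
	var ran []string
	for _, s := range stages {
		if !s.NeedsRun(*st) {
			continue // deployed state already matches the intent; skip
		}
		s.Run(st)
		ran = append(ran, s.Name)
	}
	return ran
}

func main() {
	st := &DeployedState{ContractsDeployed: true}
	stages := []Stage{{
		Name:     "deploy-l1-contracts",
		NeedsRun: func(s DeployedState) bool { return !s.ContractsDeployed },
		Run:      func(s *DeployedState) { s.ContractsDeployed = true },
	}}
	// Contracts are already on L1, so the stage is skipped entirely.
	fmt.Println(len(Execute(stages, st))) // prints 0
}
```

Running the same pipeline twice is then naturally idempotent: the second run finds nothing to do.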


To fix this, we will:

- Remove the L1 genesis variables from the deploy config. Deploying a new L1 should be a separate process from
Contributor

> Remove the L1 genesis variables from the deploy config

Are deploy configs still a thing as part of op-deployer, or are we replacing deploy config with the intent input files?

Contributor Author

They'll be a thing for the time being. In the future I'd like to modularize things so that we can move away from a single huge monolithic config.

Contributor Author

To clarify: The intents file + its state will replace the day-to-day usage of the deployment config. Some tools (like the deploy scripts) will continue to use the deploy config until we refactor to something else.


If the deploy configs (or other key chain artifacts) are modified, we'll want to be careful about how that may impact / break the Superchain Registry. I recommend we assess this impact early / upfront.

deploying and upgrading OP Chains remains a complicated and frustrating process. Users are directed to a tutorial
with [18 manual steps](https://docs.optimism.io/builders/chain-operators/tutorials/create-l2-rollup) to follow in order
to create their own testnet. Within Labs, we use a combination of different tools depending on where the chain is to be
hosted:
Contributor

I believe the superchain-ops-team uses some variant of the deploy-scripts, with some instrumentation for Safe contracts / deploy-tx reliability. I'm not entirely familiar with it, but I think it should be in this list.

Contributor Author

Ok, I'll track this down.

Comment on lines +51 to +52
that each perform a single piece of the deployment. The stages each take an input file, and enrich it with the data
pertaining to that particular stage of the deployment. This allows the output of upstream stages to be cached.
Contributor

Note that after deploying an L2 to an L1, by updating the L1 genesis.json, we end up modifying the L1 genesis block hash, which affects the L2 genesis of any previously deployed L2. So we have to interleave L2 deployments by first doing all the L1 parts, then "freeze" the L1 genesis, and then all the L2 genesis parts.

So I think we need the concept of a "finalized" config, and a config that can still mutate in later pipeline steps.

Contributor Author

What do you mean that the L1 genesis.json changes? Are you referring to the case where you try to deploy a bunch of chains in the L1's genesis?

Contributor

Yes, if we want to pre-deploy multiple L2s to the same L1, merging them into a unified genesis state, then that needs to wrap up before we can create any of the L2 rollup configs, due to the dependency on the L1 genesis block hash. So it's important to separate the "L2 deploy" into a "deploy L2 contracts to L1" and a "create L2 genesis" step, where the genesis steps for each L2 can run after the L1 state is final.
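To make the ordering constraint concrete, here is a hedged sketch (all types and names hypothetical, using a toy hash in place of real genesis hashing): every L2's contracts are merged into the L1 genesis first, the genesis is frozen, and only then is each L2's rollup config derived, because each config embeds the final L1 genesis hash.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// L1Genesis is a simplified stand-in for the L1 genesis state.
type L1Genesis struct{ Alloc []string }

// Hash mimics how the genesis block hash depends on the full state:
// add one more L2's contracts and the hash changes.
func (g *L1Genesis) Hash() [32]byte {
	h := sha256.New()
	for _, a := range g.Alloc {
		h.Write([]byte(a))
	}
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

// RollupConfig embeds the L1 genesis hash, which is why it can only be
// derived after the L1 state is final.
type RollupConfig struct {
	ChainID       uint64
	L1GenesisHash [32]byte
}

// Deploy runs the two phases in order: all L1 work first, then every
// per-L2 config against the frozen L1 genesis.
func Deploy(l2ChainIDs []uint64) []RollupConfig {
	g := &L1Genesis{}
	// Phase 1: merge every L2's contracts into the shared L1 genesis.
	for _, id := range l2ChainIDs {
		g.Alloc = append(g.Alloc, fmt.Sprintf("contracts-for-%d", id))
	}
	// Phase 2: the L1 genesis is now frozen; derive each rollup config.
	frozen := g.Hash()
	cfgs := make([]RollupConfig, 0, len(l2ChainIDs))
	for _, id := range l2ChainIDs {
		cfgs = append(cfgs, RollupConfig{ChainID: id, L1GenesisHash: frozen})
	}
	return cfgs
}

func main() {
	cfgs := Deploy([]uint64{10, 8453})
	// Both L2s reference the same, final L1 genesis hash.
	fmt.Println(cfgs[0].L1GenesisHash == cfgs[1].L1GenesisHash) // prints true
}
```

Interleaving the phases per-L2 instead would hand earlier L2s a stale hash, which is exactly the bug this ordering avoids.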

Comment on lines +87 to +88
Under the hood, `op-deployer` will run through the deployment pipeline. The output of each intermediate stage will
be stored by default within the intent's TOML file, however the implementation of the "state store" will be
Contributor

Please don't make configs self-mutating. Separate inputs and outputs make caching, maintainability, etc. much easier. One of my biggest regrets about the current DeployConfig is that the deployments addresses get looped back into the config struct. Cyclic dependencies are hell.

Contributor Author

Totally agree. There will be no cyclic dependencies. The idea is that the config only gets appended to - each stanza of the state is its own immutable thing, separate from all the others. The only reason they're being put in the same file is so that we have a place to store that state offline between subsequent runs of the tool.
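As a purely illustrative sketch of that append-only layout (stanza and field names are hypothetical, not op-deployer's actual format), the file might grow like this across runs, with the hand-written intent untouched and each stage adding its own immutable output:

```toml
[intent]              # authored by the operator; never mutated by tooling
l1ChainID = 11_155_111

[state.l1Contracts]   # appended by the L1 deploy stage on the first run
# (deployed contract addresses recorded here once the txs land)

[state.l2Genesis]     # appended by the L2 genesis stage on a later run
# (genesis artifacts recorded here; earlier stanzas are never rewritten)
```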

Contributor Author

If it's helpful I can have the state and the intent be separate files.

Contributor

If the config is incremental, like a list of things we only append to, I'm fine with it. But sparse changes to an existing config quickly become a hard-to-track state machine, which adds more complexity.


![](./op-deployer/pipeline.png)

## Deployment Intents
Contributor

Some questions:

  1. Will a deployment intent have a standard interface of some kind?
  2. Will a deployment intent be executable from op-e2e?
  3. Will a deployment intent be coupled to a strict pre-state? In particular, is an intent to deploy X to a local L1 genesis (aka statedump) different from an intent to produce a list of template deployment txs (broadcast output)?
  4. Will every intent run with the forge tooling? Or can there be intents that are not forge?
  5. How does an "interrupted deployment" get recovered? One of the pain-points product hit is that they run forge, then have to CTRL+C because of gas spike issues, and then have to start over from the beginning.
  6. Will we integrate the source-code verify step? (API calls to etherscan/blockscout). Ideally I think the deployment-tools produce a dump, such that the verification can happen later, in case a new explorer comes online.
  7. Does this tool include L2Genesis?
  8. Will intents have defaults? Maybe defaults sourced from the superchain-registry?
  9. How does a pipeline execute, when it involves intermittent pauses for e.g. deployment-tx signing? Does it persist, and continue from where it left off, after a signing phase maybe?
  10. Can we add a --noninteractive kind of flag, that auto-signs with some hotkey, for testing purposes?

Personally I'd also suggest running op-deployer intents fully in Go, or at least having some flexibility to do so.
Otherwise each intent will rely on sub-process calls to forge, intermediate config files and output files, and many extra consistency checks.
With forge-scripts in Go, it can be packaged up, and do exactly as the intent is programmed to, without interacting with anything outside of the process, other than an artifacts tarball. I think this makes it much more reproducible in different environments.
The drawback however is that we don't use standard forge. Maybe with the right abstraction it's interchangeable, and forge is version-checked, inputs are all in unique temporary directories, etc.

Contributor Author

Answers below:

  1. Yes - my strawman interface is the one in the `[intent]` stanza in the design doc.
  2. Yes - since each deployment stage is a Go function, it'll be easy to call from op-e2e.
  3. The idea is that it won't matter where you deploy the chain to. In practice we'll have to implement different "backends" in the deployment tool to support both statedump and deployed L1s.
  4. No, intents can run without Forge. I'd actually like to move away from shelling out to Forge whenever possible.
  5. Stages commit to the intents TOML to store their state periodically. This allows for the pipeline to recover halfway through.
  6. Not right now, but we can add tooling to do this in the future or produce a dump like you said.
  7. I'd like it to leverage the Go forge-scripts tooling you built to obviate the need for the L2Genesis Forge script.
  8. Yes, intents will have defaults. The default config is the standard config. This can be defined in the registry.
  9. It will persist its state inside the intents TOML to allow for this kind of behavior.
  10. Sure!

In general I agree that we want to run the intents fully in Go. Some stuff requires shelling out to Forge right now, but in the future I'd like to refactor everything we can away from that.

Co-authored-by: Matt Solomon <matt@mattsolomon.dev>
mslipper (Contributor Author) commented Sep 4, 2024

Next steps:


Only the minimum necessary state is stored. For example, the rollup config can be derived from deploy config
variables and therefore can be generated on-the-fly when it is needed. `op-deployer` will expose utility subcommands


Nit: It may be nice to have a command that outputs all the configs and deploy artifacts exhaustively so the complete state can be readily examined / debugged as needed.

Comment on lines +65 to +69
An example deployment intent is below. Deployment intents are encoded as TOML to match what the OP Stack Manager
expects:

```toml
[intent]
```

@jelias2 commented Sep 9, 2024

Within the intent struct, should the tooling aim to have overlap with the superchain registry config structure or be able to ingest that repo? Aligning formats between these two tools could enhance usability and impact by allowing any operator to easily deploy another chain.

Contributor Author

The deployer can output a deploy config, so it's compatible with the registry. Over time I expect us to refactor away from the deploy config, however, so at that time we'll need to refactor the SR as well.


- Provide a canonical way to deploy chains.
- Enable deploying multiple L2s to a single L1 as part of a single logical Superchain.
- Make it easier for teams like Protocol and Proofs to maintain their deployment tooling.
@jelias2 commented Sep 9, 2024

Does op-deployer aim to deploy production level environments or just development tooling?

Contributor Author

Initially development tooling, but over time I'd like it to be used for prod environments as well.

tynes (Contributor) commented Sep 16, 2024

Merging this as it's been approved and discussion is calming down.

tynes merged commit e3f2281 into main on Sep 16, 2024
tynes deleted the feat/op-deployer branch on September 16, 2024 at 22:18