Roadmap v2 #410
Roadmap Zombienet v2

## Infra
- Chaos testing, add examples and explore possibilities in `native` and `podman` provider
This is top priority for parachains. We want to roll out a separate CI pipeline to do these long duration tests.
## Infra
Prometheus server deployment -> asserting against prometheus queries.
- we remove code and inefficiency from scraping metrics ourselves
- scales better
- we get a standardized way of querying metrics and ability to do aggregations as we see fit
- (nice to have) create some test reports using these, maybe alarms for long duration failure conditions
- local debugability increases
Open question:
- do we use prometheus for native/local runs, or just k8s? (if this differs across providers, prometheus queries won't work)
I think local prometheus should be opt-in and not a default.
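As a rough illustration of "asserting against prometheus queries", here is a minimal TypeScript sketch. It assumes the standard Prometheus instant-query endpoint (`GET /api/v1/query`) and its JSON response shape; `buildQueryUrl` and `assertAtLeast` are hypothetical helper names, not part of zombienet, and the server address in the usage note is only an example.

```typescript
// Subset of the Prometheus instant-query response (GET /api/v1/query).
interface PromSample {
  metric: Record<string, string>;
  value: [number, string]; // [unix timestamp, sample value encoded as a string]
}

interface PromQueryResponse {
  status: "success" | "error";
  data?: { resultType: string; result: PromSample[] };
  error?: string;
}

// Build the instant-query URL for a Prometheus server and a PromQL expression.
function buildQueryUrl(baseUrl: string, promql: string): string {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${base}/api/v1/query?query=${encodeURIComponent(promql)}`;
}

// Hypothetical assertion helper: every returned series must be >= `min`.
function assertAtLeast(resp: PromQueryResponse, min: number): void {
  if (resp.status !== "success" || resp.data === undefined) {
    throw new Error(`query failed: ${resp.error ?? "unknown error"}`);
  }
  if (resp.data.resultType !== "vector") {
    throw new Error(`expected a vector result, got ${resp.data.resultType}`);
  }
  if (resp.data.result.length === 0) {
    throw new Error("query returned no series");
  }
  for (const sample of resp.data.result) {
    const value = Number(sample.value[1]);
    if (!(value >= min)) {
      throw new Error(`${JSON.stringify(sample.metric)} is ${value}, expected >= ${min}`);
    }
  }
}
```

In a real test the JSON body fetched from something like `buildQueryUrl("http://localhost:9090", 'substrate_block_height{status="finalized"}')` would be passed to `assertAtLeast`. Because the assertion only sees the query result, the same check would work for any provider that exposes a Prometheus server.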
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
- Create decorators registry and allow override by paras (wip)
- Explore how to get info from paras.

## Functional tasks
Zombienet Test SDK (Rust) - it does not replace the deployment, which remains as is (TypeScript), plus the additional features listed above. The integration point is the output JSON file from the deployment, which is used to spawn the Zombienet Test SDK test environment.
Advantages of this approach:
- one language to fully write the entire test (deployment part + actual tests to be performed)
- simple APIs to generate the `Network` file - builder pattern
- simple APIs to query/assert metrics
- write more complex test logic
- directly use or wrap subxt - open HRMP channels, send XCM, do runtime upgrades, etc
- opens up the Rust ecosystem of libraries that can be used in a test
- feature parity with current DSL support (phasing out the DSL over time once we have the same feature set)
- easier for people outside Parity to contribute to the project
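To make the "builder pattern" point concrete, here is a minimal sketch of a network builder. The comment proposes a Rust SDK; this sketch is in TypeScript only to match the current codebase, and `NetworkBuilder`, `NetworkSpec` and their fields are hypothetical names, not the actual zombienet schema or SDK API.

```typescript
// Hypothetical, simplified network description (not the real zombienet schema).
interface NodeSpec { name: string; validator: boolean }
interface ParachainSpec { id: number; collators: string[] }
interface NetworkSpec {
  relaychain: { chain: string; nodes: NodeSpec[] };
  parachains: ParachainSpec[];
}

class NetworkBuilder {
  private readonly chain: string;
  private readonly nodes: NodeSpec[] = [];
  private readonly parachains: ParachainSpec[] = [];

  constructor(chain: string) {
    this.chain = chain;
  }

  // Each `with*` method records one piece of the topology and returns the
  // builder itself so that calls can be chained.
  withValidator(name: string): this {
    this.nodes.push({ name, validator: true });
    return this;
  }

  withParachain(id: number, collators: string[]): this {
    this.parachains.push({ id, collators: [...collators] });
    return this;
  }

  // Validate the accumulated state once, then emit a plain object that
  // could be serialized to JSON.
  build(): NetworkSpec {
    if (this.nodes.length === 0) {
      throw new Error("at least one relay chain node is required");
    }
    const names: string[] = [];
    for (const node of this.nodes) names.push(node.name);
    for (const para of this.parachains) {
      for (const collator of para.collators) names.push(collator);
    }
    names.forEach((name, i) => {
      if (names.indexOf(name) !== i) throw new Error(`duplicate node name: ${name}`);
    });
    return {
      relaychain: { chain: this.chain, nodes: this.nodes.slice() },
      parachains: this.parachains.slice(),
    };
  }
}
```

A call such as `new NetworkBuilder("rococo-local").withValidator("alice").withValidator("bob").withParachain(1000, ["collator-1000"]).build()` yields a plain object. Validating in `build()` means mistakes such as duplicate node names surface before anything is spawned.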
## Infra
Long lived test networks deployment and management.
- deploy and manage Versi using Zombienet - similar functionality to validator manager
- obsoletes our custom solution currently built and maintained by devops
## Infra
Deployment scalability improvements - up to 1000 validators.
Co-authored-by: Nikos Kontakis <wirednkod@gmail.com>
## Infra
- Chaos testing, add examples and explore possibilities in `native` and `podman` provider
- Add `docker` provider
- Add `nomad` provider
👍
Hi @drahnr, do you have time for a quick chat to review some requests related to nomad?
Thanks!

## Functional tasks
- Add subxt integration, allow to compile/run on the fly
- Move parser to pest (wip)
❤️ happy to collab on this :)
Nice to see the roadmap for v2. I'm really happy that the DSL is on its way out; I was never a fan of it. I read @sandreim's comment about using Rust, and while I understand his intention, I would say we should use TypeScript. Using TypeScript doesn't mean we will never have a Rust integration. However, the most important point here is time and resources (humans). IMO we should be able to move forward as fast as possible into different directions of usage with zombienet. For this, I think the current DSL is the main blocker.

Using a raw TypeScript interface will give us access to the entire universe of polkadot js and everything that is built around it. We would get direct access to all the tools to send transactions and interact with the node in all the required ways. For any other kind of interaction, like checking the status of metrics, there probably already exists a package, or we could write some thin layer around zombienet.

My ideal way of interacting with zombienet would be similar to this old Rust integration test in Cumulus: https://github.com/paritytech/cumulus/blob/master/client/pov-recovery/tests/pov_recovery.rs

So, there would exist ONE file per test that defines the test. No extra file to define the topology or whatever; I have already seen how you hacked iterations into the topology file. Then, reading that you want to create a UI for creating these files: IMO that is too much for now and not what we need. Our main audience is developers who write blockchains; these people feel comfortable in text files and not in UIs ;)

For spawning nodes there could be a thin interface provided by zombienet to do common things. See the following as some sort of pseudo code (because I didn't yet learn TS, but I would do it for zombienet :P):
Something like this. I get that this is very rough and we would also need a way that the specification of …

I hope what I have written is clear 😅 And thank you for all the hard work ❤️
Hi @bkchr, thanks for your feedback!! It's awesome :) I really like the approach of having only …

We plan to have a session in the retreat about the new SDK to present our high-level design and collect feedback from other teams in order to build the flexibility needed by the different use cases. Again, thank you very much for your feedback 🙌 🙌
This pull request has been mentioned on Polkadot Forum. There might be relevant details there: https://forum.polkadot.network/t/chopsticks-substrate-testing-client/878/10
- Add more CLI subcommands
- Add js/subxt snippets ready to use in assertions (e.g. transfers)
- Add XCM support in built-in assertions
- Add support to start from a live network (fork-off) [check subalfred]
From our experience (we already fork off Moonbeam chains):
This will require chain-specific actions (changing the authoring values, changing the council, changing balances, ...), so a proper way to interact with the raw state is needed for each case.
It also requires dealing with a very large database (or exported state JSON file; Moonbeam is > 5GB), which is often not easy.
We should perhaps file sub-issues to all roadmap items for further discussion once finalized.
Certainly, this is a desirable feature. The main motivation I see is to test node or runtime changes against live network states as a pre-release check. This does require some ability to interact with raw state. To start from a live network state, the two major things are to allow swapping out the runtime (optional) and to allow modification of the state.
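The two operations named above (swapping out the runtime and modifying the state) can be sketched against a raw chain spec. This is a minimal sketch, assuming the usual raw chain spec layout where storage lives under `genesis.raw.top` as hex key/value pairs and the runtime is stored under the well-known `:code` key; `forkOff` and `ForkOptions` are hypothetical names, and a real multi-gigabyte state would need streaming rather than an in-memory object.

```typescript
// Minimal view of a raw chain spec: only the fields this sketch touches.
interface RawChainSpec {
  name: string;
  id: string;
  genesis: { raw: { top: Record<string, string> } };
}

// Hex encoding of the well-known storage key ":code" (the runtime Wasm blob).
const CODE_KEY = "0x3a636f6465";

// "0x" followed by an even number of hex digits.
const HEX = /^0x([0-9a-fA-F]{2})*$/;

interface ForkOptions {
  overrides?: Record<string, string>; // raw storage key -> raw value, both hex
  runtimeWasmHex?: string;            // optional replacement runtime, hex-encoded
}

// Return a modified copy of the spec; the input spec is left untouched.
function forkOff(spec: RawChainSpec, opts: ForkOptions): RawChainSpec {
  const top: Record<string, string> = { ...spec.genesis.raw.top };

  // Chain-specific state changes (authorities, council, balances, ...).
  const overrides = opts.overrides ?? {};
  for (const key of Object.keys(overrides)) {
    const value = overrides[key];
    if (value === undefined || !HEX.test(key) || !HEX.test(value)) {
      throw new Error(`override for ${key} is not valid hex`);
    }
    top[key] = value;
  }

  // Optional runtime swap.
  if (opts.runtimeWasmHex !== undefined) {
    if (!HEX.test(opts.runtimeWasmHex)) {
      throw new Error("runtime blob is not valid hex");
    }
    top[CODE_KEY] = opts.runtimeWasmHex;
  }

  return { ...spec, genesis: { ...spec.genesis, raw: { ...spec.genesis.raw, top } } };
}
```

Computing the overrides themselves stays chain-specific, as noted above: each chain would supply its own key/value pairs, while the merge step can be shared.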
Thanks for your feedback @rphmeier! I will create a couple of umbrella issues to capture the vision/plan for the public roadmap and individual issues for items.
Thanks!
Hi All,

Really happy that you are already planning a roadmap to help onboard new devs, that's great! I would love to give some feedback if possible.

A new dev joining the ecosystem already has a really steep learning curve: learning Rust and Substrate, understanding the dependencies between Substrate/Polkadot/Cumulus, and adopting core blockchain-based primitives, to name a few things. That is a lot! I just want to spin up a chain and send some transactions, so why do I have to learn K8s? 😕

So as someone new, I would like a really simple way to spin up the entire infrastructure. Ideally, run one line if possible. That is the sort of user experience we should aim for.

As great as Zombienet is right now, there are a lot of prerequisites you have to install just to get a local chain running. Ironically, it was much simpler to run using polkadot-launch, despite the fact that you still had to have the binaries installed locally.

We already have the binaries precompiled as images on Docker Hub, so I would argue that the simplest example we can provide is just to use …

I recently submitted a PR to cumulus to get everything (relay + chain) in a docker-compose file, but @bkchr kindly pointed me to zombienet.

I see from the roadmap that you are already planning to add a docker provider, but why not have a compose file as an in-between step until the provider is built? For example, what if the image is just zombienet, which has all the required dependencies installed and then just runs an example file to spin up everything? That way you reduce the number of steps needed to get a network up, while at the same time giving new devs a chance to understand what exactly is happening without going through the time-consuming process of learning how to set up zombienet.

Let me know if this feedback makes sense. I am also happy to contribute in any way.
Besides the stuff I proposed above and we talked about IRL, we should maybe have some kind of "adhoc" mode, as @samelamin proposed above. This adhoc mode could use a different format for the node declaration file. Or we use the same TypeScript syntax for spawning the nodes, but don't stop the processes when the script ends and keep everything running until zombienet is killed. This could then be used for this kind of adhoc testing.
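The second variant (same spawning syntax, but nodes are kept alive after the script ends) could look roughly like the sketch below. Everything here is hypothetical: `FakeNetwork` is an in-memory stand-in that starts no real processes, and `runScript` is not an existing zombienet function.

```typescript
// In-memory stand-in for spawned processes; nothing is actually started.
class FakeNetwork {
  private running: string[] = [];

  spawn(name: string): void {
    this.running.push(name);
  }

  stopAll(): void {
    this.running = [];
  }

  runningNodes(): string[] {
    return this.running.slice();
  }
}

type TestScript = (net: FakeNetwork) => void;

// Run a script against a fresh network. In normal mode every node is stopped
// when the script ends, even if it throws; in ad-hoc mode the nodes are left
// running so they can be used for manual testing.
function runScript(script: TestScript, adhoc: boolean): FakeNetwork {
  const net = new FakeNetwork();
  try {
    script(net);
  } finally {
    if (!adhoc) net.stopAll();
  }
  return net;
}
```

The same script can then serve both as a test (`adhoc = false`) and as a way to bring up a long-running local network (`adhoc = true`), so no second declaration format is needed.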
About fork:
Can do
So I can just have a separate step to download both. And then some docs on how to configure ZN? Right. Should some fixes be made to ZN so that it handles the genesis correctly? Would the relay genesis hash be swapped to rococo local in the provided file? For the fork, I would prefer a separate command to download the data and then run ZN on it. Why? Because our CRON can download it and put it in some easy HTTP GET place, from which we can grab it and run NIX-based tests.
I will close this one since the roadmap is now public and the items (or adding new ones) can be discussed there.
Brainstorming on tasks for `v2` roadmap.