Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async in Zebra RFC #1965

Merged
merged 34 commits into from
May 4, 2021
Merged

Async in Zebra RFC #1965

merged 34 commits into from
May 4, 2021

Conversation

teor2345
Copy link
Contributor

@teor2345 teor2345 commented Mar 30, 2021

TODO

Summary

Zebra executes concurrent tasks using async Rust, with the tokio executor.

At a higher level, Zebra also uses tower::Services, tower::Buffers, and our own tower-batch implementation.

Zebra programmers need to carefully write async code so it doesn't deadlock or hang. This is particularly important for poll, select, Buffer, and Batch.

More information

Document

Rendered

Zebra Team Approval

Everyone on the Zebra team should review design RFCs:

If no-one has time to review by 2021-04-09, let's merge this RFC as a draft.

Related Tickets

Closes #1593.

@teor2345 teor2345 added A-docs Area: Documentation C-design Category: Software design work S-needs-triage Status: A bug report needs triage NU-5 Network Upgrade: NU5 specific tasks P-Low labels Mar 30, 2021
@teor2345 teor2345 added this to the 2021 Sprint 6 milestone Mar 30, 2021
@teor2345 teor2345 requested a review from a team March 30, 2021 08:43
@teor2345 teor2345 self-assigned this Mar 30, 2021
@teor2345 teor2345 modified the milestones: 2021 Sprint 6, 2021 Sprint 7 Mar 31, 2021
Copy link
Contributor

@oxarbitrage oxarbitrage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, i made a few suggestions as i will like to see the RFC as a reference guide where the developer/reviewer can refer to specific sections where the rules are set.

book/src/dev/rfcs/drafts/xxxx-async-rust.md Outdated Show resolved Hide resolved
book/src/dev/rfcs/drafts/xxxx-async-rust.md Outdated Show resolved Hide resolved
book/src/dev/rfcs/drafts/xxxx-async-rust.md Outdated Show resolved Hide resolved
book/src/dev/rfcs/drafts/xxxx-async-rust.md Outdated Show resolved Hide resolved
book/src/dev/rfcs/drafts/xxxx-async-rust.md Outdated Show resolved Hide resolved
book/src/dev/rfcs/drafts/xxxx-async-rust.md Outdated Show resolved Hide resolved
book/src/dev/rfcs/drafts/xxxx-async-rust.md Outdated Show resolved Hide resolved
@teor2345 teor2345 requested a review from dconnolly April 7, 2021 09:36
@teor2345
Copy link
Contributor Author

teor2345 commented Apr 7, 2021

I think we're ready for a final review now.

@mpguerra mpguerra modified the milestones: 2021 Sprint 7, 2021 Sprint 6 Apr 7, 2021
@teor2345 teor2345 marked this pull request as ready for review April 30, 2021 02:09
@teor2345
Copy link
Contributor Author

This RFC is now ready for another round of review.

Here's what I changed since the last reviews:

  • added atomics
  • added futures-aware types
  • rewrote the acquiring locks section
  • did a rebase so the build jobs will succeed

Here are the changes since @oxarbitrage and @dconnolly last reviewed:
https://github.com/ZcashFoundation/zebra/pull/1965/files/2ce751c9dbd7f3dcc3742e6b812fb22b4924e6d5..154966e5071ad98c10481b69df6c532f14adbdac

@teor2345 teor2345 added P-Medium and removed P-Low S-needs-triage Status: A bug report needs triage labels Apr 30, 2021
@teor2345 teor2345 enabled auto-merge (squash) April 30, 2021 02:15
Comment on lines 130 to 133
Zebra's [`Inbound service`](https://github.com/ZcashFoundation/zebra/blob/0203d1475a95e90eb6fd7c4101caa26aeddece5b/zebrad/src/components/inbound.rs#L238)
can't use an async-aware mutex for its `AddressBook`, because the mutex is shared
with non-async code. It only holds the mutex to clone the address book, reducing
the amount of time that other tasks on its thread are blocked:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

Copy link
Contributor

@oxarbitrage oxarbitrage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename xxxx-async-rust.md to 0011-async-rust.md and move out from drafts folder.

Rest looks good.

@teor2345
Copy link
Contributor Author

teor2345 commented May 3, 2021

@conradoplg you might also want to read the examples in this RFC. It should help you write or troubleshoot async Rust.

@teor2345
Copy link
Contributor Author

teor2345 commented May 3, 2021

Feel free to ask questions, and let me know if anything is unclear.

@teor2345 teor2345 disabled auto-merge May 3, 2021 23:05
@teor2345 teor2345 enabled auto-merge (squash) May 3, 2021 23:06
@teor2345 teor2345 requested a review from oxarbitrage May 3, 2021 23:07
@teor2345
Copy link
Contributor Author

teor2345 commented May 3, 2021

@oxarbitrage I moved and renamed the file, this PR will auto-squash and merge once you approve it.

Comment on lines 194 to 195
If the lock isn't dropped, compilation fails, because the mutex lock can't be
sent between threads.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome

If you are unable to use futures-aware types:
- block the thread for as short a time as possible
- document the correctness of each blocking call
- consider re-designing the code to use `tower::Services`, or other futures-aware types
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯

Comment on lines 516 to 517
- make critical sections as short as possible, and
- do not depend on other tasks or locks inside the critical section.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Comment on lines 632 to 635
In Zebra, we try to use safe abstractions, and write obviously correct code. It takes
a lot of effort to write, test, and maintain low-level code. Almost all of our
performance-critical code is in cryptographic libraries. And our biggest performance
gains from those libraries come from async batch cryptography.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💖

@@ -505,3 +679,6 @@ those bugs can take a lot of developer effort.
Can we catch these bugs using automated tests?

How can we diagnose these kinds of issues faster and more reliably?
- [TurboWish](https://blog.pnkfx.org/blog/2021/04/26/road-to-turbowish-part-1-goals/)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

@dconnolly dconnolly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@teor2345 teor2345 merged commit 6f2dbd9 into main May 4, 2021
@teor2345 teor2345 deleted the async-rfc branch May 4, 2021 00:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-docs Area: Documentation C-design Category: Software design work NU-5 Network Upgrade: NU5 specific tasks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make sure Zebra uses poll_ready and Buffer reservations correctly
4 participants