Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

client: Resend transactions using same mechanism as initial send #915

Merged
merged 4 commits into from
Apr 26, 2024

Conversation

joncinque
Copy link

@joncinque joncinque commented Apr 19, 2024

Problem

There are still issues with deploying programs. Even against a local validator, it takes multiple resends to get a program deployment through. For large amounts of transactions, using send_wire_transaction_batch seems to be flaky, especially against public networks.

Summary of Changes

This contains a couple of changes, and can be viewed commit by commit. The concept is to have the TPU client resend transactions at a pace of 100 transactions per second, same as the initial send, and to avoid using send_wire_transaction_batch.

And to make sends a bit smoother and retry sooner, all sends now timeout after 5 seconds, even the initial send. This allows the resends to start immediately after the initial send, so we can simplify the code and avoid concurrently sending and resending the same transactions. This doesn't totally resolve issues with deploys mainnet unfortunately, since just getting the transaction to the leader over TPU is nearly impossible, but it makes sends much more consistent against testnet.

I tested by alternating the old and new code to send 5,000 transactions on testnet. The transactions were unique self-transfers with 1 million micro-lamports per CU. Runs were cancelled after 5 minutes. Results:

  • without: 160s, (cancelled with 174 txs left), (cancelled with 48 txs left), 150s, (cancelled with 32 txs left)
  • with: 71s, 62s, 210s, 61s, 290s

Fixes #

Copy link

mergify bot commented Apr 19, 2024

Backports to the beta branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule. Exceptions include CI/metrics changes, CLI improvements and documentation updates on a case by case basis.

@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 35.41667% with 31 lines in your changes are missing coverage. Please review.

Project coverage is 81.8%. Comparing base (ee113de) to head (2454f70).
Report is 12 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##           master     #915     +/-   ##
=========================================
- Coverage    81.9%    81.8%   -0.1%     
=========================================
  Files         853      853             
  Lines      231847   231811     -36     
=========================================
- Hits       189909   189837     -72     
- Misses      41938    41974     +36     

@t-nelson
Copy link

i think the staggering thing is bandaging over TPU QUIC backoff being broken both server and client side. server side improvments should be shipping soon. not sure if there's been actual progress on client side?

@joncinque
Copy link
Author

i think the staggering thing is bandaging over TPU QUIC backoff being broken both server and client side. server side improvments should be shipping soon. not sure if there's been actual progress on client side?

That's all correct. I haven't had time yet, but I can put some time into client-side backoff

@joncinque joncinque merged commit ed11b72 into anza-xyz:master Apr 26, 2024
47 checks passed
@joncinque joncinque deleted the depretry branch April 26, 2024 11:03
mergify bot pushed a commit that referenced this pull request Apr 26, 2024
* client: Timeout tpu sends

* Resend transactions one at a time, same as the first send

* Simplify resends to happen right after initial sends

* Skip already processed transaction errors

(cherry picked from commit ed11b72)
joncinque added a commit that referenced this pull request Apr 26, 2024
* client: Timeout tpu sends

* Resend transactions one at a time, same as the first send

* Simplify resends to happen right after initial sends

* Skip already processed transaction errors

(cherry picked from commit ed11b72)
joncinque added a commit that referenced this pull request Apr 26, 2024
…nd (backport of #915) (#1072)

client: Resend transactions using same mechanism as initial send (#915)

* client: Timeout tpu sends

* Resend transactions one at a time, same as the first send

* Simplify resends to happen right after initial sends

* Skip already processed transaction errors

(cherry picked from commit ed11b72)

Co-authored-by: Jon C <me@jonc.dev>
anwayde pushed a commit to firedancer-io/agave that referenced this pull request Jul 23, 2024
…nd (backport of anza-xyz#915) (anza-xyz#1072)

client: Resend transactions using same mechanism as initial send (anza-xyz#915)

* client: Timeout tpu sends

* Resend transactions one at a time, same as the first send

* Simplify resends to happen right after initial sends

* Skip already processed transaction errors

(cherry picked from commit ed11b72)

Co-authored-by: Jon C <me@jonc.dev>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants