Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix - Reloading of unload entries after finishing the recompilation phase #1199

Merged

Conversation

Lichtso
Copy link

@Lichtso Lichtso commented May 6, 2024

Problem

Currently the epoch boundary recompilation phase does adjust the effective slot of recompiled entries to be the first slot of the next epoch. However, when reloading unloaded entries, their new loaded entry has an unadjusted effective slot, which will not be matched in assign_program().

Summary of Changes

  • Adds a reproducer for the condition.
  • Differentiates the two versions of an entry (for previous and upcoming feature set) by get_environment() instead of adjusting the effective_slot.
  • Changes the tests of assign_program() to always use the unifed get_mock_env() instead of creating random environments with Arc::new(BuiltinProgram::new_mock()).
  • Swaps the order of environments in test_feature_activation_loaded_programs_recompilation_phase().
  • Turns test_load_program_effective_slot() into test_load_program_environment().

) || !Arc::ptr_eq(
&environments.program_runtime_v2,
&program_cache.environments.program_runtime_v2,
) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This branch was only taken during a feature set change which affects RBPF. Thus, removing it has no affect on the normal operation outside the epoch boundary recompilation phase.

fn is_current_env(
environments: &ProgramRuntimeEnvironments,
env_opt: Option<&ProgramRuntimeEnvironment>,
) -> bool {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function always returns true during normal operation outside the epoch boundary recompilation phase. That is because all entries which have an environment have the current environment and all others default to true directly.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll be good to add the above as a comment in the code. It's a subtle condition, which will hard to understand a few months from now.

let slot_versions = &mut self.entries.entry(key).or_default().slot_versions;
match slot_versions.binary_search_by(|at| {
at.effective_slot
.cmp(&entry.effective_slot)
.then(at.deployment_slot.cmp(&entry.deployment_slot))
.then(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This additional then() has no affect on the normal operation outside the epoch boundary recompilation phase. is_current_env() will return true, so this evaluates to .then(true.cmp(&true)) which is .then(std::cmp::Ordering::Equal) which is the same as no .then() statement as it was before.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, need some comment in the code that explains this.

@Lichtso Lichtso added the v1.18 label May 6, 2024
Copy link

mergify bot commented May 6, 2024

Backports to the beta branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule. Exceptions include CI/metrics changes, CLI improvements and documentation updates on a case by case basis.

@Lichtso Lichtso force-pushed the fix/unload_reload_after_recompilation_phase branch from 1ffc298 to 407cb83 Compare May 6, 2024 14:55
@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.1%. Comparing base (b0cfffb) to head (407cb83).
Report is 5 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1199   +/-   ##
=======================================
  Coverage    82.1%    82.1%           
=======================================
  Files         886      886           
  Lines      236372   236391   +19     
=======================================
+ Hits       194204   194246   +42     
+ Misses      42168    42145   -23     

@Lichtso Lichtso force-pushed the fix/unload_reload_after_recompilation_phase branch from 407cb83 to a810db2 Compare May 6, 2024 17:29
@Lichtso Lichtso requested a review from pgarg66 May 6, 2024 21:55
@pgarg66 pgarg66 requested a review from LucasSte May 7, 2024 02:32
@Lichtso Lichtso merged commit 979f619 into anza-xyz:master May 9, 2024
38 checks passed
@Lichtso Lichtso deleted the fix/unload_reload_after_recompilation_phase branch May 9, 2024 20:15
mergify bot pushed a commit that referenced this pull request May 9, 2024
…hase (#1199)

* Adds reproducer to test_feature_activation_loaded_programs_epoch_transition.

* Differentiate entries by environment instead of adjusting the effective slot.

* Fixes tests of assign_program().

* Fixes env order in test_feature_activation_loaded_programs_recompilation_phase().

* Turns test_load_program_effective_slot() into test_load_program_environment().

* Adds comments inside ProgramCache::assign_program().

(cherry picked from commit 979f619)

# Conflicts:
#	program-runtime/src/loaded_programs.rs
#	runtime/src/bank/tests.rs
#	svm/src/program_loader.rs
Lichtso added a commit that referenced this pull request May 9, 2024
…hase (#1199)

* Adds reproducer to test_feature_activation_loaded_programs_epoch_transition.

* Differentiate entries by environment instead of adjusting the effective slot.

* Fixes env order in test_feature_activation_loaded_programs_recompilation_phase().

* Adds comments inside ProgramCache::assign_program().

(cherry picked from commit 979f619)
Lichtso added a commit that referenced this pull request May 9, 2024
…hase (#1199)

* Differentiate entries by environment instead of adjusting the effective slot.

* Adds comments inside ProgramCache::assign_program().

(cherry picked from commit 979f619)
Lichtso added a commit that referenced this pull request May 13, 2024
…ation phase (backport of #1199) (#1275)

* Fix - Reloading of unload entries after finishing the recompilation phase (#1199)

* Differentiate entries by environment instead of adjusting the effective slot.

* Adds comments inside ProgramCache::assign_program().

* Adds reproducer to test_feature_activation_loaded_programs_epoch_transition.

* Fixes env order in test_feature_activation_loaded_programs_recompilation_phase().

* Syncs both tests with master.

---------

(cherry picked from commit 979f619)

Co-authored-by: Alexander Meißner <AlexanderMeissner@gmx.net>
anwayde pushed a commit to firedancer-io/agave that referenced this pull request Jul 23, 2024
…ation phase (backport of anza-xyz#1199) (anza-xyz#1275)

* Fix - Reloading of unload entries after finishing the recompilation phase (anza-xyz#1199)

* Differentiate entries by environment instead of adjusting the effective slot.

* Adds comments inside ProgramCache::assign_program().

* Adds reproducer to test_feature_activation_loaded_programs_epoch_transition.

* Fixes env order in test_feature_activation_loaded_programs_recompilation_phase().

* Syncs both tests with master.

---------

(cherry picked from commit 979f619)

Co-authored-by: Alexander Meißner <AlexanderMeissner@gmx.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants