Eng 766. List bottom up checkpoint status. #748

cryptoAtwill · 2024-02-26T13:51:10Z

Closes ENG-766.

Adds a new CLI command that lists checkpoint status for bottom-up checkpoint submission.

linear · 2024-02-26T13:51:13Z

ENG-766 ipc-cli checkpoint status command

ipc/cli/src/commands/checkpoint/status.rs

aakoshh · 2024-02-27T10:32:27Z

contracts/src/subnet/SubnetActorCheckpointingFacet.sol

-            // validate signatures and quorum threshold, revert if validation fails
-            validateActiveQuorumSignatures({signatories: signatories, hash: checkpointHash, signatures: signatures});
+        if (s.committedCheckpoints[checkpoint.blockHeight].blockHeight != 0) {
+            revert BottomUpCheckpointAlreadySubmitted();


I think the reason this wasn't an error condition is that we wanted to allow multiple relayers to submit the checkpoints and add them to the pool of relayers to be rewarded. The difference was that the first relayer caused the checkpoint to be executed, while subsequent ones only went through validation of the payload and were registered for rewards.

I don't have a strong argument for or against turning this into an error. Adding rewards will require a migration, and undoing this change would require a migration too, but if we need a migration anyway then there is little reason to keep accepting duplicate submissions.

The thing is, we don't have rewards now, so I'm not sure if we still want to do that.

Yeah, that's what I mean: when we have rewards, this will need a migration even if you leave it alone, so I don't have an opinion on error vs noop.

A noop certainly won't break as many tests.
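To make the error-vs-noop trade-off above concrete, here is a toy Rust model of the two policies. It is only a sketch: `Ledger`, `submit_lenient`, `submit_strict` and `AlreadySubmitted` are hypothetical names, this is not the contract code, and signature/quorum validation is left out entirely.

```rust
use std::collections::{HashMap, HashSet};

/// What happened to a submission (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
enum Submission {
    /// First submission for a height: the checkpoint gets executed.
    Executed,
    /// Later submission: only the relayer is recorded.
    RecordedOnly,
}

/// Stand-in for the `BottomUpCheckpointAlreadySubmitted` revert in the diff.
#[derive(Debug, PartialEq)]
struct AlreadySubmitted;

/// Toy in-memory state keyed by checkpoint height.
#[derive(Default)]
struct Ledger {
    committed: HashSet<u64>,
    relayers: HashMap<u64, Vec<String>>,
}

impl Ledger {
    /// Lenient policy as described in the thread: the first relayer triggers
    /// execution, later ones are still recorded for a future reward scheme.
    fn submit_lenient(&mut self, height: u64, relayer: &str) -> Submission {
        let first = self.committed.insert(height);
        self.relayers
            .entry(height)
            .or_default()
            .push(relayer.to_string());
        if first {
            Submission::Executed
        } else {
            Submission::RecordedOnly
        }
    }

    /// Strict policy from the diff: a duplicate submission is an error and
    /// leaves the recorded relayers untouched.
    fn submit_strict(&mut self, height: u64, relayer: &str) -> Result<Submission, AlreadySubmitted> {
        if self.committed.contains(&height) {
            return Err(AlreadySubmitted);
        }
        Ok(self.submit_lenient(height, relayer))
    }
}
```

Under the strict policy only one relayer is ever recorded per height, which is why introducing rewards later would need a migration either way.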

aakoshh · 2024-02-27T10:39:15Z

ipc/provider/src/manager/evm/manager.rs

@@ -1076,6 +1076,37 @@ impl BottomUpCheckpointRelayer for EthSubnetManager {

        Ok(events)
    }
+
+    async fn max_quorum_reached_height(


It would be helpful to explain what this method is trying to do.

aakoshh · 2024-02-27T10:49:32Z

ipc/provider/src/manager/evm/manager.rs

+        while l < r {
+            let mid = (l + r) / 2;
+
+            // locate the epoch where checkpoints should have been created
+            let mid = mid / period * period;
+
+            if self.quorum_reached_events(mid).await?.is_empty() {
+                r = mid - 1;
+            } else {
+                maybe_height = Some(mid);
+                l = mid + 1;
+            }
+        }
+
+        Ok(maybe_height)


It may be worth factoring out the bisection algorithm and writing some unit tests for it, because it's not a standard one and looks tricky. I'm not sure what invariants it's trying to maintain.

I may be wrong but it looks to me like it could get into an infinite loop:

1. Say our check period is 10, and we call this method with from = 53 and to = 62.

2. mid = (53+62)/2 = 57

3. mid = mid / 10 * 10 -> I'm not sure what this should do; mathematically it should be 57, but it might give 50 due to truncation. Let's say after this mid = 50; that's outside from and to. Is that okay for the method to return?

4. Say quorum_reached_events(50) is not empty. maybe_height = 50, and from = 51.

5. Goto 2. mid = (51+62)/2 = 56, and we'll enter another loop with mid = 50.

6. Say quorum_reached_events(50) is empty. In this case to = 49 and we exit with an empty result. Maybe there would have been a result at 60?

I'm also not sure why we're looking for quorum at fixed multiples of the period. Signatures are submitted asynchronously, so they can gather at any height. I'm not even sure why we'd be looking for checkpoints added to the ledger at fixed multiples of the period, since batching could trigger a checkpoint at any height.

Yeah, there is a max-pending flag now, so I don't think this is needed anymore. It might be difficult to add tests for this one too.
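For reference, the factoring-out suggested in this thread could look roughly like the sketch below. It bisects over period indices rather than raw heights, so every probe stays inside the requested range and the interval shrinks on each iteration. The name `max_height_where`, the synchronous closure standing in for the async `quorum_reached_events` call, and the assumptions that heights are non-negative and that the predicate is monotone (true up to some height, false afterwards) are all illustrative and do not come from the PR.

```rust
/// Largest multiple of `period` inside `[from, to]` for which `has_quorum`
/// returns true, or `None` if there is no such height.
///
/// Assumes `from >= 0`, `period > 0`, and a monotone predicate.
fn max_height_where<F>(from: i64, to: i64, period: i64, mut has_quorum: F) -> Option<i64>
where
    F: FnMut(i64) -> bool,
{
    assert!(period > 0, "period must be positive");
    assert!(from >= 0, "heights are assumed non-negative");

    // Work on indices k, where height = k * period.
    let mut lo = (from + period - 1) / period; // first multiple >= from
    let mut hi = to / period; // last multiple <= to
    let mut best = None;

    // Invariant: every index below `lo` is known true (or out of range),
    // every index above `hi` is known false (or out of range).
    while lo <= hi {
        let mid = lo + (hi - lo) / 2;
        let height = mid * period;
        if has_quorum(height) {
            best = Some(height);
            lo = mid + 1;
        } else {
            hi = mid - 1;
        }
    }
    best
}
```

With from = 53, to = 62 and period = 10, the only candidate is height 60, so the function probes 60 once and terminates instead of repeatedly landing on 50.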

raulk · 2024-03-08T06:34:03Z

@cryptoAtwill please add a description and title here

jsoares · 2024-03-13T17:23:00Z

This looks close enough to the finish line and it's just non-critical cli code. @aakoshh's comments are all on outdated code, though still open. @aakoshh can we get a re-review to unblock?

raulk

This looks good, but we can make the output clearer, and our lives easier, by using more specific variable names.

ipc/cli/src/commands/checkpoint/status.rs

raulk · 2024-03-15T11:16:52Z

ipc/cli/src/commands/checkpoint/status.rs

+        let ending = max_pending as ChainEpoch * period + start;
+        let mut checkpoints_ahead = 0;
+        for h in start..=ending {
+            let c = provider.get_bottom_up_bundle(&subnet, h).await?;


Wouldn't it be cleaner to make this return an Option?

This is handled in a previous PR: #743

So you query the whole bundle at every height to see if there is a potential checkpoint, right? Can't this be queried with the events API?

There is indeed a PR in which I changed the per-height query to a range query, but somehow that PR was never merged. I can recreate it in a follow-up.

Co-authored-by: raulk <raul@protocol.ai>

aakoshh · 2024-03-22T13:39:11Z

ipc/cli/src/commands/checkpoint/status.rs

+        let start = last_checkpointed_height + 1;
+        let ending = limit_unsubmitted as ChainEpoch * period + start;


It would be nice to explain what's happening.

For example, I don't know whether last_checkpointed_height is a) the height in the subnet of the last created checkpoint or b) the last checkpoint submitted to the parent.
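As a reading aid for the two lines under review, the sketch below reproduces the window arithmetic in isolation. The helper names `lookahead_window` and `heights_scanned` are hypothetical, `ChainEpoch` is assumed to be a signed 64-bit integer, and `limit_unsubmitted` is assumed to be a `usize`; the sketch does not settle which of the two meanings of `last_checkpointed_height` is intended.

```rust
/// Alias mirroring the `ChainEpoch` name in the diff (assumed to be i64 here).
type ChainEpoch = i64;

/// Inclusive range of heights scanned, following the two `let` lines in the diff.
fn lookahead_window(
    last_checkpointed_height: ChainEpoch,
    period: ChainEpoch,
    limit_unsubmitted: usize,
) -> (ChainEpoch, ChainEpoch) {
    // Scanning starts right after the last checkpointed height...
    let start = last_checkpointed_height + 1;
    // ...and extends `limit_unsubmitted` whole periods beyond `start`.
    let ending = limit_unsubmitted as ChainEpoch * period + start;
    (start, ending)
}

/// Number of heights a `for h in start..=ending` loop visits.
fn heights_scanned(start: ChainEpoch, ending: ChainEpoch) -> ChainEpoch {
    ending - start + 1
}
```

For example, with a last checkpointed height of 50, a period of 10 and a limit of 10, the window is 51..=151, so a per-height loop visits 101 heights. The limit bounds the number of periods scanned, not the number of checkpoints found.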

aakoshh · 2024-03-22T13:43:27Z

ipc/cli/src/commands/checkpoint/status.rs

+    pub subnet: String,
+    #[arg(
+        long,
+        help = "Limit unsubmitted checkpoints to print (looking forward from last submitted), default: 10"


But it doesn't necessarily limit the number of checkpoints to print, only the number of checkpoint periods to look ahead.

I also find it confusing that the title and the docs say that it "lists checkpoint statuses", but I don't see any listing.

It basically just displays a bunch of metrics/fields related to bottom-up checkpointing.

cryptoAtwill added 3 commits February 26, 2024 16:40

add more info into error log

d33c46e

block checkpoint resubmission

940d90f

checkpoint status

14552d5

cryptoAtwill added 2 commits February 27, 2024 00:15

add latest quorum reached height

1cb48d8

change pending checkpoint flag

57996a9

aakoshh reviewed Feb 27, 2024

cryptoAtwill added 2 commits February 27, 2024 20:34

remove unused code

c2492fb

revert submission

500535e

cryptoAtwill requested a review from aakoshh February 28, 2024 08:00

cryptoAtwill changed the title ~~Eng 766~~ Eng 766. List bottom up checkpoint status. Mar 12, 2024

raulk reviewed Mar 15, 2024

cryptoAtwill and others added 10 commits March 22, 2024 15:22

Update ipc/cli/src/commands/checkpoint/status.rs

24ca1b5

Co-authored-by: raulk <raul@protocol.ai>

Update ipc/cli/src/commands/checkpoint/status.rs

66ed3c5

Co-authored-by: raulk <raul@protocol.ai>

Update ipc/cli/src/commands/checkpoint/status.rs

5c1b797

Co-authored-by: raulk <raul@protocol.ai>

Update ipc/cli/src/commands/checkpoint/status.rs

06e9337

Co-authored-by: raulk <raul@protocol.ai>

Update ipc/cli/src/commands/checkpoint/status.rs

59c2685

Co-authored-by: raulk <raul@protocol.ai>

Update ipc/cli/src/commands/checkpoint/status.rs

e9861e3

Co-authored-by: raulk <raul@protocol.ai>

Update ipc/cli/src/commands/checkpoint/status.rs

0b9b2c4

Co-authored-by: raulk <raul@protocol.ai>

Update ipc/cli/src/commands/checkpoint/status.rs

7a6ae82

Co-authored-by: raulk <raul@protocol.ai>

Update ipc/cli/src/commands/checkpoint/status.rs

f330ab7

Co-authored-by: raulk <raul@protocol.ai>

merge with main

d953d59

cryptoAtwill requested a review from raulk March 22, 2024 08:09

cryptoAtwill changed the base branch from ENG-763 to main March 22, 2024 08:10

aakoshh reviewed Mar 22, 2024

Merge branch 'main' into ENG-766

b61682c
