Implement off-chain Window PoSt verification #1327

Stebalien · 2020-12-10T23:42:10Z

Proposal: filecoin-project/FIPs#42

TODO

Changes

In addition to the changes below,

Change: SubmitWindowedPoSt

Proofs that do not recover faulty power are not checked on-chain. Instead,

The proof is recorded in an AMT of "optimistic" proofs.
The proof is not verified.
The proof may be "disputed" by anyone for 1800 epochs after the challenge window ends.

As usual, skipped sectors are marked as faulty and "unproven" sectors are marked as "proven".

Notes:

If a proof recovers faulty power, it is checked on-chain immediately to prevent "recovering" faulty sectors with invalid proofs.
"Unproven" sectors (sectors newly added to a partition but never proven with a window PoSt) do not force an on-chain proof verification. This is intentional.

Change: Deadline End

As usual, partitions without proofs are marked as faulty, "new" (unproven) sectors are activated if proofs for them are present, etc.

Immediately after processing all proof related logic, and before terminating any sectors:

We record a snapshot of the partition AMT so proofs can be properly disputed later (PartitionsSnapshot)
We record a snapshot of the proofs AMT (ProofsSnapshot).

We then proceed to terminate sectors as usual.

Change: CompactPartitions

As CompactPartitions can move sectors between partitions, it makes it difficult mark these sectors as faulty when disputed. While the disputer could submit bitfields describing the new locations of the sectors in question, this dispute could be defeated by repeatedly re-compacting.

To simplify the prevent this attack and simplify the disputer's job, partition compaction is now prohibited for WPoStProofDisputePeriod (currently 2x finality or 15 deadlines) epochs after the deadline's last challenge window. This leaves a window of 960 epochs (8 hours) for partition compaction.

Proofs may not be disputed after this time has passed.

Questions

Is the Finality blackout correct? We can make it longer (but probably not shorter). 2x finality would be safer.
- I've made it 2x finality which should be much safer.

New: DisputeWindowedPoSt

This patch introduces a new method: DisputeWindowedPoSt. This method selects a proof from the ProofsSnapshot AMT and verifies it.

The epoch and deadline are checked to make sure that the window for disputing proofs is still open.
The proof is verified against the partition state snapshot from the end of the last challenge window. This represents the state that was (supposedly) proven.
If the proof validates, this method aborts with no change. Otherwise, we continue.
The active sectors in the partition snapshot are declared as faulty. Terminated/moved sectors are left alone. The faulty termination epoch is set as if the sectors became faulty at the end of the previous challenge window for the deadline in question.
The target miner looses power based on the sectors marked faulty in step 3. If a sector moves, the power will be retained.
The disputed proof is removed from the proofs AMT so it can't be disputed twice.
The target miner is fined a penalty based on the active power in the partitions that were incorrectly prove. This "active power" is taken from the partition snapshot.

Questions

Should we allow the miner to specify a sector/partition mapping to work around the case where a sector can be moved due to compaction? I'm concerned that this could lead to large, expensive disputes.
- We're limiting disputes to the a dispute window.
What should the penalty be? (to be determined by FIP)
What should the reporter be paid, if anything?
Do we really need to snapshot the sectors AMT? I was concerned about compaction deleting sectors. However, we could also just make the window in which miners can compact a deadline disjoint from the window where proofs can be disputed. ~~This might require extending the dispute window a bit (e.g., 2x finality)~~ (done).
- We're limiting disputes to a dispute window.

Stebalien · 2020-12-10T23:52:31Z

It looks like this is causing us to kill (OOM?) the migration test. Not sure what that's about.

Stebalien · 2020-12-11T18:56:56Z

Turns out "immutable" deadlines aren't quite so immutable because we still allow sector termination. This is moderately annoying to fix because we can terminate sectors from cron (e.g., when terminating too many). We can fix the issue for all practical purposes, but getting a fully correct fix could be annoying.

A simple fix is to snapshot the partitions/sectors along with the proofs, instead of at the end of the deadline. Honestly, this is the "better" approach in terms of correctness. Unfortunately, while it will only marginally increase gas costs (proofs need 2 extra CIDs) it'll increase the size of the state tree by quite a bit. On the other hand, I'm not sure how much that matters.

One of the significant upsides here is that the "WindowPoSt" objects are now completely self-contained.

Thoughts?

type WindowedPoSt struct {
    ProvenPartitions bitfield.BitField
    PartitionsSnapshot cid.Cid // new
    SectorsSnapshot cid.Cid // new
    Proofs []proofs.PoStProof
}

Stebalien · 2020-12-11T22:44:46Z

Ah, wait. I'm confusing myself.

We do two things:

Actually terminate sectors.
Process early terminations.

1 is triggered from TerminateSectors and at deadline end. This is the part that actually marks sectors as dead.

2 can be triggered from cron, but just pays fines.

TL;DR: all we need to do is block manual calls to TerminateSectors during the "blackout" period. That's pretty reasonable.

Stebalien · 2020-12-11T22:53:40Z

TL;DR: all we need to do is block manual calls to TerminateSectors during the "blackout" period. That's pretty reasonable.

Done. This also fixes the incentive mismatch where a logical miner will wait until missing data is first challenged to terminate a sector.

Stebalien · 2020-12-12T00:08:39Z

Next step: migration.

This should be fairly straight-forward.

If the PoSt is submitted before the upgrade, it'll be verified and the partitions will be marked proven.
If the PoSt is submitted after the upgrade, it'll be stored.

The tricky part is that the "all proven sectors have proofs in the proofs AMT" invariant won't be true during this period. I was hoping we could use this as a global invariant but life just isn't that simple.

Stebalien · 2020-12-12T00:45:00Z

Hm. Looks like the migration isn't fully implemented yet anyways. @anorth, should I pick that up? Or just move on for now.

anorth · 2020-12-13T22:21:10Z

Re compaction and finality, could we simplify to get these windows completely non-overlapping? E.g. challenge window of 18hrs, compaction window of 6 hrs? It would be more robust of course to allow complete overlapping, but it doesn't look like that can be done without being quite complex. So if we can't do that, I think it might be worth removing the possibility of collision between those two actions ("sectors that have moved partitions will not be marked faulty, the miner will still be fined"). Edit: I see you mentioned that idea later.

Re migration, correct that's a work in progress. It's probably better not to attempt it in this PR, but note for follow-up.

anorth

Thanks, this is looking good.

I trust that all the TODOs will either be resolved here, or have an issue filed, or be rewritten as commentary about alternative possibilities, before merging.

actors/builtin/miner/deadline_state.go

actors/builtin/miner/deadlines.go

actors/builtin/miner/miner_actor.go

gen/gen.go

Stebalien · 2020-12-14T20:11:27Z

Re compaction and finality, could we simplify to get these windows completely non-overlapping? E.g. challenge window of 18hrs, compaction window of 6 hrs? It would be more robust of course to allow complete overlapping, but it doesn't look like that can be done without being quite complex. So if we can't do that, I think it might be worth removing the possibility of collision between those two actions ("sectors that have moved partitions will not be marked faulty, the miner will still be fined"). Edit: I see you mentioned that idea later.

Yes, we can do this. And it's probably the simplest approach (and lets us not snapshot sectors).

actors/builtin/miner/miner_actor.go

actors/builtin/miner/deadline_state.go

wadealexc · 2020-12-16T21:35:41Z

actors/builtin/miner/miner_state.go

 	if deadline.LiveSectors == 0 {
+		// TODO: do we still need to clear the post submissions here? I
+		// think we're technically fine, but this is a strange
+		// edge-case.


I think yes, they should be cleared.

It's a strange edge case that can be directly reached by a miner, simply by terminating a deadline's sectors. Unclear how this may interact with ChallengeWindowedPoSt, but I don't think we want the old snapshots to stick around!

So, I don't think this can happen anymore (after this patch). Given this patch:

I can't "prove" no sectors. We reject those proofs.

I can't terminate sectors during the challenge window (or the previous deadline).

Given this, I shouldn't need to touch this array, right?

I think it should be done for safety, regardless. If the snapshots aren't reset each EOD cron, they may be technically valid when submitted to ChallengeWindowedPoSt

I guess I can just check without spending too much time. That, and log an error.

Re-organize proof recording logic to move it inside RecordProvenSectors. This means we can test it in the deadline invariant check.

Currently, the payout is zero. However, the logic now exists and we can discuss the ramifications.

Stebalien · 2021-01-11T23:39:28Z

actors/builtin/miner/miner_actor.go

+			// take reward out of the penalty as the miner
+			// could end up receiving a substantial
+			// portion of their fee back as a reward.
+			penaltyTarget := big.Add(penaltyBase, rewardTarget)


This is not how consensus faults work, but I believe it's the "more correct" approach.

Stebalien · 2021-01-11T23:41:00Z

actors/builtin/miner/miner_actor.go

+
+	requestUpdatePower(rt, powerDelta)
+
+	if !toReward.IsZero() {


This check isn't present in consensus faults reports and we may want to change that.

Stebalien · 2021-01-11T23:42:01Z

actors/builtin/miner/miner_actor.go

+		// If we fail, log and burn the reward to make sure the balances remain correct.
+		if !code.IsSuccess() {
+			rt.Log(rtt.ERROR, "failed to send reward")
+			toBurn = big.Add(toBurn, toReward)


This "burn it" isn't present in consensus fault reporting, but it's necessary to ensure that pledge values remain consistent.

Alternatively, we could just fail the message. Honestly, not failing the message is a bit of a footgun as lotus will think "the message succeeded" and will just send it to the network instead of kicking it back to the user to ensure the user gets paid.

I'd rather the report succeeded even if we can't pay the reward. Burning it sounds good.

For something like this, I assume the submitter would just try again. Any reason we need it to succeed the first time?

Why would it succeed the second time? I don't know what might have gone wrong to cause this failure (low balance?) but it is likely to be permanent. I don't think we should allow inability to pay the reward to stop the fraud being reported at all.

Hm. I was thinking something like the actor that sent the message suddenly got deleted (not sure how this can happen...).

But this isn't going to fix the "low balance" issue. If the actor doesn't have enough funds for the reward, they'll fail on the next step (burning).

Stebalien · 2021-01-11T23:44:00Z

In SubmitWindowedPoSt, is params.Proofs[0] checked for a maximum length? If not, it should be.

This is done.

Stebalien · 2021-01-11T23:59:33Z

My idea is to replace the current DisputeWindowedPoStParams with the following:

Discussed out of band: We can do this, but it would require changing maximum bitfield sizes. They're currently sized to fit up to a partition's worth of sector numbers worst-case, not more. This is not a hard change, it's just a bit tricky to make sure we don't mess it up somewhere.

I've also filed a proposal for the next upgrade to make this impossible: #1352

TL;DR: I'm going to leave things as they are unless we have a compelling reason to do otherwise.

These need to be checked up-front.

actors/builtin/miner/deadline_state.go

anorth · 2021-01-12T08:52:48Z

actors/builtin/miner/deadline_state.go

+		// NOTE: This also includes power that was
+		// activated at the end of the last challenge
+		// window, and power from sectors that have since
+		// expired.


IMO these comments belong in the data structure, more discoverable to callers.

Where? I could put it on top of the ParitionsSnapshot field, but it seems kind of obvious that the partition snapshot includes power that's no longer active. I put it here because the fact that we're "disputing" no longer active power is interesting.

anorth · 2021-01-12T09:02:39Z

actors/builtin/miner/miner_actor.go

+		// If we fail, log and burn the reward to make sure the balances remain correct.
+		if !code.IsSuccess() {
+			rt.Log(rtt.ERROR, "failed to send reward")
+			toBurn = big.Add(toBurn, toReward)


I'd rather the report succeeded even if we can't pay the reward. Burning it sounds good.

actors/builtin/miner/policy.go

actors/builtin/miner/miner_actor.go

anorth · 2021-01-12T09:21:50Z

actors/builtin/miner/miner_test.go

+			},
+			exitcode.Ok)
+
+		rt.ExpectAbortContainsMessage(exitcode.ErrForbidden, "can only dispute window posts during the dispute window", func() {


Nit: you should be able to call actor.disputeWindowPoSt inside rt.ExpectAbortContainsMessage, and I would expect that to encapsulate all the expectation set-up above that you had to replicate. Reset() the mock afterwards some some expectations won't have been reached.

Actually, this is a bit of a hole in our tests that we should probably fix. I'm trying to test that the expectations are met, even on the failure case. Otherwise, we can get gas mismatches on failure cases.

Fair. I hope that some kind of golden file approach in the future will abstract this away. Your call in the meantime.

Yeah, I agree that golden files will make this kind of thing unnecessary.

actors/builtin/miner/miner_test.go

* The reward ensures that, at a minimum, the gas fees for disputing a proof is covered. * The base penalty ensures that the dispute method cannot be used to "take an advance" on vesting rewards.

actors/builtin/miner/monies.go

This patch implements two changes: 1. New Window PoSts, except those that restore faulty sectors, are optimistically accepted, assumed correct, and recorded in the state-tree for one proving period. 2. Optimistically accepted window posts can be disputed until `WPoStProofDisputeWindow` epochs after the challenge window in which they were submitted closes. When a dispute successfully refutes an optimistically accepted Window PoSt, the miner is fined one IPF per active sector in the partition (at the moment when the proof was submitted) plus a flat fee of 20FIL, all incorrectly proved sectors are marked faulty, and the disputer (address that submitted the dispute) is rewarded a fixed `DipsuteReward`.

Stebalien changed the title ~~Implement off-chain Window PoSt verificatoin~~ Implement off-chain Window PoSt verification Dec 11, 2020

anorth mentioned this pull request Dec 13, 2020

Remove most v2-only code paths #1330

Merged

anorth reviewed Dec 14, 2020

View reviewed changes

wadealexc reviewed Dec 15, 2020

View reviewed changes

actors/builtin/miner/miner_actor.go Outdated Show resolved Hide resolved

Stebalien force-pushed the refactor/wd-post branch from dc3ddd1 to 63004e3 Compare December 16, 2020 01:11

wadealexc reviewed Dec 16, 2020

View reviewed changes

actors/builtin/miner/deadline_state.go Show resolved Hide resolved

wadealexc reviewed Dec 16, 2020

View reviewed changes

Stebalien added 15 commits December 16, 2020 15:35

update state

f8335f3

update window post submission to record but not verify

6fd7904

handle proofs update at the end of the deadline

6a7516b

initial challenge windowed post implementation

715bc72

handle compaction

de07ccb

fix compile

1d3801f

flesh out ChallengeWindowedPoSt

176f752

only clear out checked proof

a98d741

start fixing tests

19ae9f6

get tests passing

b087f5f

fix lint errors

a6f68f2

use correct sectors array for marking faults

0e7e691

re-organize proof recording logic

f2fe339

Re-organize proof recording logic to move it inside RecordProvenSectors. This means we can test it in the deadline invariant check.

forbid terminating sectors in a deadline being proved

fc21f82

fix tests

53387d5

Stebalien added 3 commits January 7, 2021 17:20

set the correct window post penalty

396fe71

add logic to calculate and pay out rewards

7d3f862

Currently, the payout is zero. However, the logic now exists and we can discuss the ramifications.

remove todo

b442ace

Stebalien commented Jan 11, 2021

View reviewed changes

Stebalien added 6 commits January 11, 2021 16:02

remove TODO

bcdb643

disallow dispute before proving period start

bdd4a61

remove another todo

bca44ac

make sure we don't record proofs that restore power

186d898

These need to be checked up-front.

add tests for end of dispute window

3d2f31a

test terminating sectors while proving

d8c1378

anorth approved these changes Jan 12, 2021

View reviewed changes

Stebalien added 6 commits January 12, 2021 09:27

add note about partitions posted bitfield

5de41bd

assert additional deadline invariant

e70433b

remove stray comment

6bf5bb4

address some more review comments

9bd5e27

Merge branch 'master' into refactor/wd-post

0ead8da

test some additional dispute cases

9ca6d38

Stebalien force-pushed the refactor/wd-post branch 2 times, most recently from b371ebf to 8bb8458 Compare January 14, 2021 21:11

implement reward and add a base penalty

d07f47f

* The reward ensures that, at a minimum, the gas fees for disputing a proof is covered. * The base penalty ensures that the dispute method cannot be used to "take an advance" on vesting rewards.

anorth suggested changes Jan 14, 2021

View reviewed changes

actors/builtin/miner/monies.go Outdated Show resolved Hide resolved

Stebalien force-pushed the refactor/wd-post branch from 8bb8458 to d07f47f Compare January 14, 2021 22:26

fix fee base

09d2b58

anorth approved these changes Jan 15, 2021

View reviewed changes

Merge branch 'master' into refactor/wd-post

9f781c4

Stebalien merged commit be9804e into master Jan 15, 2021

ZenGround0 mentioned this pull request Apr 8, 2021

Actors v5 Plan #1391

Closed

28 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement off-chain Window PoSt verification #1327

Implement off-chain Window PoSt verification #1327

Stebalien commented Dec 10, 2020 •

edited

Loading

Stebalien commented Dec 10, 2020

Stebalien commented Dec 11, 2020

Stebalien commented Dec 11, 2020

Stebalien commented Dec 11, 2020

Stebalien commented Dec 12, 2020

Stebalien commented Dec 12, 2020

anorth commented Dec 13, 2020

anorth left a comment

Stebalien commented Dec 14, 2020

wadealexc Dec 16, 2020

Stebalien Dec 17, 2020

wadealexc Dec 18, 2020

Stebalien Dec 18, 2020

Stebalien Jan 11, 2021

Stebalien Jan 11, 2021

Stebalien Jan 11, 2021

anorth Jan 12, 2021

Stebalien Jan 12, 2021

anorth Jan 12, 2021

Stebalien Jan 13, 2021

Stebalien commented Jan 11, 2021

Stebalien commented Jan 11, 2021

anorth Jan 12, 2021

Stebalien Jan 13, 2021

anorth Jan 12, 2021

anorth Jan 12, 2021

Stebalien Jan 12, 2021

anorth Jan 13, 2021

Stebalien Jan 13, 2021

Implement off-chain Window PoSt verification #1327

Implement off-chain Window PoSt verification #1327

Conversation

Stebalien commented Dec 10, 2020 • edited Loading

TODO

Changes

Change: SubmitWindowedPoSt

Change: Deadline End

Change: CompactPartitions

New: DisputeWindowedPoSt

Stebalien commented Dec 10, 2020

Stebalien commented Dec 11, 2020

Stebalien commented Dec 11, 2020

Stebalien commented Dec 11, 2020

Stebalien commented Dec 12, 2020

Stebalien commented Dec 12, 2020

anorth commented Dec 13, 2020

anorth left a comment

Choose a reason for hiding this comment

Stebalien commented Dec 14, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Stebalien commented Jan 11, 2021

Stebalien commented Jan 11, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Stebalien commented Dec 10, 2020 •

edited

Loading