TL;DR
After the TimestampValidationEpoch hardfork, validators check parent.Time < header.Time <= now + 15s.
This bounds each individual block, but it does not bound how far chain time can be
ratcheted ahead of real wall-clock time by a series of clock-skewed leaders. The
result is a class of liveness and view-change problems that are reachable today
on any network where the fork is active, and they became easier to trigger when
#5042 removed the NTP correction.
Important
This is a consensus-layer issue. A fix will require a new hardfork epoch
(sibling to TimestampValidationEpoch) to roll out safely.
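Background — current timestamp logic
Header verification
internal/chain/engine.go enforces parent.Time < header.Time <= now + allowedFutureBlockTime,
with allowedFutureBlockTime = 15 * time.Second.
Leader timestamp selection
consensus/proposer.go (after #5042 removed runtime NTP correction):
the leader writes header.Time = time.Now().Unix() verbatim.
View-change leader election uses block timestamp
consensus/view_change.go::getNextViewID:
    blockTimestamp := curHeader.Time().Int64()
    curTimestamp := time.Now().Unix()
    if curTimestamp <= blockTimestamp {
        // silent fallback to a non-deterministic algorithm
        return pm.fallbackNextViewID()
    }
    diff := uint64((curTimestamp-blockTimestamp)/viewChangeSlot + 1)
    nextViewID := diff + stuckBlockViewID
with viewChangeSlot = 45s, viewChangeDuration = 27s.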
The problem — cascading clock skew
The header.Time <= now + 15s ceiling is reusable on every block. There is
no rule that bounds how much each block can advance chain time relative to its
parent.
Failure scenario A — liveness loss for slow-clock validators
Let real network time be T, allowedFutureBlockTime = 15s.
Leader A has clock +15s. A proposes block N with header.Time = T+15.
Validator B with +0s skew checks T+15 <= B.now + 15 = T+15 → ✅ accepts.
Validator C with -1s skew checks T+15 <= C.now + 15 = T+14 → ❌ ErrFutureBlock.
C does not vote prepare. If enough validators have any non-positive skew,
the prepare quorum is short → 27 s timeoutConsensus fires → view change.
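The three checks above can be replayed with a small sketch. This is not the engine.go code; `acceptsHeader` and its parameters are hypothetical names, times are plain Unix seconds, and each validator's clock is modelled as real time plus a fixed skew.

```go
package main

// allowedFutureBlockTime mirrors the 15 s ceiling quoted above, in seconds.
const allowedFutureBlockTime int64 = 15

// acceptsHeader is a simplified stand-in for the header check (assumption:
// the real check has more conditions). It derives the validator's local
// clock from real time plus skew and applies
// parent.Time < header.Time <= now + allowedFutureBlockTime.
func acceptsHeader(parentTime, headerTime, realTime, skew int64) bool {
	now := realTime + skew
	return parentTime < headerTime && headerTime <= now+allowedFutureBlockTime
}
```

With header.Time = T+15, a validator with skew 0 accepts and a validator with skew -1 rejects, because the ceiling moves with each validator's own clock.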
Failure scenario B — ratcheted chain time, persistent honest stall
Continuing from A, suppose A's block did commit:
parent.Time = T+15.
Next leader E (honest, synced clock). E's now = T+2 (one BlockPeriod after commit).
proposer.go: timestamp (T+2) <= parentTime (T+15). E sleeps until time.Unix(T+16, 0), about 14 seconds of forced wait (T+16 minus T+2).
E proposes header.Time = T+16.
Validator C (-1s clock) checks T+16 <= C.now+15 = T+15 → ❌ still rejects.
The +15s skew has been transferred from leader A into the chain itself.
Validator C cannot participate until its own wall clock catches up to T+15.
Every subsequent leader pays a per-block sleep penalty roughly equal to the
accumulated skew.
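A minimal sketch of the proposer-side arithmetic, assuming the leader simply uses its local clock unless that would not exceed the parent timestamp. `nextProposal` is a hypothetical helper, not the proposer.go function, and all values are Unix seconds.

```go
package main

// nextProposal models the rule described above: write the local clock into
// the header, but never a timestamp at or below the parent's. It returns the
// timestamp the leader ends up writing and how many seconds it must sleep
// before it can do so.
func nextProposal(now, parentTime int64) (headerTime, sleep int64) {
	if now > parentTime {
		return now, 0 // clock already ahead of parent: no wait
	}
	headerTime = parentTime + 1
	return headerTime, headerTime - now
}
```

The sleep is parent.Time + 1 - now, so it grows linearly with however far the parent timestamp sits ahead of the honest leader's clock.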
Failure scenario C — view-change determinism breaks
getNextViewID is the deterministic mechanism that lets validators agree on
the next leader without coordination. When curTimestamp <= blockTimestamp,
it silently falls back to fallbackNextViewID(), which computes the next
view ID from viewChangingID (non-deterministic across validators).
Under sustained cascading skew, validators with slow clocks systematically
take the fallback path while validators with fast clocks take the timestamp
path. They can pick different next leaders for the same view change,
requiring extra view-change rounds to converge.
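To illustrate the split, here is a sketch of the timestamp path with the fallback reduced to a boolean. `nextViewID` is a hypothetical stand-in for getNextViewID; the fallback itself is not modelled because its inputs are not shown here.

```go
package main

// viewChangeSlot is the 45 s slot quoted above, in seconds.
const viewChangeSlot int64 = 45

// nextViewID mirrors the timestamp path. The boolean reports whether the
// deterministic path was taken; false means the caller would have to use
// the fallback instead.
func nextViewID(curTimestamp, blockTimestamp int64, stuckBlockViewID uint64) (uint64, bool) {
	if curTimestamp <= blockTimestamp {
		return 0, false
	}
	diff := uint64((curTimestamp-blockTimestamp)/viewChangeSlot + 1)
	return diff + stuckBlockViewID, true
}
```

For a block stamped T+15, a validator whose clock reads T+16 gets a view ID from the timestamp path, while one whose clock reads T+14 lands on the fallback, so the two need not agree.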
Failure scenario D — proposer.go indefinite sleep
There is no upper bound on the sleep. If parent.Time is far in the future
(legitimate cascading skew, or a misconfigured leader), the proposer goroutine
sleeps for the whole gap. consensus.readySignal is an unbuffered channel:
    consensus.readySignal = make(chan Proposal)
Anyone calling ReadySignal (view-change completion, setupForNewConsensus, finalCommit) blocks during the sleep, which can stall view-change recovery
on top of the original stall.
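The channel behaviour behind this can be shown in isolation. The sketch below uses a plain `chan int` rather than the consensus Proposal type, and `trySend` is a hypothetical helper that probes whether a send could complete right now.

```go
package main

// trySend attempts a non-blocking send and reports whether it completed.
// On an unbuffered channel it only succeeds if a receiver is already
// waiting; a sleeping receiver goroutine is not waiting.
func trySend(ch chan int) bool {
	select {
	case ch <- 1:
		return true
	default:
		return false
	}
}
```

`trySend(make(chan int))` reports false because nobody is receiving, which is the situation senders are in while the proposer goroutine sleeps. A blocking send in that state simply waits until the sleep ends.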
Impact surface
| Area | Affected | Mechanism |
| --- | --- | --- |
| Consensus liveness | Yes | ErrFutureBlock during BlockVerifier drops slow-clock validators out of the prepare quorum. |
| View changes | Yes | getNextViewID falls back to non-deterministic algorithm; different validators elect different next leaders. |
| Leader rotation | Indirect | NthNext* is computed over diverging view IDs. |
| Epoch transitions | Indirect | Stretched gap between last block of epoch and first block of next epoch delays staking-reward calc, shard-state propagation, crosslink injection. |
| Transaction processing | Yes | block.timestamp EVM opcode visible to contracts; AsyncBlockProposalTimeout = 9s can fire during proposer sleep. |
| Block import / sync | Yes | ErrFutureBlock halts the insert batch (core/blockchain_impl.go::insertChain). Sync peers get stuck on the offending block until their clock catches up. |
| Crosslinks | Indirect | Beacon chain crosslink confirmation delayed by stalled shard chain. |
| Fast finality (1 s blocks) | Acute | With BlockPeriod = 1s and allowedFutureBlockTime = 15s, a single +15s leader locks the chain ahead of real time by 15 block-periods. |
Attack model
| Attacker capability | Today |
| --- | --- |
| Set own clock +15s while leader | Pushes chain time +15s; validators with normal clock vote; validators with -ε clock drop out. No cost to attacker. |
| Set own clock -15s while leader | Block has stale timestamp; still accepted (proposer sleeps to make header.Time > parent.Time). |
| Repeat across leader slots | Chain time persistently runs ahead of real time at the +15s ceiling; honest leaders pay a per-attack stall cost. |
| Drive validators with -ε clocks out of quorum | Achievable by a single +15s leader if any slow validators exist. |
There is no slashing for any of these behaviors. Cost to attacker = 0;
cost to network = sustained skew + intermittent view changes + smart-contract block.timestamp drift.
Constraints any fix must respect
Deterministic across validators. Any rule that depends on each
validator's time.Now() is OK only if it is robust to ±1–2 s wall-clock
skew between honest validators.
Survives legitimate stalls. Mainnet can stall for several minutes
(view-change storms, partitions). After the stall, chain time must be able
to re-align with real time within a small number of blocks — ideally one.
No reliance on NTP correction. A fix should work with raw time.Now().
Gated by a new hardfork epoch field (sibling to TimestampValidationEpoch) so
rollout is coordinated.
Compatible with getNextViewID, or, if not, fix that function in the same
hardfork.
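Possible approaches
Approach 1 — Parent-aware skew "carry" (not recommended)
limit = max(time.Now()+15s, parent.Time + 15s).
❌ Removes the only existing protection against unbounded chain-time drift.
Each block can compound +15s. Chain time can race arbitrarily far ahead of
real time.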
Approach 2 — Bounded per-block forward step
Add a step rule on top of the existing wall-clock ceiling:
header.Time <= parent.Time + maxStep AND header.Time <= now + 15s
with maxStep in the 10–30 s range.
✅ Mathematically bounds cascading skew at maxStep per block.
✅ Honest leaders with synced clocks never trip it.
⚠️ Stall recovery is slow: chain time catches up at maxStep per block.
A 5-minute stall takes 300/maxStep blocks to re-align block.timestamp
with real time.
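A sketch of the Approach 2 rule and the recovery estimate, in Unix seconds. `validStep` and `blocksToRealign` are hypothetical names; the estimate follows the 300/maxStep simplification above and ignores the real time that passes while the catch-up blocks are produced.

```go
package main

// validStep applies the Approach 2 rule on top of the existing checks:
// parent < header, header <= parent + maxStep, header <= now + ceiling.
func validStep(parentTime, headerTime, now, maxStep, ceiling int64) bool {
	return headerTime > parentTime &&
		headerTime <= parentTime+maxStep &&
		headerTime <= now+ceiling
}

// blocksToRealign estimates how many blocks chain time needs to catch up
// after a stall when each block may advance at most maxStep seconds.
func blocksToRealign(stallSeconds, maxStep int64) int64 {
	return (stallSeconds + maxStep - 1) / maxStep // ceiling division
}
```

With maxStep = 12, a 300 s stall needs 25 blocks to re-align, which is the slow recovery the warning above refers to.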
Approach 3 — Bounded step with wall-clock fallback (recommended)
header.Time <= max(parent.Time + maxStep, now) AND header.Time <= now + 15s
In normal operation (parent ≈ now − BlockPeriod), parent + maxStep > now,
so the step arm is binding. Cascading skew is bounded.
In stall recovery (parent << now), now > parent + maxStep, so the wall
arm is binding. The block can jump straight to now — single-block recovery.
Leader-side mirror: clamp proposed timestamp to parent + maxStep only when
wall clock is moderately ahead of parent (clock skew). Above a threshold
(e.g. viewChangeTimeout + buffer = 30s) treat as stall and produce at wall.
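The validator-side rule can be sketched as follows, in Unix seconds. `validTimestamp` and its parameter names are assumptions made for illustration, not an existing function, and the leader-side clamp is not modelled.

```go
package main

// validTimestamp applies the Approach 3 rule:
//   parent < header <= max(parent + maxStep, now)  AND  header <= now + ceiling
func validTimestamp(parentTime, headerTime, now, maxStep, ceiling int64) bool {
	limit := parentTime + maxStep // step arm
	if now > limit {
		limit = now // wall arm takes over after a stall
	}
	return headerTime > parentTime &&
		headerTime <= limit &&
		headerTime <= now+ceiling
}
```

With parent = now - 2 and maxStep = 12 the limit is now + 10, tighter than the +15s ceiling. After a 300 s stall the limit is now itself, so a single block stamped with the current wall clock is accepted.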
Approach 4 — View-ID-aware step
Use header.ViewID - parent.ViewID (deterministic from header fields) to
compute an allowed step. Each missed view (view-change round) extends the
allowed step by viewChangeTimeout.
✅ Fully deterministic across validators.
✅ Handles arbitrary-length stalls in a single block.
❌ More invasive change; couples engine validation to viewID semantics.
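One possible shape for the view-ID-aware step, as a sketch only. It assumes consecutive blocks normally differ by exactly one view ID, so every additional increment counts as one missed view; `allowedStep` and its parameters are hypothetical, with times in seconds.

```go
package main

// allowedStep returns the maximum forward step for a header, extending the
// base maxStep by viewChangeTimeout for each missed view between parent and
// header. Assumption: a view ID delta of 1 means no view change happened.
func allowedStep(parentViewID, headerViewID uint64, maxStep, viewChangeTimeout int64) int64 {
	if headerViewID <= parentViewID+1 {
		return maxStep
	}
	missed := int64(headerViewID - parentViewID - 1)
	return maxStep + missed*viewChangeTimeout
}
```

Because both view IDs come from header fields, every validator computes the same allowance regardless of its own clock.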
Approach 5 — Cap proposer catch-up sleep (complementary)
Independent of the validation rule: cap the time.Sleep in proposer.go so
the proposer goroutine cannot be hung indefinitely on a far-future parent.
Doesn't fix the underlying skew but removes the secondary blocking of readySignal.
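A sketch of the cap, kept in whole seconds for simplicity. `cappedSleepSeconds` and `maxSleep` are hypothetical; a real change would presumably work on time.Duration values and re-check the parent after each capped sleep.

```go
package main

// cappedSleepSeconds returns how long the proposer should sleep before it
// can write a timestamp above parentTime, never more than maxSleep. The
// caller would sleep for the returned amount and then re-evaluate, rather
// than blocking for the whole gap in one call.
func cappedSleepSeconds(now, parentTime, maxSleep int64) int64 {
	wait := parentTime + 1 - now
	if wait <= 0 {
		return 0
	}
	if wait > maxSleep {
		return maxSleep
	}
	return wait
}
```

The cap bounds how long readySignal senders can be held up by one sleep; it does not change what timestamp the block eventually gets.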
Suggested combination
Approach 3 (bounded step with wall-clock fallback) + Approach 5 (cap
proposer sleep) under a new hardfork field BoundedTimestampStepEpoch. Also
fix getNextViewID to clamp curTimestamp = max(time.Now().Unix(), parent.Time)
so the deterministic path is always taken.
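A sketch of the clamped computation, assuming the fallback branch is dropped entirely once the clamp is in place. Note that with the current `curTimestamp <= blockTimestamp` condition a clamped value equal to the parent would still hit the fallback, so the comparison has to change as well. `clampedNextViewID` is a hypothetical name.

```go
package main

// viewChangeSlot is the 45 s slot quoted earlier, in seconds.
const viewChangeSlot int64 = 45

// clampedNextViewID clamps the local clock to at least the block timestamp
// and then always uses the timestamp path.
func clampedNextViewID(now, blockTimestamp int64, stuckBlockViewID uint64) uint64 {
	cur := now
	if cur < blockTimestamp {
		cur = blockTimestamp // clamp: max(now, parent.Time)
	}
	diff := uint64((cur-blockTimestamp)/viewChangeSlot + 1)
	return diff + stuckBlockViewID
}
```

A validator whose clock is behind the block timestamp now computes the same view ID as one whose clock is slightly ahead, as long as both are within the same 45 s slot.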
Open questions
What maxStep value? Candidates: 2 * BlockPeriod + 1s (per-epoch), or
a network-wide constant (e.g. 12 s) larger than the largest BlockPeriod.
What stall-detection threshold for the proposer's clamp-vs-no-clamp
decision? viewChangeTimeout (27s) + 3s = 30s is the natural choice.
Re-introduce NTP correction for proposal only (not validation) so
honest leaders track real time more tightly? Or accept that with bounded
step NTP is not strictly necessary?
Activation strategy — new epoch field, or piggyback on a future TimestampValidationEpoch activation on networks where it hasn't fired
yet? (Currently only Partner/Devnet has activated it.)
Reproducer (sketch)
A small two-node localnet harness that wraps time.Now() with a configurable
offset should be sufficient to reproduce Scenario A on current main with
TimestampValidationEpoch active (Partner network or localnet). A scripted
skew schedule across a 4-node committee reproduces Scenarios B and C.
Happy to author a reproducer harness as a follow-up PR if useful.
Acceptance criteria for a fix
New hardfork field gating the new rule (no behavior change before activation).
Cascading skew bounded at a small constant per block (≤ 2 × BlockPeriod).
Single-block recovery from arbitrarily long stalls.
Proposer goroutine never blocked > N seconds on a far-future parent (configurable cap).
getNextViewID stays on the deterministic timestamp path.
Tests covering stall recovery (short/medium/long), cascading-skew aftermath, and unknown ancestor.
References
Relevant files:
internal/chain/engine.go: VerifyHeader
consensus/proposer.go: WaitForConsensusReadyV2
consensus/view_change.go: getNextViewID, fallbackNextViewID
consensus/config.go: viewChangeTimeout, viewChangeSlot
core/blockchain_impl.go: insertChain (ErrFutureBlock handling)
internal/params/config.go: TimestampValidationEpoch