Improve validation during `read_ledger` chunk parsing #7492

eddyashton · 2025-11-28T14:34:01Z

While exploring behaviour with various corrupt ledger chunks, I spotted a significant regression from 5.x to 6.x: a truncation that left the offsets table completely missing was previously (correctly) an error, but now appears to read successfully (albeit returning an empty file).

This adds two extra checks on file open, plus some regression tests that manually corrupt some of our golden files.
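To illustrate the first of those checks, here is a minimal sketch on a toy format. It assumes the file starts with an 8-byte little-endian header holding the offsets table position; the constant `HEADER_SIZE` and the function `read_offsets_table_pos` are hypothetical names for illustration, not the code in `python/src/ccf/ledger.py`.

```python
import io
import struct

# Assumption for this sketch: the header is an 8-byte little-endian integer
# giving the byte position at which the offsets table starts.
HEADER_SIZE = 8


def read_offsets_table_pos(stream):
    """Return the offsets table position claimed by the header.

    Raises ValueError if the header is incomplete or if the claimed
    position lies beyond the end of the file (e.g. after truncation).
    """
    stream.seek(0, io.SEEK_END)
    file_size = stream.tell()
    stream.seek(0)

    header = stream.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE:
        raise ValueError(f"File is too short to hold a header ({file_size} bytes)")

    (table_pos,) = struct.unpack("<Q", header)
    if table_pos > file_size:
        raise ValueError(
            f"Offsets table position ({table_pos}) is beyond end of file ({file_size})"
        )
    return table_pos
```

Note that a file truncated exactly at the table position passes this check with an empty table, which is why a separate transaction-count check is still needed.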


Copilot

Pull request overview

This PR improves validation during ledger chunk parsing to address a regression from version 5.x to 6.x. Previously, a truncated ledger chunk with a completely missing offsets table would incorrectly appear to read successfully (returning an empty file) instead of raising an error.

Key Changes:

- Added validation to check that the offset table position claimed in the file header doesn't exceed the actual file size
- Added validation to verify the number of transactions found in the file matches the expected count from the filename
- Added comprehensive regression tests covering 9 different corruption scenarios
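The transaction-count check can be sketched as follows. This is an illustration only: the `ledger_<start>-<end>` filename pattern, the optional `.committed` suffix, and the helper names `expected_tx_count` and `check_tx_count` are assumptions for this example, not taken from the PR diff. The range is treated as inclusive, in line with the "File name range is inclusive" commit.

```python
import re

# Assumed filename shape for this sketch: ledger_<start>-<end>[.committed]
_RANGE_RE = re.compile(r"^ledger_(\d+)-(\d+)(?:\.committed)?$")


def expected_tx_count(filename):
    """Number of transactions implied by an inclusive <start>-<end> range."""
    match = _RANGE_RE.match(filename)
    if match is None:
        raise ValueError(f"Cannot parse sequence number range from {filename!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if end < start:
        raise ValueError(f"Invalid range in {filename!r}: {start}-{end}")
    # Both endpoints are included, so 1-14 covers 14 transactions, not 13.
    return end - start + 1


def check_tx_count(filename, found):
    """Raise ValueError if the number of parsed transactions is unexpected."""
    expected = expected_tx_count(filename)
    if found != expected:
        raise ValueError(
            f"{filename}: expected {expected} transactions, found {found}"
        )
```

With this in place, a chunk whose offsets table was truncated away (so zero transactions are found) is reported as an error rather than read as an empty file.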

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `python/src/ccf/ledger.py` | Adds two new validation checks in `LedgerChunk.__init__`: verifies offset table position is within file bounds and transaction count matches filename expectations |
| `tests/e2e_operations.py` | Adds regression tests that corrupt ledger chunks in 9 different ways and verify appropriate error messages are raised |
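As a rough sketch of the corruption-testing approach, the example below writes a toy "golden" chunk, truncates copies of it at several sizes, and records the error raised for each. The toy layout (8-byte header, payloads, then a table of 4-byte offsets) and every function name here are assumptions made for illustration; the actual tests in `tests/e2e_operations.py` operate on real golden files and the real parser.

```python
import struct
import tempfile
from pathlib import Path


def write_toy_chunk(path, entries):
    """Write header (table position), entry payloads, then a table of 4-byte offsets."""
    offsets, cursor = [], 8
    for entry in entries:
        offsets.append(cursor)
        cursor += len(entry)
    table = b"".join(struct.pack("<I", o) for o in offsets)
    path.write_bytes(struct.pack("<Q", cursor) + b"".join(entries) + table)


def parse_toy_chunk(path, expected_count):
    """Parse the toy chunk, applying the two validation checks."""
    data = path.read_bytes()
    if len(data) < 8:
        raise ValueError("file too short for header")
    (table_pos,) = struct.unpack("<Q", data[:8])
    if table_pos > len(data):
        raise ValueError("offsets table position is beyond end of file")
    count = len(data[table_pos:]) // 4
    if count != expected_count:
        raise ValueError(f"expected {expected_count} transactions, found {count}")
    return count


def collect_truncation_errors(sizes):
    """Truncate copies of a golden toy chunk and map each size to its error message."""
    errors = {}
    with tempfile.TemporaryDirectory() as tmp:
        golden = Path(tmp) / "golden_chunk"
        write_toy_chunk(golden, [b"tx-01", b"tx-02", b"tx-03"])
        # The uncorrupted file must still parse cleanly.
        assert parse_toy_chunk(golden, expected_count=3) == 3
        for size in sizes:
            corrupt = Path(tmp) / f"truncated_{size}"
            corrupt.write_bytes(golden.read_bytes()[:size])
            try:
                parse_toy_chunk(corrupt, expected_count=3)
            except ValueError as e:
                errors[size] = str(e)
    return errors
```

In this toy layout the table starts at byte 23, so truncating to 20 bytes trips the bounds check, while truncating to exactly 23 bytes leaves an empty table and is caught only by the count check.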


achamayou

LGTM but needs a Python format


eddyashton added 7 commits November 28, 2025 11:03

Raise error if offsets table is larger than file

b6f4ad2

Raise an error if the chunk contains fewer transactions than expected

7c92b2d

File name range is inclusive

630832c

Move full service testdata to subdirectory

d27bbab

Revert "Move full service testdata to subdirectory"

67c91f7

This reverts commit d27bbab.

Add regression tests for various chunk parsing errors

5b02373

Merge branch 'main' of https://github.com/microsoft/CCF into read_led…

79e1a50

…ger_offset_table_validation

eddyashton requested a review from a team as a code owner November 28, 2025 14:34

Copilot AI review requested due to automatic review settings November 28, 2025 14:34

Copilot started reviewing on behalf of eddyashton November 28, 2025 14:34

Copilot finished reviewing on behalf of eddyashton November 28, 2025 14:38

Copilot AI reviewed Nov 28, 2025


eddyashton added 5 commits November 28, 2025 14:50

Add test case truncating exactly _at_ offset table

fa3d390

Dead formatter

53cf40a

You can call me Al

18e3a35

Merge branch 'main' of https://github.com/microsoft/CCF into read_led…

a1fa908

…ger_offset_table_validation

Add test case of null-block, by reading public domain

b009db1

eddyashton commented Nov 28, 2025


achamayou approved these changes Nov 28, 2025


again

b647774

achamayou added the auto-backport (Automatically backport this PR to LTS branch) and 6.x-todo (PRs which should be backported to 6.x) labels Nov 28, 2025

achamayou merged commit 52751f1 into microsoft:main Nov 28, 2025
39 checks passed

eddyashton added a commit to eddyashton/CCF that referenced this pull request Dec 2, 2025

Improve validation during read_ledger chunk parsing (microsoft#7492)

5c66a05

(cherry picked from commit 52751f1)

eddyashton mentioned this pull request Dec 2, 2025

[release/6.x] Cherry pick: Improve validation during read_ledger chunk parsing (#7492) #7501

Merged

eddyashton added the backported (This PR was successfully backported to LTS branch) label Dec 2, 2025

eddyashton added a commit that referenced this pull request Dec 2, 2025

[release/6.x] Cherry pick: Improve validation during read_ledger ch…

81999f7

…unk parsing (#7492) (#7501)
