tests: make kgo-verifier assert contiguous offsets on non-compacted topics #11227

Open
jcsp opened this issue Jun 6, 2023 · 3 comments

Labels: area/cloud-storage (Shadow indexing subsystem), area/tests, kind/enhance (New feature or request)

Comments

jcsp (Contributor) commented Jun 6, 2023

We should default to expecting that Kafka offsets on consumed data will be contiguous, unless transactions or compaction are in use.

Combined with fault injection, this would help detect issues like #10782.

JIRA Link: CORE-1331
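
As a rough sketch of the behaviour being proposed (the type and function names below are hypothetical, not kgo-verifier's actual code), a per-partition contiguity check in the consume path might look like this:

```go
package main

import "fmt"

// partitionState tracks the last offset consumed from one partition.
// Hypothetical sketch; not kgo-verifier's real data structures.
type partitionState struct {
	lastOffset int64 // -1 until the first record is seen
}

// checkContiguous returns an error when the new offset does not directly
// follow the previous one. Gaps are tolerated when allowGaps is set, e.g.
// for compacted topics or when transactions are in use (aborted
// transactions and control batches legitimately leave holes).
func (p *partitionState) checkContiguous(offset int64, allowGaps bool) error {
	prev := p.lastOffset
	p.lastOffset = offset
	if allowGaps || prev < 0 {
		return nil
	}
	if offset != prev+1 {
		return fmt.Errorf("non-contiguous offsets: %d follows %d", offset, prev)
	}
	return nil
}

func main() {
	p := partitionState{lastOffset: -1}
	for _, off := range []int64{0, 1, 2, 4} {
		if err := p.checkContiguous(off, false); err != nil {
			fmt.Println(err) // prints: non-contiguous offsets: 4 follows 2
		}
	}
}
```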

jcsp added the kind/enhance, area/tests, and area/cloud-storage labels on Jun 6, 2023
nvartolomei (Contributor) commented:

Since we increment valid reads, and tests assert that num produced == valid reads, don't we have this covered already?

If we consume a record twice then valid reads will be larger than num produced. If we are missing records then valid reads will be less than num produced.

What's missing? @VladLazar

VladLazar (Contributor) commented:

I think what's missing is an end-of-test validation. I don't believe we automatically fail the test on the conditions you've highlighted.
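
A minimal Go sketch of the end-of-test validation being discussed, assuming hypothetical counter names rather than kgo-verifier's real status fields:

```go
package main

import "fmt"

// finalStatus holds the counters a verifier run might report at the end of
// a test. Hypothetical names; not kgo-verifier's actual status struct.
type finalStatus struct {
	produced   int64 // records successfully acked by the producer
	validReads int64 // consumed records that matched what was produced
}

// validate fails the run when the counters disagree: validReads > produced
// implies duplicate reads, validReads < produced implies missing data.
func validate(s finalStatus) error {
	switch {
	case s.validReads > s.produced:
		return fmt.Errorf("%d valid reads but only %d produced: duplicates", s.validReads, s.produced)
	case s.validReads < s.produced:
		return fmt.Errorf("%d valid reads but %d produced: missing data", s.validReads, s.produced)
	}
	return nil
}

func main() {
	// A run that lost one record would fail here at the end of the test.
	fmt.Println(validate(finalStatus{produced: 10, validReads: 9}))
}
```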

nvartolomei (Contributor) commented:

As mentioned, there are many cases where it is acceptable for offset continuity to be violated. We can't easily verify from the kgo-verifier side whether continuity is required, unless we add a flag and ask the user to specify whether it is expected.

However, users can easily check offset continuity in their tests by asserting on the number of valid reads and the maximum consumed offset. If you get 10 valid reads and 9 is the highest consumed offset, then the offsets are contiguous.

There was only one exception to that: if offsets rewound and then jumped over a gap equal to the rewind distance, you could still get 10 valid reads with 9 as the highest offset.

                          +- offsets moving backward
                          |     +- gap
                          v     v
Offsets:      0, 1, 2, 3, 2, 3, 6, 7, 8, 9
Data:         0, 1, 2, 3, 2, 3, 4, 7, 8, 9
Read count:   0, 1, 2, 3, 4, 5, 6, 7, 8, 9

This won't be a problem after redpanda-data/kgo-verifier#45
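
To make the check from the comment above concrete, here is a minimal Go sketch (hypothetical names, not actual kgo-verifier code) of the valid-reads-versus-max-offset assertion and the blind spot the diagram shows:

```go
package main

import "fmt"

// continuityHolds implements the simple test-side check described above:
// with offsets starting at 0, N valid reads and a highest consumed offset
// of N-1 suggest contiguous offsets.
func continuityHolds(validReads, maxConsumedOffset int64) bool {
	return validReads == maxConsumedOffset+1
}

func main() {
	// Happy case from the comment above: 10 valid reads, highest offset 9.
	fmt.Println(continuityHolds(10, 9)) // true

	// Blind spot: the rewind-plus-gap sequence in the diagram also produces
	// 10 valid reads with a highest offset of 9, so this same call would
	// return true there too: a false positive.
}
```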
