Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(kafka-reader): Separates kafka constructs for better separation of concerns & reuse. #14982

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

owen-d
Copy link
Member

@owen-d owen-d commented Nov 16, 2024

Refactor Kafka Reader and Partition Committer

This PR refactors the Kafka ingestion pipeline components to achieve better separation of concerns and cleaner interfaces. The changes split the existing Reader into three distinct components:

  1. A ReaderIfc that handles pure Kafka interactions
  2. A partitionCommitter that manages async offset commits (already existed, but I've refactored it to use the interface)
  3. A ReaderService that coordinates the overall lifecycle

Key Changes

Note: As of now, this just adds 3 new files with refactored versions -- if we're happy I'll replace the existing ones, but the current structure mirrors how I developed against the prior versions.

Disclaimer: I haven't hooked this up to testware yet; if we're happy with this direction, I'll do that next. edit: done.

New Reader Interface

Created a focused interface for Kafka operations:

type ReaderIfc interface {
    Topic() string
    Partition() int32
    ConsumerGroup() string
    FetchLastCommittedOffset(ctx context.Context) (int64, error)
    FetchPartitionOffset(ctx context.Context, position SpecialOffset) (int64, error)
    Poll(ctx context.Context) ([]Record, error)
    Commit(ctx context.Context, offset int64) error
    // Set the target offset for consumption. reads will begin from here.
    SetOffsetForConsumption(offset int64)
}

Improved Committer Design

  • Moved the committer to depend on ReaderIfc instead of containing Kafka logic
  • Maintains async commit functionality but delegates actual commits to the reader
  • Cleaner separation between offset management and Kafka operations
  • Added explicit SpecialOffset type to make partition offset fetching more type-safe

Service Lifecycle

  • ReaderService now coordinates between the reader and committer
  • Clearer separation of metrics between components
  • More explicit error handling throughout

Benefits

  • Maintainability: Responsibilities are clearly separated between Kafka operations, offset management, and service lifecycle
  • Flexibility: The ReaderIfc can be used independently of the service wrapper, which is how I started down this path. I'd like to reuse the underlying reader in the block-builder code.

@owen-d owen-d requested a review from a team as a code owner November 16, 2024 00:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant