
Add method to fetch messages in batch #1390


Open

krsoninikhil wants to merge 1 commit into main

Conversation

krsoninikhil

Since FetchMessage is already reading messages from a fetched batch, this new method just holds the messages until batchSize messages have been read.

Fixes #123
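
For context, a minimal sketch of what the proposed method might look like, written here as a standalone helper (the standalone form and all names are illustrative; only FetchMessage from the existing kafka-go API is assumed):

package example

import (
	"context"

	"github.com/segmentio/kafka-go"
)

// FetchMessageBatch blocks until batchSize messages have been read and
// returns them together. Since FetchMessage already reads from an
// internally fetched batch, this mostly just buffers the results.
func FetchMessageBatch(ctx context.Context, r *kafka.Reader, batchSize int) ([]kafka.Message, error) {
	msgBatch := make([]kafka.Message, 0, batchSize)
	i := 0
	for i < batchSize {
		msg, err := r.FetchMessage(ctx)
		if err != nil {
			return nil, err
		}
		msgBatch = append(msgBatch, msg)
		i++
	}
	return msgBatch, nil
}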

[diff excerpt under review, truncated]

		i++
	}
	return msgBatch, nil
}
@krsoninikhil (Author) Jun 8, 2025

Duplication of the code can be avoided by calling this method in FetchMessage. I'll refactor it once the approach gets reviewed.

@ghaninia Jun 8, 2025

What happens if the number of messages doesn't reach the desired batchSize?

You change the offset when the batch is processed; what happens if one of the messages in the batch fails? Is there any mechanism in place to handle that? Do you have any ideas for a fallback strategy?

@max107

@ghaninia for batch processing it's possible to use a manual ack, maybe?

If one of the messages fails, we can ack all the messages before the failed one, skipping the message with the problem.
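
A sketch of that partial-ack strategy, assuming kafka-go's CommitMessages and a hypothetical per-message process function (processBatch and process are made-up names):

package example

import (
	"context"

	"github.com/segmentio/kafka-go"
)

// processBatch applies process to each message in order and commits only
// the prefix that succeeded. The failed message is not committed, so the
// group resumes from it after a restart or rebalance.
func processBatch(ctx context.Context, r *kafka.Reader, batch []kafka.Message, process func(kafka.Message) error) error {
	for i, msg := range batch {
		if err := process(msg); err != nil {
			// Ack everything before the failed message, then surface the error.
			if i > 0 {
				if cerr := r.CommitMessages(ctx, batch[:i]...); cerr != nil {
					return cerr
				}
			}
			return err
		}
	}
	return r.CommitMessages(ctx, batch...)
}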

@krsoninikhil (Author)

> What happens if one of the messages in the batch fails?

We should let the consumer decide how they want to handle it. They can either commit or read again from the last commit. Let me know if there is a better approach to handle this.

> If one of the messages fails, we can ack all the messages before the failed one

This sounds good; we can do this. My only concern: if the consumer is processing batch by batch, it might be confusing behavior that part of the batch is committed, so the batch is neither aborted nor fully committed.

> What happens if the number of messages doesn't reach the desired batchSize?

I see. If the current code changes look okay, I can add a ticker with a timeout for a maximum wait time, so the method would return when some messages are available but not the full batch.
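
A sketch of that idea, using a context deadline as the maximum wait (the helper name and the partial-batch behavior are assumptions, not the final design):

package example

import (
	"context"
	"errors"
	"time"

	"github.com/segmentio/kafka-go"
)

// fetchBatchWithTimeout collects up to batchSize messages but stops
// waiting after maxWait, returning whatever has arrived by then.
func fetchBatchWithTimeout(ctx context.Context, r *kafka.Reader, batchSize int, maxWait time.Duration) ([]kafka.Message, error) {
	ctx, cancel := context.WithTimeout(ctx, maxWait)
	defer cancel()

	batch := make([]kafka.Message, 0, batchSize)
	for len(batch) < batchSize {
		msg, err := r.FetchMessage(ctx)
		if err != nil {
			// On deadline expiry, a partial batch is a success, not an error.
			if errors.Is(err, context.DeadlineExceeded) && len(batch) > 0 {
				return batch, nil
			}
			return batch, err
		}
		batch = append(batch, msg)
	}
	return batch, nil
}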

@krsoninikhil (Author)

@max107 @ghaninia let me know your thoughts.

@max107

@krsoninikhil in my opinion

> This sounds good; we can do this. My only concern: if the consumer is processing batch by batch, it might be confusing behavior that part of the batch is committed, so the batch is neither aborted nor fully committed.

It's absolutely normal behavior. If some process can't handle a message correctly, it should raise a panic / return an error / stop consuming messages and ack the last successfully processed message. What's next? Restart the consumer? Raise a panic? That's the developer's decision.

So we can imagine the following situation: we receive 3 of 10 (batchSize) messages, we haven't exceeded the deadline, and we successfully handle the first 2 messages but fail on the 3rd. We know we can't handle message 3, so we can ack 1 and 2. On the next attempt we receive messages 3, 4, 5, etc. and can try again.

> I see. If the current code changes look okay, I can add a ticker with a timeout for a maximum wait time, so the method would return when some messages are available but not the full batch.

That also sounds very good, because all "batch processing" is a compromise between a timeout and a batchSize.

Sorry, English is not my first language.

@max107 Jul 12, 2025

So, my simple batch consumer looks like:

package kafkamux

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"

	"github.com/rs/zerolog/log"
	"github.com/segmentio/kafka-go"
)

var (
	// Returned when the reader's internal queue cannot buffer a full batch.
	ErrSmallQueueCapacity = errors.New("batch size exceeds reader queue capacity")
)

// Reader is the subset of the *kafka.Reader API used by BatchConsumer.
type Reader interface {
	FetchMessage(ctx context.Context) (kafka.Message, error)
	CommitMessages(ctx context.Context, msgs ...kafka.Message) error
}

// BatchCallback handles a complete batch; returning an error prevents
// the batch from being committed.
type BatchCallback func(ctx context.Context, msgs []kafka.Message) error

// NewBatchConsumer checks that the reader's queue can buffer a full
// batch, then returns a consumer that flushes on size or timeout.
func NewBatchConsumer(
	reader Reader,
	batchSize int,
	duration time.Duration,
) (*BatchConsumer, error) {
	if r, ok := reader.(*kafka.Reader); ok && r.Config().QueueCapacity < batchSize {
		return nil, ErrSmallQueueCapacity
	}

	return &BatchConsumer{
		reader:    reader,
		batchSize: batchSize,
		messages:  make([]kafka.Message, 0, batchSize),
		duration:  duration,
	}, nil
}

type BatchConsumer struct {
	reader    Reader
	batchSize int
	duration  time.Duration
	l         sync.Mutex
	messages  []kafka.Message
}

// flush delivers the buffered messages to fn and commits them on success.
func (b *BatchConsumer) flush(ctx context.Context, fn BatchCallback) error {
	l := log.Ctx(ctx)

	b.l.Lock()
	defer b.l.Unlock()

	if len(b.messages) == 0 {
		return nil
	}

	if err := fn(ctx, b.messages); err != nil {
		l.Err(err).Msg("error in callback")
		return fmt.Errorf("batch callback: %w", err)
	}

	if err := b.reader.CommitMessages(ctx, b.messages...); err != nil {
		l.Err(err).Msg("error in commit messages")
		return fmt.Errorf("commit messages: %w", err)
	}

	b.messages = make([]kafka.Message, 0, b.batchSize)

	return nil
}

// Listen reads messages until ctx is cancelled or the reader fails,
// flushing whenever batchSize messages accumulate or duration elapses.
func (b *BatchConsumer) Listen(ctx context.Context, fn BatchCallback) error {
	l := log.Ctx(ctx)

	errCh := make(chan error, 1)

	msgCh := make(chan kafka.Message, b.batchSize)

	ticker := time.NewTicker(b.duration)
	defer ticker.Stop()

	go func() {
		defer close(msgCh)

		for {
			msg, err := b.reader.FetchMessage(ctx)
			if err != nil {
				errCh <- err
				return
			}

			msgCh <- msg
		}
	}()

	for {
		select {
		case readErr := <-errCh:
			l.Err(readErr).Msg("read message error, stop main loop")
			return nil

		case <-ctx.Done():
			l.Debug().Msg("context done, stop main loop")
			return nil

		case <-ticker.C:
			l.Debug().Int("messages_count", len(b.messages)).Msg("ticker flush")
			if err := b.flush(ctx, fn); err != nil {
				l.Err(err).Msg("error flushing messages")
				return fmt.Errorf("ticker flush: %w", err)
			}

		case msg, ok := <-msgCh:
			if !ok {
				// Fetch goroutine closed the channel; disable this case
				// so the select stops spinning on the closed channel.
				msgCh = nil
				continue
			}

			b.messages = append(b.messages, msg)

			if len(b.messages) < b.batchSize {
				l.Debug().Int("messages_count", len(b.messages)).Msg("not enough messages, wait")
				continue
			}

			l.Info().Int("messages_count", len(b.messages)).Msg("main loop flush")
			if err := b.flush(ctx, fn); err != nil {
				l.Err(err).Msg("error flushing messages")
				return fmt.Errorf("batch flush: %w", err)
			}

			ticker.Reset(b.duration)
		}
	}
}

This ^ consumer either acks all the messages or does nothing because an error happened. I'm not sure my solution is correct in general, but for my project, where processing is idempotent, it's okay.

So if we had the ability to fetch messages with a batchSize, it would help in many situations.
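
For illustration, a hypothetical wiring of the consumer above (the module path, broker, topic, group, and batch parameters are all made up):

package main

import (
	"context"
	"time"

	"github.com/rs/zerolog/log"
	"github.com/segmentio/kafka-go"

	"example.com/kafkamux" // hypothetical module path for the code above
)

func main() {
	reader := kafka.NewReader(kafka.ReaderConfig{
		Brokers:       []string{"localhost:9092"},
		GroupID:       "example-group",
		Topic:         "example-topic",
		QueueCapacity: 100, // must be >= the batch size passed below
	})
	defer reader.Close()

	// Flush every 10 messages or every 5 seconds, whichever comes first.
	consumer, err := kafkamux.NewBatchConsumer(reader, 10, 5*time.Second)
	if err != nil {
		log.Fatal().Err(err).Msg("create batch consumer")
	}

	err = consumer.Listen(context.Background(), func(ctx context.Context, msgs []kafka.Message) error {
		for _, m := range msgs {
			log.Info().Str("key", string(m.Key)).Msg("handling message")
		}
		return nil // a nil return commits the whole batch
	})
	if err != nil {
		log.Fatal().Err(err).Msg("listen")
	}
}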

@max107

@krsoninikhil please check PR #1395 with deadline timeout support.

Successfully merging this pull request may close these issues.

Way to get batch messages and commit if the batch is successful