We do custom partition/offset tracking, so we run our consumers with enable.auto.commit=False and group.id=None, and we call Consumer.subscribe with an on_assign callback. Usually, our initial call to subscribe results in on_assign being invoked with an empty partitions list. We added retry logic around our subscribe calls, and we generally get a valid partitions list in on_assign within a minute or so.
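For reference, the retry pattern described above looks roughly like the sketch below. It is a simplified illustration, not our exact code: the helper name subscribe_with_retry and the max_attempts and poll_timeout_s parameters are made up for this example. The consumer argument is expected to be a confluent_kafka.Consumer, whose callbacks (including on_assign) are only served from inside poll().

```python
def subscribe_with_retry(consumer, topics, max_attempts=5, poll_timeout_s=10.0):
    """Re-issue subscribe() until on_assign reports a non-empty partition list.

    Hypothetical helper: the name and parameters are illustrative only.
    Returns (partitions, buffered_messages, attempts_used).
    """
    assigned = []   # filled in by the on_assign callback
    buffered = []   # messages poll() happens to return while we wait

    def on_assign(_consumer, partitions):
        # Invoked from inside poll(); may receive an empty list.
        assigned[:] = partitions

    for attempt in range(1, max_attempts + 1):
        consumer.subscribe(topics, on_assign=on_assign)
        msg = consumer.poll(poll_timeout_s)  # drives the callback
        if msg is not None:
            buffered.append(msg)  # keep it so nothing is silently dropped
        if assigned:
            return list(assigned), buffered, attempt

    # Every attempt produced an empty assignment.
    return [], buffered, max_attempts
```

With a real consumer this would be called as subscribe_with_retry(consumer, ["my-topic"]), where the topic name is a placeholder.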
From this, it seems that the Consumer initially isn't ready to handle subscribe calls with on_assign but eventually becomes ready (perhaps once the initial metadata requests running in the background have completed?). Is there some way for users to detect this readiness and delay the call to subscribe until then? Our retries are a bit hacky and I'd like to remove them.
Or is our usage horribly wrong, so that we need a different approach to our offset management?
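To make the "different approach" concrete, one option might be to skip subscribe entirely: fetch the topic metadata and call assign with explicit offsets. The sketch below is only an assumption about what that could look like, not something confirmed to fit this use case. The helper name assign_from_metadata, the start_offsets mapping, and the partition_factory parameter are hypothetical; list_topics, assign, and TopicPartition are the confluent_kafka calls it relies on.

```python
def assign_from_metadata(consumer, topic, start_offsets, timeout_s=10.0,
                         partition_factory=None):
    """Assign every partition of `topic` manually, without subscribe().

    Hypothetical helper. `start_offsets` maps partition id -> offset taken
    from our own offset store; partitions missing from it get no explicit
    offset.
    """
    if partition_factory is None:
        # Imported lazily so the helper can be exercised with a stand-in.
        from confluent_kafka import TopicPartition
        partition_factory = TopicPartition

    # Blocks until metadata arrives or the timeout expires.
    metadata = consumer.list_topics(topic, timeout=timeout_s)
    topic_md = metadata.topics.get(topic)
    if topic_md is None or topic_md.error is not None:
        raise RuntimeError(f"no usable metadata for topic {topic!r}")

    partitions = []
    for pid in sorted(topic_md.partitions):
        if pid in start_offsets:
            partitions.append(partition_factory(topic, pid, start_offsets[pid]))
        else:
            partitions.append(partition_factory(topic, pid))

    consumer.assign(partitions)
    return partitions
```

Because list_topics blocks until the metadata is available, this variant would not need a retry loop around subscribe, at the cost of not reacting to partition changes automatically.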