kafka: bias fetch planner to prefer reading partitions with data in SI cache #6681

Open
@jcsp

Description

When a client issues a fetch RPC that touches many partitions, the kafka layer decides which partitions to include in the result. Currently this decision does not take into account which partitions can return fetch results cheaply from the SI cache and which would require an expensive segment hydration.

Fetches should advance roughly uniformly across all the partitions they cover, but within some bound we may bias the choice: for example, as long as no partition is more than one segment size ahead of the others, we can keep returning results from the partitions whose data is already in the cache.
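To make the proposed bias concrete, here is a minimal sketch of the selection rule described above. It is not the kafka layer's actual planner: `PartitionState`, `plan_fetch`, the `position` field (bytes of progress per partition), and the `cached` flag are all hypothetical names chosen for illustration. The sketch assumes the planner knows, per partition, how far the fetch has advanced and whether the next read would hit the SI cache.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PartitionState:
    """Hypothetical per-partition view used by the sketch below."""
    name: str
    position: int  # bytes of progress this consumer has made in the partition
    cached: bool   # True if the next read can be served from the SI cache


def plan_fetch(partitions: List[PartitionState],
               segment_size: int,
               max_partitions: int) -> List[str]:
    """Order partitions for a fetch, preferring cache hits within a bound.

    A cached partition is preferred only while its lead over the slowest
    partition is less than one segment size; everything else is served
    most-behind-first so that no partition is starved.
    """
    if not partitions:
        return []

    slowest = min(p.position for p in partitions)

    # Cheap reads: data is in the cache and the partition is not too far ahead.
    cheap = [p for p in partitions
             if p.cached and p.position - slowest < segment_size]
    cheap_ids = {id(p) for p in cheap}

    # Everything else may need hydration or is already past the bound.
    rest = [p for p in partitions if id(p) not in cheap_ids]

    ordered = (sorted(cheap, key=lambda p: p.position)
               + sorted(rest, key=lambda p: p.position))
    return [p.name for p in ordered[:max_partitions]]
```

With this rule, a cached partition that has run a full segment ahead stops being preferred, so the uncached, lagging partitions are read (and hydrated) before it advances further.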

Without this change, if segment_size * partition_count > cache_size, even a single consumer reading a topic may cause severe thrashing, with segments repeatedly hydrated into the cache and evicted from it.
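As a rough illustration of the condition above, the following sketch checks whether one segment per partition fits in the cache. The function name `exceeds_cache` and the sizes used (128 MiB segments, 100 partitions, a 10 GiB cache) are assumptions picked for the example, not values taken from any real configuration.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def exceeds_cache(segment_size: int, partition_count: int, cache_size: int) -> bool:
    """Return True if holding one segment per partition overflows the cache."""
    return segment_size * partition_count > cache_size


# Hypothetical sizes: 100 partitions of 128 MiB segments need 12800 MiB,
# which is more than a 10 GiB (10240 MiB) cache can hold at once.
thrash_risk = exceeds_cache(128 * MIB, 100, 10 * GIB)
```

In that scenario a consumer that round-robins across all partitions cannot keep every partition's current segment resident, which is the situation the proposed bias is meant to soften.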

This change will make no difference with one consumer per partition, but with one consumer reading many partitions under cache space pressure, it will be far more efficient.

JIRA Link: CORE-1041
