Skip to content

Improve performance of arrayDiff #1737

Open
@mhahnenberg

Description

@mhahnenberg

Is your feature request related to a problem? Please describe.
We've observed arrayDiff in some profiling we've done in production when subscribing to a topic with a nontrivial number of partitions (e.g. > 1000) with a relatively high fetch rate. After checking out the code it appears that arrayDiff is written in a way that is O(n*m) since it's iterating over a and calling b.indexOf() for each element.

Describe the solution you'd like
Using a Set could be an easy way to make each of the checks inside the loop O(1) instead of O(n) with indexOf, thus making the overall function O(n+m) rather than O(n*m).

Additional context
I've written a simple PR that does this, will attach after filing this issue.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions