Skip to content

[FLINK-39026][runtime] Remove misleading TODO about getPartitionedState overhead#27517

Open
nateab wants to merge 1 commit intoapache:masterfrom
nateab:remove-misleading-state-todo
Open

[FLINK-39026][runtime] Remove misleading TODO about getPartitionedState overhead#27517
nateab wants to merge 1 commit intoapache:masterfrom
nateab:remove-misleading-state-todo

Conversation

@nateab
Copy link
Contributor

@nateab nateab commented Feb 4, 2026

…te overhead

What is the purpose of the change

Remove TODO comments in AbstractKeyedStateBackend and StreamOperatorStateHandler that incorrectly suggested getPartitionedState() has significant overhead when updating namespaces.

The TODO (added in 2020) claimed:

"This method does a lot of work caching / retrieving states just to update the namespace. This method should be removed for the sake of namespaces being lazily fetched from the keyed state backend, or being set
on the state directly."

Benchmarking revealed this is incorrect. The current implementation already has an efficient fast-path cache, and calling setCurrentNamespace() directly is actually no faster (and sometimes slower) than
the current approach, most likely due to just test noise.

Benchmark Results

Heap Backend (5 trials × 2M iterations each)

getPartitionedState: min=130.4 ns, median=135.1 ns
setCurrentNamespace: min=134.0 ns, median=142.5 ns
"Optimized" direct approach is 5% SLOWER

RocksDB Backend (5 trials × 2M iterations each)

getPartitionedState: min=9006.0 ns, median=9861.0 ns
setCurrentNamespace: min=9414.2 ns, median=10101.1 ns
"Optimized" direct approach is 2.4% SLOWER

Why the Current Code is Already Optimal

The fast-path in getPartitionedState() (lines 434-437) already does minimal work:

if (lastName != null && lastName.equals(stateDescriptor.getName())) {
    lastState.setCurrentNamespace(namespace);
    return (S) lastState;
}

When hitting the cache (common case), it just does String.equals() + setCurrentNamespace() - which is essentially the same as the "optimized" direct approach.

Brief change log

  • Removed misleading TODO comment from AbstractKeyedStateBackend.getPartitionedState()
  • Removed misleading TODO comment from StreamOperatorStateHandler.getPartitionedState()
  • Added NamespaceStateBenchmark.java - benchmark harness for namespace state access patterns
  • Added NamespaceStateBenchmarkTest.java - benchmark runner with multiple trials

Verifying this change

This change added tests and can be verified as follows:

  • Run benchmarks: ./mvnw test -pl flink-test-utils-parent/flink-test-utils -Dtest=NamespaceStateBenchmarkTest -Denforcer.skip=true
  • The benchmark compares getPartitionedState() vs direct setCurrentNamespace() calls across multiple scenarios

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

@nateab nateab force-pushed the remove-misleading-state-todo branch from 8ddb873 to e9f1e09 Compare February 4, 2026 07:48
@nateab nateab changed the title [FLINK-16316][runtime] Remove misleading TODO about getPartitionedSta… [FLINK-39026][runtime] Remove misleading TODO about getPartitionedState overhead Feb 4, 2026
@flinkbot
Copy link
Collaborator

flinkbot commented Feb 4, 2026

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@nateab nateab force-pushed the remove-misleading-state-todo branch from e9f1e09 to 0a2e053 Compare February 4, 2026 08:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants