
[improve][pip] PIP-453: Improve the metadata store threading model#25173

Merged
BewareMyPower merged 4 commits into apache:master from BewareMyPower:bewaremypower/pip-453-meta-thread-model
Jan 26, 2026


@BewareMyPower
Contributor

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

@github-actions github-actions bot added the PIP and doc labels Jan 21, 2026
@BewareMyPower
Contributor Author

"metadata-store-10-1" #27 [72] prio=5 os_prio=0 cpu=17707.20ms elapsed=5258.47s
"configuration-metadata-store-13-1" #31 [76] prio=5 os_prio=0 cpu=17575.64ms elapsed=5257.48s
"bookkeeper-ml-scheduler-OrderedScheduler-0-0" #54 [98] prio=5 os_prio=0 cpu=105.04ms elapsed=5255.36s
"bookkeeper-ml-scheduler-OrderedScheduler-1-0" #55 [99] prio=5 os_prio=0 cpu=138.76ms elapsed=5255.36s
"bookkeeper-ml-scheduler-OrderedScheduler-2-0" #56 [100] prio=5 os_prio=0 cpu=113.28ms elapsed=5255.36s

The metadata store thread even spends much more CPU time than the bookkeeper-ml worker threads.

@lhotari
Member

lhotari commented Jan 22, 2026

The metadata store thread even spends much more CPU time than the bookkeeper-ml worker threads.

It's not related to this PIP, but there's also an opportunity to save CPU in deserialization. For consistency reasons, the MetadataStore cache entries expire after 10 minutes. A background refresh is also in use: if an entry has been used before it expires, a new refresh happens in the background between 5 and 10 minutes after the last refresh.
In many cases, nothing has changed since the last refresh, so the deserialization step is completely unnecessary. The previously deserialized value could be reused instead of deserializing again.
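
To illustrate the idea of skipping redundant deserialization, here is a minimal sketch. It is not Pulsar code: `ReusableEntry` and its `refresh` method are hypothetical names, and it assumes the raw payload from the previous refresh is kept so it can be compared byte for byte.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical cache slot that remembers the raw bytes behind its deserialized value. */
final class ReusableEntry<T> {
    private final Function<byte[], T> deserializer;
    private byte[] lastBytes; // raw payload seen at the previous refresh
    private T lastValue;      // value deserialized from lastBytes

    ReusableEntry(Function<byte[], T> deserializer) {
        this.deserializer = deserializer;
    }

    /** Returns the previous value when the payload is unchanged, otherwise deserializes. */
    synchronized T refresh(byte[] freshBytes) {
        if (lastBytes != null && Arrays.equals(lastBytes, freshBytes)) {
            return lastValue; // nothing changed: skip deserialization
        }
        lastValue = deserializer.apply(freshBytes);
        lastBytes = freshBytes.clone();
        return lastValue;
    }

    public static void main(String[] args) {
        AtomicInteger deserializations = new AtomicInteger();
        ReusableEntry<String> entry = new ReusableEntry<>(bytes -> {
            deserializations.incrementAndGet();
            return new String(bytes, StandardCharsets.UTF_8);
        });
        byte[] payload = "policy-v1".getBytes(StandardCharsets.UTF_8);
        entry.refresh(payload);
        entry.refresh(payload.clone()); // same content, so the deserializer is not called
        System.out.println(deserializations.get()); // prints 1
    }
}
```

Comparing a version number would be cheaper than comparing bytes, if the store returns one with each read; the byte comparison is used here only to keep the sketch self-contained.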

Another detail related to wasted CPU: when an entry gets modified, it gets refreshed twice, once in the put() callback and once in the notification handler:

@Override
public CompletableFuture<Void> put(String path, T value, EnumSet<CreateOption> options) {
    final byte[] bytes;
    try {
        bytes = serde.serialize(path, value);
    } catch (IOException e) {
        return CompletableFuture.failedFuture(e);
    }
    if (storeExtended != null) {
        return storeExtended.put(path, bytes, Optional.empty(), options).thenAccept(__ -> refresh(path));
    } else {
        return store.put(path, bytes, Optional.empty()).thenAccept(__ -> refresh(path));
    }
}

public void accept(Notification t) {
    String path = t.getPath();
    switch (t.getType()) {
        case Created:
        case Modified:
            refresh(path);
            break;
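
The double refresh can be reproduced with a toy model. This is only an illustration of the call pattern in the two snippets above, not the actual cache implementation: `ToyStore`, `ToyCache` and `refreshCount` are hypothetical names, and everything runs synchronously on one thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Toy stand-in for a metadata store that notifies listeners after each write. */
final class ToyStore {
    private final Map<String, String> data = new HashMap<>();
    private final List<Consumer<String>> listeners = new ArrayList<>();

    void registerListener(Consumer<String> listener) {
        listeners.add(listener);
    }

    void put(String path, String value) {
        data.put(path, value);
        listeners.forEach(l -> l.accept(path)); // stands in for a Created/Modified notification
    }

    String get(String path) {
        return data.get(path);
    }
}

/** Toy cache that refreshes after its own put and again on every notification. */
final class ToyCache implements Consumer<String> {
    private final ToyStore store;
    private final Map<String, String> cached = new HashMap<>();
    int refreshCount = 0;

    ToyCache(ToyStore store) {
        this.store = store;
        store.registerListener(this);
    }

    void put(String path, String value) {
        store.put(path, value); // the store notifies this cache, which refreshes once
        refresh(path);          // the put path then refreshes a second time
    }

    @Override
    public void accept(String path) {
        refresh(path);
    }

    private void refresh(String path) {
        refreshCount++;
        cached.put(path, store.get(path));
    }

    public static void main(String[] args) {
        ToyCache cache = new ToyCache(new ToyStore());
        cache.put("/example/path", "v1");
        System.out.println(cache.refreshCount); // prints 2: one put, two refreshes
    }
}
```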

@BewareMyPower
Contributor Author

There is a lot of room for improvement in the metadata store. I will open a series of PRs in the next few weeks. Regarding the cache, I think it should be okay because the cache refresh interval is 5 minutes, which is long enough. Actually, I don't think the cache here makes sense. The metadata store listener is able to update the cache.

What I can think of is that the cache can prevent outdated metadata in case the listener doesn't work correctly. But from that perspective, 5 minutes would be too long.

@BewareMyPower
Contributor Author

BewareMyPower commented Jan 22, 2026

BTW, I just ran a round of tests with the new threading model (along with a few improvements that move compute-intensive tasks out of the metadata store thread).

"metadata-store-serdes-OrderedExecutor-0-0" #27 [83] prio=5 os_prio=0 cpu=288.75ms
"metadata-store-serdes-OrderedExecutor-1-0" #28 [84] prio=5 os_prio=0 cpu=165.31ms
"metadata-store-serdes-OrderedExecutor-2-0" #29 [85] prio=5 os_prio=0 cpu=252.60ms
"metadata-store-batch-flusher-12-1" #30 [86] prio=5 os_prio=0 cpu=1217.19ms
"metadata-store-events-10-1" #59 [114] prio=5 os_prio=0 cpu=333.75ms
"main-EventThread" #32 [88] daemon prio=5 os_prio=0 cpu=89.03ms

Before this change, the tasks now executed by the batch-flusher thread (as well as by the three serdes threads) were executed by the events thread.
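
As a rough sketch of the idea (not the PIP's implementation), serialization work can be moved off a shared thread while keeping per-path ordering by pinning each path to one single-threaded worker. The thread names in the dump suggest an ordered executor is used; the version below approximates that with plain `java.util.concurrent`, and `SerdesPool`, `submit` and the `serdes-sketch-` thread names are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

/** Hypothetical pool that pins each path to one single-threaded worker. */
final class SerdesPool implements AutoCloseable {
    private final ExecutorService[] workers;

    SerdesPool(int size) {
        workers = new ExecutorService[size];
        for (int i = 0; i < size; i++) {
            final String name = "serdes-sketch-" + i;
            workers[i] = Executors.newSingleThreadExecutor(runnable -> {
                Thread thread = new Thread(runnable, name);
                thread.setDaemon(true);
                return thread;
            });
        }
    }

    /** Runs the task on the worker chosen by the path hash, so tasks for one path stay ordered. */
    <T> CompletableFuture<T> submit(String path, Supplier<T> task) {
        int index = Math.floorMod(path.hashCode(), workers.length);
        return CompletableFuture.supplyAsync(task, workers[index]);
    }

    @Override
    public void close() {
        for (ExecutorService worker : workers) {
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        try (SerdesPool pool = new SerdesPool(3)) {
            // The caller (for example an events thread) only hands off the work.
            String worker = pool.submit("/example/path",
                    () -> Thread.currentThread().getName()).join();
            System.out.println("deserialized on " + worker);
        }
    }
}
```

Hashing by path trades some load balance for ordering: two hot paths that land on the same worker still queue behind each other.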

@nodece
Member

nodece commented Jan 23, 2026

Makes sense. If multiple operations (flush/serialization/deserialization) depend on the same thread, that thread becomes a bottleneck. Using separate threads here looks good to me.

@BewareMyPower
Contributor Author

@lhotari @nodece @tjiuming The vote passed. Could you help merge this proposal?

Member

@lhotari lhotari left a comment

LGTM

@BewareMyPower BewareMyPower merged commit c51346f into apache:master Jan 26, 2026
20 checks passed
@BewareMyPower BewareMyPower deleted the bewaremypower/pip-453-meta-thread-model branch January 26, 2026 14:39
coderzc pushed a commit that referenced this pull request Jan 29, 2026
hshankar31 pushed a commit to datastax/pulsar that referenced this pull request Feb 5, 2026
hshankar31 pushed a commit to datastax/pulsar that referenced this pull request Feb 16, 2026
priyanshu-ctds pushed a commit to datastax/pulsar that referenced this pull request Feb 18, 2026
…datastax 4 0 ds 16 feb (#589)

* [improve][broker] Ensure metadata session state visibility and improve Unstable observability for ServiceUnitStateChannelImpl (apache#25132)

(cherry picked from commit 2a29be0)
(cherry picked from commit 85dc758)

* [improve][broker] Upgrade bookkeeper to 4.17.3 (apache#25166)

(cherry picked from commit 45def39)
(cherry picked from commit 333110a)

* fix license and pom file

* [fix][ml] Fix NoSuchElementException in EntryCountEstimator caused by a race condition (apache#25177)

(cherry picked from commit 9b70ba3)
(cherry picked from commit 9261869)

* [fix][test] Bump org.assertj:assertj-core from 3.27.5 to 3.27.7 (apache#25186)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit ce4ebea)
(cherry picked from commit 2c3402e)

* [improve][misc] Upgrade snappy version to 1.1.10.8 (apache#25182)

(cherry picked from commit b15f53b)
(cherry picked from commit 304fea1)

* [fix][proxy] Close client connection immediately when credentials expire and forwardAuthorizationCredentials is disabled (apache#25179)

(cherry picked from commit 3348470)
(cherry picked from commit c06f8ba)

* [fix][client] ControlledClusterFailover avoid unnecessary reconnection. (apache#25178)

Co-authored-by: fengwenzhi <fengwenzhi.max@bigo.sg>
(cherry picked from commit f0ec07b)
(cherry picked from commit b41488d)

* [fix][sec] Bump org.apache.solr:solr-core from 9.8.0 to 9.10.1 in /pulsar-io/solr (apache#25175)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit a2f888a)
(cherry picked from commit b532068)

* [improve][pip] PIP-453: Improve the metadata store threading model (apache#25173)

(cherry picked from commit c51346f)
(cherry picked from commit d81d6b3)

* [improve][client]Reduce unnecessary getPartitionedTopicMetadata requests when using retry and DLQ topics. (apache#25172)

(cherry picked from commit 52a4d5e)
(cherry picked from commit 71a3994)

* [fix][misc] Allow JWT tokens in OpenID auth without nbf claim (apache#25197)

(cherry picked from commit d630394)
(cherry picked from commit 2760ee9)

* [fix][sec] Exclude org.lz4:lz4-java and standardize on at.yawk.lz4-java to remediate CVE-2025-12183 and CVE-2025-66566 (apache#25198)

(cherry picked from commit c07f2ad)
(cherry picked from commit 2ac6d03)

* fix checkstyle failure and license issues

* [fix] [test] Upgrade docker-java to 3.7.0 (apache#25209)

(cherry picked from commit 4add84c)
(cherry picked from commit 92b5d55)

* [fix][client] Fix race condition between isDuplicate() and flushAsync() method in PersistentAcknowledgmentsGroupingTracker due to incorrect use Netty Recycler (apache#25208)

(cherry picked from commit 5aab2f0)
(cherry picked from commit 2206949)

* [improve][monitor] Upgrade OpenTelemetry to 1.56.0, Otel instrumentation to 2.21.0 and Otel semconv to 1.37.0 (apache#24994)

(cherry picked from commit 53162ff)
(cherry picked from commit a1d5b6c)

* [improve][proxy] Add regression tests for package upload with 'Expect: 100-continue' (apache#25211)

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
(cherry picked from commit e8fedb1)
(cherry picked from commit 0947639)

* fix license issues

* [fix][test]Fix flaky ExtensibleLoadManagerImplTest_testGetMetrics (apache#25216)

(cherry picked from commit 257d42f)
(cherry picked from commit a8eac91)

* [fix][broker] Fix ManagedCursorImpl.asyncDelete() method may lose previous async mark delete properties in race condition (apache#25165)

(cherry picked from commit bea6f8a)
(cherry picked from commit 4332a44)

* [fix][broker]Fix ledgerHandle failed to read by using new BK API (apache#25199)

(cherry picked from commit 6d51f88)
(cherry picked from commit 1631fed)

* [fix][client] Fix producer synchronous retry handling in failPendingMessages method (apache#25207)

(cherry picked from commit 611efe4)
(cherry picked from commit 30ae8fb)

* [fix][broker] Prevent missed topic changes in topic watchers and schedule periodic refresh with patternAutoDiscoveryPeriod interval (apache#25188)

(cherry picked from commit 2e06cc0)
(cherry picked from commit ba2a230)

* fix for complilation error

* [feat][io] implement pip-297 for jdbc sinks (apache#25195)

(cherry picked from commit 6f4ac21)
(cherry picked from commit 998a4b1)

* [fix][broker] Fix httpProxyTimeout config (apache#25223)

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
(cherry picked from commit 2d6ef6f)
(cherry picked from commit 3b39c7b)

* [improve][broker] Add strictAuthMethod to require explicit authentication method (apache#25185)

Co-authored-by: Ómar K. Yasin <oyasin@apple.com>
(cherry picked from commit bae9173)
(cherry picked from commit 27e34f6)

* [feat][client] oauth2 trustcerts file and timeouts (apache#24944)

(cherry picked from commit b789d82)
(cherry picked from commit f8827bd)

* [improve][client] Make authorization server metadata path configurable in AuthenticationOAuth2 (apache#25052)

Co-authored-by: hoguni <hoguni@lycorp.co.jp>
(cherry picked from commit 3cb7a7b)
(cherry picked from commit 705a99d)

* Revert "[improve][broker] Add strictAuthMethod to require explicit authentication method (apache#25185)"

This reverts commit 531eb91.

* [improve][broker] Add idle timeout support for http (apache#25224)

(cherry picked from commit 63220ea)
(cherry picked from commit 144e064)

* [fix][broker] Fix incomplete futures in topic property update/delete methods (apache#25228)

(cherry picked from commit c2ae180)
(cherry picked from commit ab05ca2)

* [fix][test] Fix Mockito stubbing race in TopicListServiceTest (apache#25227)

(cherry picked from commit c93dd7a)
(cherry picked from commit 38a126b)

* [improve][broker] Give the detail error msg when authenticate failed with AuthenticationException (apache#25221)

(cherry picked from commit 0a0ce6d)
(cherry picked from commit 2a46c70)

* [fix][client] Send all chunkMessageIds to broker for redelivery (apache#25229)

(cherry picked from commit 0a0ce6d)
(cherry picked from commit f49c7b2)

* [fix][broker] Fix transactionMetadataFuture completeExceptionally with null value (apache#25231)

Co-authored-by: 张浩 <zhanghao60@100.me>
(cherry picked from commit 0e5d424)
(cherry picked from commit 42283f4)

* uncomment distribution management in pom

* Reapply "[improve][meta] PIP-453: Improve the metadata store threading model (apache#25187)"

This reverts commit a6aab86.

(cherry picked from commit 4f9b2ca)

* [improve] Upgrade Netty to 4.1.131.Final (apache#25232)

(cherry picked from commit db91b93)
(cherry picked from commit a6c602a)

* [fix][test] fix testBatchMetadataStoreMetrics. (apache#25241)

(cherry picked from commit 9db31cc)
(cherry picked from commit abbd478)

* [fix][test] Fix ResourceQuotaCalculatorImplTest#testNeedToReportLocalUsage (apache#25247)

(cherry picked from commit 48774de)
(cherry picked from commit 9343837)

* [fix][meta] Metadata cache refresh might not take effect (apache#25246)

(cherry picked from commit 24eba10)
(cherry picked from commit 6d81292)

* fix pulsar-proxy unit test case failure

* fix safe delete URLRegexLookupProxyHandler which is not used

* Revert "fix safe delete URLRegexLookupProxyHandler which is not used"

This reverts commit 158fc14.

* Revert "fix pulsar-proxy unit test case failure"

This reverts commit 4efcf70.

* updated hardcoded newLookupProxyHandler in ProxyService for failing URLRegexLookupProxyHandlerTest

* Revert "[improve][monitor] Upgrade OpenTelemetry to 1.56.0, Otel instrumentation to 2.21.0 and Otel semconv to 1.37.0 (apache#24994)"

This reverts commit 5e5328e

* reverted lincense for opentelemetry upgrade changes

* Revert "updated hardcoded newLookupProxyHandler in ProxyService for failing URLRegexLookupProxyHandlerTest"

This reverts commit a4f07dc.

* reverted mismatch commits changes in ProxyConnection.java

* fix code-style issue

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Kai Wang <kwang@apache.org>
Co-authored-by: Yong Zhang <zhangyong1025.zy@gmail.com>
Co-authored-by: Lari Hotari <lhotari@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Zixuan Liu <nodeces@gmail.com>
Co-authored-by: Wenzhi Feng <thetumbled@apache.org>
Co-authored-by: fengwenzhi <fengwenzhi.max@bigo.sg>
Co-authored-by: Yunze Xu <xyzinfernity@163.com>
Co-authored-by: zhenJiangWang <zhenjiang427@gmail.com>
Co-authored-by: guptas6est <sanaya.gupta@est.tech>
Co-authored-by: Matteo Merli <mmerli@apache.org>
Co-authored-by: Oneby Wang <44369297+oneby-wang@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: fengyubiao <yubiao.feng@streamnative.io>
Co-authored-by: Malla Sandeep <sandeep.malla78@gmail.com>
Co-authored-by: Bäm <dev@sandchaschte.ch>
Co-authored-by: Omar Yasin <omarkj@icloud.com>
Co-authored-by: Ómar K. Yasin <oyasin@apple.com>
Co-authored-by: gulecroc <gu.lecroc@gmail.com>
Co-authored-by: Hideaki Oguni <22386882+izumo27@users.noreply.github.com>
Co-authored-by: hoguni <hoguni@lycorp.co.jp>
Co-authored-by: Cong Zhao <zhaocong@apache.org>
Co-authored-by: sinan liu <liusinan1998@gmail.com>
Co-authored-by: Jiwei Guo <technoboy@apache.org>
Co-authored-by: cai minjian <905767378@qq.com>
Co-authored-by: Hao Zhang <zhanghao1@cmss.chinamobile.com>
Co-authored-by: 张浩 <zhanghao60@100.me>
Co-authored-by: Lari Hotari <lhotari@apache.org>
Co-authored-by: zzb <48124861+zhaizhibo@users.noreply.github.com>
priyanshu-ctds pushed a commit to datastax/pulsar that referenced this pull request Feb 23, 2026