Skip to content

Conversation

anuragdy
Copy link

@anuragdy anuragdy commented Sep 25, 2025

This PR optimizes the DefaultThreadContextMap.getCopy() method to address performance issues related to megamorphic calls in the HashMap constructor, providing 30-50% performance improvements for non-empty maps.

The original implementation used new HashMap<>(map) which suffers from megamorphic call performance issues documented in:

The HashMap constructor with a Map parameter requires (3 + 4n) virtual method calls that become megamorphic when used with different map types, leading to 24-136% performance degradation compared to manual iteration.

Closes #3935.

@anuragdy anuragdy marked this pull request as ready for review September 25, 2025 04:59
@vy vy self-assigned this Sep 25, 2025
@vy vy added api Affects the public API performance Issues or PRs that affect performance, throughput, latency, etc. labels Sep 25, 2025
Copy link
Member

@vy vy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jengebr, @anuragdy, thanks so much for the report and the fix. I see that you provided a benchmark in #3935. Curious: Have you checked the existing log4j-perf-test benchmarks on thread context map? Are you able to reproduce the improvement figures using one of the benchmarks there? If so, which one(s)? If not, would you mind updating existing MDC benchmarks accordingly, please?

// Manual iteration avoids megamorphic virtual calls that prevent JIT optimization.
// The HashMap(Map) constructor requires (3 + 4n) virtual method calls that become
// megamorphic when used with different map types, leading to 24-136% performance
// degradation. Manual iteration creates monomorphic call sites that JIT can optimize.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// degradation. Manual iteration creates monomorphic call sites that JIT can optimize.
// degradation. Manual iteration creates monomorphic call sites that JIT can optimize.
// See https://bugs.openjdk.org/browse/JDK-8368292

Comment on lines +151 to +167
* Returns a mutable copy of the current thread context map.
* <p>
* This method has been optimized to avoid performance issues with the HashMap(Map) constructor
* that suffers from megamorphic call overhead (see JDK-8368292 and GitHub issue #3935).
* The optimization provides 30-50% performance improvement for non-empty maps by using
* manual iteration instead of the HashMap constructor.
* </p>
* <p>
* Benchmark results show significant improvements:
* <ul>
* <li>Map size 5: 37% faster (90.9ns → 57.5ns)</li>
* <li>Map size 75: 47% faster (1248ns → 667ns)</li>
* <li>Map size 1000: 39% faster (18625ns → 11452ns)</li>
* </ul>
* </p>
*
* @return a mutable copy of the current thread context map, never {@code null}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Returns a mutable copy of the current thread context map.
* <p>
* This method has been optimized to avoid performance issues with the HashMap(Map) constructor
* that suffers from megamorphic call overhead (see JDK-8368292 and GitHub issue #3935).
* The optimization provides 30-50% performance improvement for non-empty maps by using
* manual iteration instead of the HashMap constructor.
* </p>
* <p>
* Benchmark results show significant improvements:
* <ul>
* <li>Map size 5: 37% faster (90.9ns57.5ns)</li>
* <li>Map size 75: 47% faster (1248ns667ns)</li>
* <li>Map size 1000: 39% faster (18625ns11452ns)</li>
* </ul>
* </p>
*
* @return a mutable copy of the current thread context map, never {@code null}
* {@return a mutable copy of the current thread context map}

Comment on lines +8 to +21
<issue id="3935" link="https://github.com/apache/logging-log4j2/issues/3935"/>
<description format="asciidoc">
Optimize `DefaultThreadContextMap.getCopy()` performance by avoiding megamorphic calls in HashMap constructor.

The optimization replaces `new HashMap&lt;>(map)` with manual iteration to avoid JDK-8368292 performance issues.
This provides significant performance improvements:

* 37% faster for maps with 5 elements (90.9ns → 57.5ns)
* 47% faster for maps with 75 elements (1248ns → 667ns)
* 39% faster for maps with 1000 elements (18625ns → 11452ns)

The change maintains identical functional behavior while eliminating megamorphic virtual calls that prevent JIT optimization.
This optimization particularly benefits applications with heavy ThreadContext usage such as web applications and microservices.
</description>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<issue id="3935" link="https://github.com/apache/logging-log4j2/issues/3935"/>
<description format="asciidoc">
Optimize `DefaultThreadContextMap.getCopy()` performance by avoiding megamorphic calls in HashMap constructor.
The optimization replaces `new HashMap&lt;>(map)` with manual iteration to avoid JDK-8368292 performance issues.
This provides significant performance improvements:
* 37% faster for maps with 5 elements (90.9ns → 57.5ns)
* 47% faster for maps with 75 elements (1248ns → 667ns)
* 39% faster for maps with 1000 elements (18625ns → 11452ns)
The change maintains identical functional behavior while eliminating megamorphic virtual calls that prevent JIT optimization.
This optimization particularly benefits applications with heavy ThreadContext usage such as web applications and microservices.
</description>
<issue id="3935" link="https://github.com/apache/logging-log4j2/issues/3935"/>
<issue id="3939" link="https://github.com/apache/logging-log4j2/pull/3939"/>
<description format="asciidoc">
Optimize `DefaultThreadContextMap.getCopy()` performance by avoiding megamorphic calls in `HashMap` constructor
</description>

Comment on lines +136 to +138
/**
* Test getCopy() with empty map
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/**
* Test getCopy() with empty map
*/

Comment on lines +160 to +162
/**
* Test getCopy() with single-element map
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/**
* Test getCopy() with single-element map
*/

Comment on lines +213 to +217
for (Map.Entry<String, String> entry : testData.entrySet()) {
assertTrue(copy.containsKey(entry.getKey()));
assertEquals(entry.getValue(), copy.get(entry.getKey()));
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this duplicate assertEquals(testData, copy)?

Suggested change
for (Map.Entry<String, String> entry : testData.entrySet()) {
assertTrue(copy.containsKey(entry.getKey()));
assertEquals(entry.getValue(), copy.get(entry.getKey()));
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch - you're right. The individual entry checks are redundant since assertEquals(testData, copy) already verifies the maps are equal. I'll remove the loop and keep the single assertion.

Comment on lines +226 to +228
/**
* Test getCopy() returns proper HashMap type
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/**
* Test getCopy() returns proper HashMap type
*/

Comment on lines +243 to +245
/**
* Test getCopy() independence from original map
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/**
* Test getCopy() independence from original map
*/

@vy vy modified the milestone: 2.26.0 Sep 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api Affects the public API performance Issues or PRs that affect performance, throughput, latency, etc.
Projects
Status: To triage
Development

Successfully merging this pull request may close these issues.

Performance improvement to DefaultThreadContextMap
2 participants