-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Optimize DefaultThreadContextMap.getCopy() performance #3939
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: 2.x
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jengebr, @anuragdy, thanks so much for the report and the fix. I see that you provided a benchmark in #3935. Curious: Have you checked the existing log4j-perf-test
benchmarks on thread context map? Are you able to reproduce the improvement figures using one of the benchmarks there? If so, which one(s)? If not, would you mind updating existing MDC benchmarks accordingly, please?
// Manual iteration avoids megamorphic virtual calls that prevent JIT optimization. | ||
// The HashMap(Map) constructor requires (3 + 4n) virtual method calls that become | ||
// megamorphic when used with different map types, leading to 24-136% performance | ||
// degradation. Manual iteration creates monomorphic call sites that JIT can optimize. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// degradation. Manual iteration creates monomorphic call sites that JIT can optimize. | |
// degradation. Manual iteration creates monomorphic call sites that JIT can optimize. | |
// See https://bugs.openjdk.org/browse/JDK-8368292 |
* Returns a mutable copy of the current thread context map. | ||
* <p> | ||
* This method has been optimized to avoid performance issues with the HashMap(Map) constructor | ||
* that suffers from megamorphic call overhead (see JDK-8368292 and GitHub issue #3935). | ||
* The optimization provides 30-50% performance improvement for non-empty maps by using | ||
* manual iteration instead of the HashMap constructor. | ||
* </p> | ||
* <p> | ||
* Benchmark results show significant improvements: | ||
* <ul> | ||
* <li>Map size 5: 37% faster (90.9ns → 57.5ns)</li> | ||
* <li>Map size 75: 47% faster (1248ns → 667ns)</li> | ||
* <li>Map size 1000: 39% faster (18625ns → 11452ns)</li> | ||
* </ul> | ||
* </p> | ||
* | ||
* @return a mutable copy of the current thread context map, never {@code null} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Returns a mutable copy of the current thread context map. | |
* <p> | |
* This method has been optimized to avoid performance issues with the HashMap(Map) constructor | |
* that suffers from megamorphic call overhead (see JDK-8368292 and GitHub issue #3935). | |
* The optimization provides 30-50% performance improvement for non-empty maps by using | |
* manual iteration instead of the HashMap constructor. | |
* </p> | |
* <p> | |
* Benchmark results show significant improvements: | |
* <ul> | |
* <li>Map size 5: 37% faster (90.9ns → 57.5ns)</li> | |
* <li>Map size 75: 47% faster (1248ns → 667ns)</li> | |
* <li>Map size 1000: 39% faster (18625ns → 11452ns)</li> | |
* </ul> | |
* </p> | |
* | |
* @return a mutable copy of the current thread context map, never {@code null} | |
* {@return a mutable copy of the current thread context map} |
<issue id="3935" link="https://github.com/apache/logging-log4j2/issues/3935"/> | ||
<description format="asciidoc"> | ||
Optimize `DefaultThreadContextMap.getCopy()` performance by avoiding megamorphic calls in HashMap constructor. | ||
|
||
The optimization replaces `new HashMap<>(map)` with manual iteration to avoid JDK-8368292 performance issues. | ||
This provides significant performance improvements: | ||
|
||
* 37% faster for maps with 5 elements (90.9ns → 57.5ns) | ||
* 47% faster for maps with 75 elements (1248ns → 667ns) | ||
* 39% faster for maps with 1000 elements (18625ns → 11452ns) | ||
|
||
The change maintains identical functional behavior while eliminating megamorphic virtual calls that prevent JIT optimization. | ||
This optimization particularly benefits applications with heavy ThreadContext usage such as web applications and microservices. | ||
</description> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<issue id="3935" link="https://github.com/apache/logging-log4j2/issues/3935"/> | |
<description format="asciidoc"> | |
Optimize `DefaultThreadContextMap.getCopy()` performance by avoiding megamorphic calls in HashMap constructor. | |
The optimization replaces `new HashMap<>(map)` with manual iteration to avoid JDK-8368292 performance issues. | |
This provides significant performance improvements: | |
* 37% faster for maps with 5 elements (90.9ns → 57.5ns) | |
* 47% faster for maps with 75 elements (1248ns → 667ns) | |
* 39% faster for maps with 1000 elements (18625ns → 11452ns) | |
The change maintains identical functional behavior while eliminating megamorphic virtual calls that prevent JIT optimization. | |
This optimization particularly benefits applications with heavy ThreadContext usage such as web applications and microservices. | |
</description> | |
<issue id="3935" link="https://github.com/apache/logging-log4j2/issues/3935"/> | |
<issue id="3939" link="https://github.com/apache/logging-log4j2/pull/3939"/> | |
<description format="asciidoc"> | |
Optimize `DefaultThreadContextMap.getCopy()` performance by avoiding megamorphic calls in `HashMap` constructor | |
</description> |
/** | ||
* Test getCopy() with empty map | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/** | |
* Test getCopy() with empty map | |
*/ |
/** | ||
* Test getCopy() with single-element map | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/** | |
* Test getCopy() with single-element map | |
*/ |
for (Map.Entry<String, String> entry : testData.entrySet()) { | ||
assertTrue(copy.containsKey(entry.getKey())); | ||
assertEquals(entry.getValue(), copy.get(entry.getKey())); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this duplicate assertEquals(testData, copy)
?
for (Map.Entry<String, String> entry : testData.entrySet()) { | |
assertTrue(copy.containsKey(entry.getKey())); | |
assertEquals(entry.getValue(), copy.get(entry.getKey())); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch - you're right. The individual entry checks are redundant since assertEquals(testData, copy)
already verifies the maps are equal. I'll remove the loop and keep the single assertion.
/** | ||
* Test getCopy() returns proper HashMap type | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/** | |
* Test getCopy() returns proper HashMap type | |
*/ |
/** | ||
* Test getCopy() independence from original map | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/** | |
* Test getCopy() independence from original map | |
*/ |
This PR optimizes the
DefaultThreadContextMap.getCopy()
method to address performance issues related to megamorphic calls in the HashMap constructor, providing 30-50% performance improvements for non-empty maps.The original implementation used
new HashMap<>(map)
which suffers from megamorphic call performance issues documented in:The HashMap constructor with a Map parameter requires (3 + 4n) virtual method calls that become megamorphic when used with different map types, leading to 24-136% performance degradation compared to manual iteration.
Closes #3935.