@dougqh
Contributor

@dougqh dougqh commented Aug 28, 2025

What Does This Do

This change adds UTF-8 encoding caching to optimize v0.4 payload construction. (Support for v0.5 will likely be done separately, since it requires more significant changes.)

String#getBytes is already highly optimized, so these caches actually perform worse throughput-wise than an uncached encoding. However, the caches are useful in reducing the allocation caused by UTF-8 encoding. That gives the host application more headroom, which in turn allows its throughput to improve.

For tags, a "simple" cache is used. The simple cache is a single-level cache that uses hashing combined with linear probing. To avoid cache churn and unnecessary allocation of a CacheEntry, the simple cache uses a first-request marking scheme that typically avoids creating a CacheEntry for values that are requested only once. Eviction from the simple cache follows an LFU policy.
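
To make the scheme concrete, here is a minimal sketch of a single-level cache with linear probing, first-request marking, and LFU eviction inside the probe window. It is not the code from this PR: the class name, the `marks` array, and the way the mark is derived from the hash are assumptions for illustration. Only `SIZE = 128` and `MAX_PROBES = 4` mirror values quoted in the review.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative only: a hypothetical sketch, not the SimpleUtf8Cache from this PR. Not thread-safe. */
final class SketchSimpleUtf8Cache {
  private static final int SIZE = 128; // power of two, so MASK works as a cheap modulo
  private static final int MASK = SIZE - 1;
  private static final int MAX_PROBES = 4;

  private static final class Entry {
    final String value;
    final byte[] utf8;
    int hits; // frequency counter used for LFU eviction

    Entry(String value) {
      this.value = value;
      this.utf8 = value.getBytes(StandardCharsets.UTF_8);
    }
  }

  private final Entry[] entries = new Entry[SIZE];
  // marks[i] remembers the last value first-requested at bucket i (0 means "no mark").
  private final int[] marks = new int[SIZE];

  byte[] getUtf8(String value) {
    int hash = value.hashCode();
    int start = hash & MASK;

    // 1. Lookup with linear probing over at most MAX_PROBES buckets.
    for (int probe = 0; probe < MAX_PROBES; probe++) {
      Entry entry = entries[(start + probe) & MASK];
      if (entry != null && entry.value.equals(value)) {
        entry.hits++;
        return entry.utf8;
      }
    }

    // 2. First request: leave a mark and encode without allocating an Entry.
    int mark = hash | Integer.MIN_VALUE; // never 0, so 0 can mean "no mark"
    if (marks[start] != mark) {
      marks[start] = mark;
      return value.getBytes(StandardCharsets.UTF_8);
    }

    // 3. Repeat request: create an Entry, reusing an empty bucket or evicting
    //    the least frequently used entry within the probe window.
    int victim = start;
    for (int probe = 0; probe < MAX_PROBES; probe++) {
      int index = (start + probe) & MASK;
      if (entries[index] == null) {
        victim = index;
        break;
      }
      if (entries[index].hits < entries[victim].hits) {
        victim = index;
      }
    }
    Entry created = new Entry(value);
    entries[victim] = created;
    marks[start] = 0;
    return created.utf8;
  }
}
```

In this sketch a value requested once costs a single `getBytes` call and no `Entry`. Only the second request for the same value pays for the entry, and every later request reuses the cached array.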

For tag values, a more complicated generational cache is used. The generational cache combines the delayed CacheEntry creation logic of the simple cache with a 2nd level for resilience.

Frequently used entries are "promoted" to the higher-level cache. The 1st level of the generational cache uses an LFU eviction policy. The 2nd level of the generational cache uses an LRU eviction policy.

For the tag value use case, the generational policy provides a 2x increase in hit rate over the simple cache.
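
As a rough illustration of the two-level idea, the sketch below keeps a first level evicted by hit count (LFU) and a second level evicted by last use (LRU), and promotes an entry once it has been hit a few times. Everything here is an assumption for illustration: the class name, the sizes, the promotion threshold, and the logical clock are hypothetical, and the delayed CacheEntry creation is left out for brevity.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative only: a hypothetical two-level sketch, not the GenerationalUtf8Cache from this PR. */
final class SketchGenerationalUtf8Cache {
  private static final int EDEN_SIZE = 64; // hypothetical sizes, both powers of two
  private static final int TENURED_SIZE = 128;
  private static final int MAX_PROBES = 4;
  private static final int PROMOTION_HITS = 3; // hypothetical promotion threshold

  private static final class Entry {
    final String value;
    final byte[] utf8;
    int hits; // drives LFU eviction in the 1st level
    long lastUsed; // drives LRU eviction in the 2nd level

    Entry(String value) {
      this.value = value;
      this.utf8 = value.getBytes(StandardCharsets.UTF_8);
    }
  }

  private final Entry[] eden = new Entry[EDEN_SIZE];
  private final Entry[] tenured = new Entry[TENURED_SIZE];
  private long clock; // logical clock, advanced once per lookup in this sketch

  byte[] getUtf8(String value) {
    long now = ++clock;
    int hash = value.hashCode();

    int index = indexOf(tenured, hash, value);
    if (index >= 0) {
      tenured[index].lastUsed = now;
      return tenured[index].utf8;
    }

    index = indexOf(eden, hash, value);
    if (index >= 0) {
      Entry entry = eden[index];
      entry.hits++;
      entry.lastUsed = now;
      if (entry.hits >= PROMOTION_HITS) { // frequently used: promote to the 2nd level
        eden[index] = null;
        insert(tenured, hash, entry, false);
      }
      return entry.utf8;
    }

    Entry created = new Entry(value);
    created.lastUsed = now;
    insert(eden, hash, created, true);
    return created.utf8;
  }

  boolean isTenured(String value) {
    return indexOf(tenured, value.hashCode(), value) >= 0;
  }

  private static int indexOf(Entry[] table, int hash, String value) {
    int mask = table.length - 1;
    for (int probe = 0; probe < MAX_PROBES; probe++) {
      int index = (hash + probe) & mask;
      if (table[index] != null && table[index].value.equals(value)) {
        return index;
      }
    }
    return -1;
  }

  /** Uses an empty bucket if one is in the probe window, else evicts by LFU or LRU. */
  private static void insert(Entry[] table, int hash, Entry entry, boolean lfu) {
    int mask = table.length - 1;
    int victim = hash & mask;
    for (int probe = 0; probe < MAX_PROBES; probe++) {
      int index = (hash + probe) & mask;
      Entry candidate = table[index];
      if (candidate == null) {
        victim = index;
        break;
      }
      Entry current = table[victim];
      boolean worse = lfu ? candidate.hits < current.hits : candidate.lastUsed < current.lastUsed;
      if (worse) {
        victim = index;
      }
    }
    table[victim] = entry;
  }
}
```

The point of the second level in this sketch is that a burst of one-off values can only displace entries in the first level. Entries that have already proven themselves sit in the second level and are only displaced by other promoted entries.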

Motivation

This change reduces the memory allocation overhead caused by UTF-8 encoding.
In systems with ample heap, this change provides a 25-75% improvement in throughput by reducing GC cycles.
In systems with limited heap, this change is net neutral.

dougqh added 2 commits August 28, 2025 13:05
This change adds UTF-8 encoding caching to optimize v0.4 payload construction.

Since String#getBytes is intrinsified, these caches actually perform worse throughput-wise than an uncached conversion. However, the caches are useful in reducing allocation from UTF-8 conversions.

For tags, a "simple" cache is used. The simple cache is a single-level cache that uses hashing combined with linear probing. To avoid cache churn and unnecessary allocation of a CacheEntry, the simple cache uses a first-request marking scheme that typically avoids creating a CacheEntry for values that are requested only once. Eviction from the "simple" cache follows an LFU policy.

For tag values, a more complicated generational cache is used. The generational cache combines the delayed CacheEntry logic of the simple cache with a 2nd level for resilience.

Frequently used entries are "promoted" to the higher-level cache. The 1st level of the generational cache uses an LFU eviction policy. The 2nd level of the generational cache uses an LRU eviction policy.

For the tag value use case, the generational policy provided a 2x increase in hit rate over the simple cache.
@dougqh dougqh requested a review from a team as a code owner August 28, 2025 17:09
@dougqh dougqh requested a review from ygree August 28, 2025 17:09
@dougqh dougqh added the tag: performance Performance related changes label Aug 28, 2025
@github-actions
Contributor

github-actions bot commented Aug 28, 2025

Hi! 👋 Thanks for your pull request! 🎉

To help us review it, please make sure to:

  • Add at least one type, and one component or instrumentation label to the pull request

If you need help, please check our contributing guidelines.

@dougqh dougqh added the type: enhancement Enhancements and improvements label Aug 28, 2025
@datadog-datadog-prod-us1
Contributor

datadog-datadog-prod-us1 bot commented Aug 28, 2025

🎯 Code Coverage
Patch Coverage: 76.85%
Total Coverage: 57.65% (-0.06%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: c923194

@dougqh dougqh added the comp: platform Platform label Aug 28, 2025
@pr-commenter

pr-commenter bot commented Aug 28, 2025

Benchmarks

Startup

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master dougqh/utf8-caching
git_commit_date 1757081570 1757081757
git_commit_sha cb08250 c923194
release_version 1.54.0-SNAPSHOT~cb08250ba1 1.53.0-SNAPSHOT~c923194dea
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1757083505 1757083505
ci_job_id 1115824912 1115824912
ci_pipeline_id 75663079 75663079
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-3-ey9gmeag 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-3-ey9gmeag 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module Agent Agent
parent None None

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 48 metrics, 11 unstable metrics.

Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.53.0-SNAPSHOT~c923194dea, baseline=1.54.0-SNAPSHOT~cb08250ba1

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.049 s) : 0, 1049365
Total [baseline] (8.64 s) : 0, 8640041
Agent [candidate] (1.049 s) : 0, 1049146
Total [candidate] (8.622 s) : 0, 8622132
section iast
Agent [baseline] (1.181 s) : 0, 1180887
Total [baseline] (9.376 s) : 0, 9376388
Agent [candidate] (1.179 s) : 0, 1179162
Total [candidate] (9.345 s) : 0, 9344784
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.049 s -
Agent iast 1.181 s 131.522 ms (12.5%)
Total tracing 8.64 s -
Total iast 9.376 s 736.347 ms (8.5%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.049 s -
Agent iast 1.179 s 130.016 ms (12.4%)
Total tracing 8.622 s -
Total iast 9.345 s 722.652 ms (8.4%)
gantt
    title insecure-bank - break down per module: candidate=1.53.0-SNAPSHOT~c923194dea, baseline=1.54.0-SNAPSHOT~cb08250ba1

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.472 ms) : 0, 1472
crashtracking [candidate] (1.455 ms) : 0, 1455
BytebuddyAgent [baseline] (733.392 ms) : 0, 733392
BytebuddyAgent [candidate] (733.688 ms) : 0, 733688
GlobalTracer [baseline] (242.546 ms) : 0, 242546
GlobalTracer [candidate] (242.79 ms) : 0, 242790
AppSec [baseline] (30.168 ms) : 0, 30168
AppSec [candidate] (30.342 ms) : 0, 30342
Debugger [baseline] (6.12 ms) : 0, 6120
Debugger [candidate] (6.115 ms) : 0, 6115
Remote Config [baseline] (693.257 µs) : 0, 693
Remote Config [candidate] (683.187 µs) : 0, 683
Telemetry [baseline] (13.822 ms) : 0, 13822
Telemetry [candidate] (12.943 ms) : 0, 12943
section iast
crashtracking [baseline] (1.468 ms) : 0, 1468
crashtracking [candidate] (1.46 ms) : 0, 1460
BytebuddyAgent [baseline] (852.548 ms) : 0, 852548
BytebuddyAgent [candidate] (851.452 ms) : 0, 851452
GlobalTracer [baseline] (233.84 ms) : 0, 233840
GlobalTracer [candidate] (232.649 ms) : 0, 232649
AppSec [baseline] (26.277 ms) : 0, 26277
AppSec [candidate] (28.41 ms) : 0, 28410
Debugger [baseline] (7.589 ms) : 0, 7589
Debugger [candidate] (5.833 ms) : 0, 5833
Remote Config [baseline] (606.262 µs) : 0, 606
Remote Config [candidate] (601.351 µs) : 0, 601
Telemetry [baseline] (9.034 ms) : 0, 9034
Telemetry [candidate] (8.31 ms) : 0, 8310
IAST [baseline] (28.455 ms) : 0, 28455
IAST [candidate] (29.443 ms) : 0, 29443
Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.53.0-SNAPSHOT~c923194dea, baseline=1.54.0-SNAPSHOT~cb08250ba1

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.053 s) : 0, 1052555
Total [baseline] (10.66 s) : 0, 10659586
Agent [candidate] (1.055 s) : 0, 1054881
Total [candidate] (10.779 s) : 0, 10779448
section appsec
Agent [baseline] (1.224 s) : 0, 1223817
Total [baseline] (10.825 s) : 0, 10824728
Agent [candidate] (1.226 s) : 0, 1225537
Total [candidate] (10.835 s) : 0, 10834796
section iast
Agent [baseline] (1.18 s) : 0, 1179517
Total [baseline] (10.926 s) : 0, 10925848
Agent [candidate] (1.197 s) : 0, 1196517
Total [candidate] (10.956 s) : 0, 10956493
section profiling
Agent [baseline] (1.2 s) : 0, 1200377
Total [baseline] (10.92 s) : 0, 10920058
Agent [candidate] (1.197 s) : 0, 1197289
Total [candidate] (10.957 s) : 0, 10956898
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.053 s -
Agent appsec 1.224 s 171.262 ms (16.3%)
Agent iast 1.18 s 126.962 ms (12.1%)
Agent profiling 1.2 s 147.822 ms (14.0%)
Total tracing 10.66 s -
Total appsec 10.825 s 165.142 ms (1.5%)
Total iast 10.926 s 266.262 ms (2.5%)
Total profiling 10.92 s 260.472 ms (2.4%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.055 s -
Agent appsec 1.226 s 170.656 ms (16.2%)
Agent iast 1.197 s 141.636 ms (13.4%)
Agent profiling 1.197 s 142.408 ms (13.5%)
Total tracing 10.779 s -
Total appsec 10.835 s 55.348 ms (0.5%)
Total iast 10.956 s 177.045 ms (1.6%)
Total profiling 10.957 s 177.45 ms (1.6%)
gantt
    title petclinic - break down per module: candidate=1.53.0-SNAPSHOT~c923194dea, baseline=1.54.0-SNAPSHOT~cb08250ba1

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.489 ms) : 0, 1489
crashtracking [candidate] (1.466 ms) : 0, 1466
BytebuddyAgent [baseline] (736.799 ms) : 0, 736799
BytebuddyAgent [candidate] (736.685 ms) : 0, 736685
GlobalTracer [baseline] (243.911 ms) : 0, 243911
GlobalTracer [candidate] (243.963 ms) : 0, 243963
AppSec [baseline] (30.513 ms) : 0, 30513
AppSec [candidate] (30.2 ms) : 0, 30200
Debugger [baseline] (6.091 ms) : 0, 6091
Debugger [candidate] (6.099 ms) : 0, 6099
Remote Config [baseline] (698.916 µs) : 0, 699
Remote Config [candidate] (684.464 µs) : 0, 684
Telemetry [baseline] (11.829 ms) : 0, 11829
Telemetry [candidate] (14.567 ms) : 0, 14567
section appsec
crashtracking [baseline] (1.478 ms) : 0, 1478
crashtracking [candidate] (1.46 ms) : 0, 1460
BytebuddyAgent [baseline] (755.039 ms) : 0, 755039
BytebuddyAgent [candidate] (756.641 ms) : 0, 756641
GlobalTracer [baseline] (235.291 ms) : 0, 235291
GlobalTracer [candidate] (235.871 ms) : 0, 235871
AppSec [baseline] (168.4 ms) : 0, 168400
AppSec [candidate] (170.346 ms) : 0, 170346
Debugger [baseline] (8.913 ms) : 0, 8913
Debugger [candidate] (5.788 ms) : 0, 5788
Remote Config [baseline] (627.226 µs) : 0, 627
Remote Config [candidate] (620.475 µs) : 0, 620
Telemetry [baseline] (9.29 ms) : 0, 9290
Telemetry [candidate] (10.031 ms) : 0, 10031
IAST [baseline] (23.645 ms) : 0, 23645
IAST [candidate] (23.588 ms) : 0, 23588
section iast
crashtracking [baseline] (1.472 ms) : 0, 1472
crashtracking [candidate] (1.478 ms) : 0, 1478
BytebuddyAgent [baseline] (851.0 ms) : 0, 851000
BytebuddyAgent [candidate] (865.571 ms) : 0, 865571
GlobalTracer [baseline] (233.08 ms) : 0, 233080
GlobalTracer [candidate] (235.541 ms) : 0, 235541
AppSec [baseline] (26.27 ms) : 0, 26270
AppSec [candidate] (27.605 ms) : 0, 27605
Debugger [baseline] (6.671 ms) : 0, 6671
Debugger [candidate] (6.67 ms) : 0, 6670
Remote Config [baseline] (608.66 µs) : 0, 609
Remote Config [candidate] (599.662 µs) : 0, 600
Telemetry [baseline] (8.345 ms) : 0, 8345
Telemetry [candidate] (8.351 ms) : 0, 8351
IAST [baseline] (30.96 ms) : 0, 30960
IAST [candidate] (29.387 ms) : 0, 29387
section profiling
crashtracking [baseline] (1.434 ms) : 0, 1434
crashtracking [candidate] (1.431 ms) : 0, 1431
BytebuddyAgent [baseline] (762.988 ms) : 0, 762988
BytebuddyAgent [candidate] (761.431 ms) : 0, 761431
GlobalTracer [baseline] (223.142 ms) : 0, 223142
GlobalTracer [candidate] (222.721 ms) : 0, 222721
AppSec [baseline] (30.765 ms) : 0, 30765
AppSec [candidate] (30.388 ms) : 0, 30388
Debugger [baseline] (6.232 ms) : 0, 6232
Debugger [candidate] (6.231 ms) : 0, 6231
Remote Config [baseline] (732.382 µs) : 0, 732
Remote Config [candidate] (690.823 µs) : 0, 691
Telemetry [baseline] (16.553 ms) : 0, 16553
Telemetry [candidate] (16.311 ms) : 0, 16311
ProfilingAgent [baseline] (108.002 ms) : 0, 108002
ProfilingAgent [candidate] (107.603 ms) : 0, 107603
Profiling [baseline] (108.665 ms) : 0, 108665
Profiling [candidate] (108.253 ms) : 0, 108253

Load

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master dougqh/utf8-caching
git_commit_date 1757081570 1757081757
git_commit_sha cb08250 c923194
release_version 1.54.0-SNAPSHOT~cb08250ba1 1.53.0-SNAPSHOT~c923194dea
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1757083179 1757083179
ci_job_id 1115824913 1115824913
ci_pipeline_id 75663079 75663079
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-4-h9vumr0w 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-4-h9vumr0w 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 2 performance improvements and 1 performance regressions! Performance is the same for 9 metrics, 12 unstable metrics.

| scenario | Δ mean http_req_duration | Δ mean throughput | candidate mean http_req_duration | candidate mean throughput | baseline mean http_req_duration | baseline mean throughput |
| --- | --- | --- | --- | --- | --- | --- |
| scenario:load:insecure-bank:no_agent:high_load | worse: [+291.436µs; +401.732µs] or [+7.070%; +9.746%] | unstable: [-199.874op/s; +31.124op/s] or [-17.978%; +2.800%] | 4.469ms | 1027.406op/s | 4.122ms | 1111.781op/s |
| scenario:load:petclinic:appsec:high_load | better: [-2.049ms; -1.145ms] or [-4.234%; -2.367%] | unstable: [-3.597op/s; +10.172op/s] or [-3.719%; +10.518%] | 46.792ms | 100.000op/s | 48.389ms | 96.713op/s |
| scenario:load:petclinic:profiling:high_load | better: [-1.927ms; -0.982ms] or [-3.988%; -2.032%] | unstable: [-4.060op/s; +10.085op/s] or [-4.193%; +10.415%] | 46.867ms | 99.850op/s | 48.322ms | 96.838op/s |
Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.53.0-SNAPSHOT~c923194dea, baseline=1.54.0-SNAPSHOT~cb08250ba1
    dateFormat X
    axisFormat %s
section baseline
no_agent (37.777 ms) : 37477, 38076
.   : milestone, 37777,
appsec (48.389 ms) : 47967, 48811
.   : milestone, 48389,
code_origins (44.699 ms) : 44316, 45081
.   : milestone, 44699,
iast (44.868 ms) : 44478, 45259
.   : milestone, 44868,
profiling (48.322 ms) : 47868, 48775
.   : milestone, 48322,
tracing (43.549 ms) : 43192, 43905
.   : milestone, 43549,
section candidate
no_agent (36.958 ms) : 36654, 37262
.   : milestone, 36958,
appsec (46.792 ms) : 46374, 47209
.   : milestone, 46792,
code_origins (45.682 ms) : 45273, 46091
.   : milestone, 45682,
iast (44.765 ms) : 44382, 45148
.   : milestone, 44765,
profiling (46.867 ms) : 46443, 47291
.   : milestone, 46867,
tracing (44.16 ms) : 43793, 44527
.   : milestone, 44160,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 37.777 ms [37.477 ms, 38.076 ms] -
appsec 48.389 ms [47.967 ms, 48.811 ms] 10.612 ms (28.1%)
code_origins 44.699 ms [44.316 ms, 45.081 ms] 6.922 ms (18.3%)
iast 44.868 ms [44.478 ms, 45.259 ms] 7.092 ms (18.8%)
profiling 48.322 ms [47.868 ms, 48.775 ms] 10.545 ms (27.9%)
tracing 43.549 ms [43.192 ms, 43.905 ms] 5.772 ms (15.3%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 36.958 ms [36.654 ms, 37.262 ms] -
appsec 46.792 ms [46.374 ms, 47.209 ms] 9.834 ms (26.6%)
code_origins 45.682 ms [45.273 ms, 46.091 ms] 8.724 ms (23.6%)
iast 44.765 ms [44.382 ms, 45.148 ms] 7.807 ms (21.1%)
profiling 46.867 ms [46.443 ms, 47.291 ms] 9.909 ms (26.8%)
tracing 44.16 ms [43.793 ms, 44.527 ms] 7.202 ms (19.5%)
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.53.0-SNAPSHOT~c923194dea, baseline=1.54.0-SNAPSHOT~cb08250ba1
    dateFormat X
    axisFormat %s
section baseline
no_agent (4.122 ms) : 4071, 4173
.   : milestone, 4122,
iast (9.243 ms) : 9087, 9399
.   : milestone, 9243,
iast_FULL (13.887 ms) : 13614, 14161
.   : milestone, 13887,
iast_GLOBAL (10.557 ms) : 10371, 10744
.   : milestone, 10557,
profiling (8.769 ms) : 8628, 8911
.   : milestone, 8769,
tracing (7.515 ms) : 7407, 7623
.   : milestone, 7515,
section candidate
no_agent (4.469 ms) : 4417, 4521
.   : milestone, 4469,
iast (9.578 ms) : 9416, 9741
.   : milestone, 9578,
iast_FULL (14.351 ms) : 14064, 14638
.   : milestone, 14351,
iast_GLOBAL (10.321 ms) : 10132, 10510
.   : milestone, 10321,
profiling (8.893 ms) : 8747, 9038
.   : milestone, 8893,
tracing (7.672 ms) : 7562, 7781
.   : milestone, 7672,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 4.122 ms [4.071 ms, 4.173 ms] -
iast 9.243 ms [9.087 ms, 9.399 ms] 5.121 ms (124.2%)
iast_FULL 13.887 ms [13.614 ms, 14.161 ms] 9.765 ms (236.9%)
iast_GLOBAL 10.557 ms [10.371 ms, 10.744 ms] 6.435 ms (156.1%)
profiling 8.769 ms [8.628 ms, 8.911 ms] 4.647 ms (112.7%)
tracing 7.515 ms [7.407 ms, 7.623 ms] 3.393 ms (82.3%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 4.469 ms [4.417 ms, 4.521 ms] -
iast 9.578 ms [9.416 ms, 9.741 ms] 5.11 ms (114.3%)
iast_FULL 14.351 ms [14.064 ms, 14.638 ms] 9.883 ms (221.1%)
iast_GLOBAL 10.321 ms [10.132 ms, 10.51 ms] 5.852 ms (131.0%)
profiling 8.893 ms [8.747 ms, 9.038 ms] 4.424 ms (99.0%)
tracing 7.672 ms [7.562 ms, 7.781 ms] 3.203 ms (71.7%)

Dacapo

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master dougqh/utf8-caching
git_commit_date 1757081570 1757081757
git_commit_sha cb08250 c923194
release_version 1.54.0-SNAPSHOT~cb08250ba1 1.53.0-SNAPSHOT~c923194dea
See matching parameters
Baseline Candidate
application biojava biojava
ci_job_date 1757083706 1757083706
ci_job_id 1115824914 1115824914
ci_pipeline_id 75663079 75663079
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-1-rjlz43am 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-1-rjlz43am 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.

Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.53.0-SNAPSHOT~c923194dea, baseline=1.54.0-SNAPSHOT~cb08250ba1
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.473 ms) : 1461, 1484
.   : milestone, 1473,
appsec (3.621 ms) : 3406, 3836
.   : milestone, 3621,
iast (2.2 ms) : 2137, 2264
.   : milestone, 2200,
iast_GLOBAL (2.242 ms) : 2178, 2306
.   : milestone, 2242,
profiling (2.054 ms) : 2003, 2106
.   : milestone, 2054,
tracing (2.037 ms) : 1987, 2087
.   : milestone, 2037,
section candidate
no_agent (1.472 ms) : 1460, 1484
.   : milestone, 1472,
appsec (3.608 ms) : 3395, 3821
.   : milestone, 3608,
iast (2.202 ms) : 2139, 2266
.   : milestone, 2202,
iast_GLOBAL (2.246 ms) : 2182, 2310
.   : milestone, 2246,
profiling (2.066 ms) : 2013, 2119
.   : milestone, 2066,
tracing (2.016 ms) : 1966, 2065
.   : milestone, 2016,
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.473 ms [1.461 ms, 1.484 ms] -
appsec 3.621 ms [3.406 ms, 3.836 ms] 2.148 ms (145.9%)
iast 2.2 ms [2.137 ms, 2.264 ms] 727.472 µs (49.4%)
iast_GLOBAL 2.242 ms [2.178 ms, 2.306 ms] 768.981 µs (52.2%)
profiling 2.054 ms [2.003 ms, 2.106 ms] 581.549 µs (39.5%)
tracing 2.037 ms [1.987 ms, 2.087 ms] 563.985 µs (38.3%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.472 ms [1.46 ms, 1.484 ms] -
appsec 3.608 ms [3.395 ms, 3.821 ms] 2.136 ms (145.1%)
iast 2.202 ms [2.139 ms, 2.266 ms] 730.271 µs (49.6%)
iast_GLOBAL 2.246 ms [2.182 ms, 2.31 ms] 774.044 µs (52.6%)
profiling 2.066 ms [2.013 ms, 2.119 ms] 594.101 µs (40.4%)
tracing 2.016 ms [1.966 ms, 2.065 ms] 543.417 µs (36.9%)
Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.53.0-SNAPSHOT~c923194dea, baseline=1.54.0-SNAPSHOT~cb08250ba1
    dateFormat X
    axisFormat %s
section baseline
no_agent (14.95 s) : 14950000, 14950000
.   : milestone, 14950000,
appsec (14.913 s) : 14913000, 14913000
.   : milestone, 14913000,
iast (18.504 s) : 18504000, 18504000
.   : milestone, 18504000,
iast_GLOBAL (17.744 s) : 17744000, 17744000
.   : milestone, 17744000,
profiling (15.537 s) : 15537000, 15537000
.   : milestone, 15537000,
tracing (14.887 s) : 14887000, 14887000
.   : milestone, 14887000,
section candidate
no_agent (15.543 s) : 15543000, 15543000
.   : milestone, 15543000,
appsec (14.868 s) : 14868000, 14868000
.   : milestone, 14868000,
iast (18.8 s) : 18800000, 18800000
.   : milestone, 18800000,
iast_GLOBAL (17.985 s) : 17985000, 17985000
.   : milestone, 17985000,
profiling (15.896 s) : 15896000, 15896000
.   : milestone, 15896000,
tracing (14.774 s) : 14774000, 14774000
.   : milestone, 14774000,
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 14.95 s [14.95 s, 14.95 s] -
appsec 14.913 s [14.913 s, 14.913 s] -37.0 ms (-0.2%)
iast 18.504 s [18.504 s, 18.504 s] 3.554 s (23.8%)
iast_GLOBAL 17.744 s [17.744 s, 17.744 s] 2.794 s (18.7%)
profiling 15.537 s [15.537 s, 15.537 s] 587.0 ms (3.9%)
tracing 14.887 s [14.887 s, 14.887 s] -63.0 ms (-0.4%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.543 s [15.543 s, 15.543 s] -
appsec 14.868 s [14.868 s, 14.868 s] -675.0 ms (-4.3%)
iast 18.8 s [18.8 s, 18.8 s] 3.257 s (21.0%)
iast_GLOBAL 17.985 s [17.985 s, 17.985 s] 2.442 s (15.7%)
profiling 15.896 s [15.896 s, 15.896 s] 353.0 ms (2.3%)
tracing 14.774 s [14.774 s, 14.774 s] -769.0 ms (-4.9%)

@PerfectSlayer PerfectSlayer added comp: core Tracer core and removed comp: platform Platform labels Aug 29, 2025
Contributor

@PerfectSlayer PerfectSlayer left a comment

I haven't checked the cache implementation yet, but posting some quick feedback first:

🎯 suggestion: I don't have the full context, but should we apply it to the dictionary mapper for V05 too?

dougqh added 6 commits August 29, 2025 10:29
- implementing review feedback
- experimenting with exact hash based marking scheme
- fixed issue with not updating entry after hit in simple cache
- re-enabling cache by default for benchmarking
- spotless
- altered marking strategy to use a bloom filter of previously requested values, once a new entry hits the filter the filter is reset to zero
- tweaking cache sizes
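
The bloom-filter marking mentioned in the commits above could look roughly like the sketch below: a request sets a couple of bits derived from the hash, and a request that finds all of its bits already set is treated as a repeat, after which the filter is cleared. The 64-bit filter width, the two bit positions, and the names are assumptions for illustration, not the PR's implementation.

```java
/** Illustrative only: a hypothetical bloom-filter style "seen before" marker. */
final class SketchBloomMarker {
  private long bits; // 64-bit filter; the width is an assumption

  /**
   * Returns true when the hash was probably requested before (false positives are possible).
   * On such a hit the filter is reset to zero, as described in the commit message.
   */
  boolean markOrHit(int hash) {
    // Two bit positions taken from different parts of the hash.
    long mask = (1L << (hash & 63)) | (1L << ((hash >>> 6) & 63));
    if ((bits & mask) == mask) {
      bits = 0L;
      return true;
    }
    bits |= mask;
    return false;
  }
}
```

A false positive here only means a CacheEntry gets created one request early, so the filter can stay tiny. Resetting on a hit keeps it from saturating over time.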
@dougqh
Contributor Author

dougqh commented Sep 2, 2025

I haven't checked the cache implementation yet, but posting some quick feedback first:

🎯 suggestion: I don't have the full context, but should we apply it to the dictionary mapper for V05 too?

Yes, I think we should. To do that, I'm going to have to make some bigger changes to v0.5, so I might leave that for another PR.

@bantonsson
Contributor

bantonsson commented Sep 2, 2025

Just a question out of the blue. Couldn't the simple cache reuse the FixedSizeCache with UTF8ByteString?

Yeah, I think that's a possibility. I honestly haven't quite determined if the protection against eagerly creating CacheEntry-s is essential for tag names.

I'd started with a different approach where I generated the UTF8 representations of the known tag names first, but I shelved that because it is a bit hard to incorporate / maintain.

- clean-up based on review feedback
- making naming consistent - some vestiges of prior names for second level cache updated
- tweaked generational cache to check tenured entries first
-
- switching generational cache to use different probe lengths for eden vs tenured generation
- these settings are neutral or better throughput wise for petclinic for 64m, 80m, 96m, and 128m heaps
Contributor

@bric3 bric3 left a comment

First wave of comments, I haven't looked at GenerationalUtf8Cache yet

public final class SimpleUtf8Cache implements EncodingCache {
private static final int MAX_PROBES = 4;

private final int SIZE = 128;
Contributor

question: Why 128 in particular, and not 256?

IIUC this needs to be a power of two so it can be used as a bitmask for the modulo?

Contributor Author

Yes, it needs to be a power of 2 so that the bitmask works for the bucket calculation.

As for why 128? I mostly just played with cache sizes, probe lengths, etc to come up with something that was good at multiple heap sizes. These values gave nice gains in throughput at higher heap sizes and were neutral on throughput at lower heap sizes.

Or to put it more succinctly, I usually aim to make the cache as small as I can without compromising the hit rate. Admittedly, there is a danger of overfitting to the benchmark load.
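
For readers unfamiliar with the trick, the sketch below shows why the size has to be a power of two: `hash & (size - 1)` then equals a non-negative modulo, including for negative hash codes, and wrapping to the next probe position is the same mask. The class and method names are hypothetical and are not the `initialBucketIndex` methods from the PR.

```java
/** Illustrative only: bucket selection with a power-of-two bitmask. */
final class SketchBuckets {
  /** Equivalent to Math.floorMod(hash, size), but only when size is a power of two. */
  static int bucketIndex(int hash, int size) {
    return hash & (size - 1);
  }

  /** Next position for linear probing, wrapping around at the end of the table. */
  static int nextIndex(int index, int size) {
    return (index + 1) & (size - 1);
  }
}
```

With `size = 128` the mask is `0x7F`, so a hash of `-1` lands in bucket 127 rather than producing a negative index the way `hash % size` would.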

Contributor

@bric3 bric3 Sep 3, 2025

Also might be worth adding a comment pointing to where this modulo happens (initialBucketIndex methods)

Comment on lines +107 to +108
String tag = nextTag();
String value = nextValue(tag);
Contributor

question: Out of curiosity, would it be better to generate a tag / value dataset outside the benchmark methods? That might allow datasets with a wider range of values.

I believe some customers have wide-character values (e.g. in Korean) in their tags. Would it be useful to have a benchmark for that? Could the gains be more pronounced in this case?
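
As background for the wide-character question, the small sketch below only shows how the encoded size differs: ASCII characters take one byte each in UTF-8, while Hangul syllables take three. It does not measure any caching gain, and the helper name is hypothetical. Unicode escapes are used so the example does not depend on source-file encoding.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative only: compares UTF-8 encoded sizes of ASCII and Hangul strings. */
final class SketchUtf8Lengths {
  static int utf8Length(String value) {
    return value.getBytes(StandardCharsets.UTF_8).length;
  }

  // Three Hangul syllables (U+C0AC, U+C6A9, U+C790), written as escapes.
  static final String HANGUL_SAMPLE = "\uC0AC\uC6A9\uC790";
}
```

Here `utf8Length("user")` is 4, while `utf8Length(HANGUL_SAMPLE)` is 9 for a three-character string, so each uncached encode of such a value allocates a proportionally larger byte array.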

Contributor Author

Yeah, probably. I need to experiment some more to figure out what's possible with JMH.
The x_baseline methods exist so that I can do a comparison to the "same" logic without the encoding.

Comment on lines +225 to +226
newEntry.hit(lookupTimeMs);
newEntry.hit(lookupTimeMs);
Contributor

question: Won't it amplify hits anytime this entry is "accessed" after the first use (which is a mark) ?

Contributor Author

Yes, in a sense, this records an incorrect first access time, but only the last access time is stored.
Also, I'm not tracking time precisely because I don't want to constantly call System#currentTimeMillis.
Instead, I just update the access time once each time a payload is constructed, so times are somewhat deliberately imprecise.
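
The coarse timestamp idea can be sketched as a value that is refreshed once per payload and then read cheaply on every lookup. The class below is an assumption for illustration, not code from the PR. The time source is injected as a `LongSupplier` only so the behaviour is easy to verify; in practice it would be `System::currentTimeMillis`.

```java
import java.util.function.LongSupplier;

/** Illustrative only: a clock that is refreshed once per payload instead of on every lookup. */
final class SketchCoarseClock {
  private final LongSupplier source;
  private long cachedMs;

  SketchCoarseClock(LongSupplier source) {
    this.source = source;
  }

  /** Call once when construction of a payload starts. */
  void refresh() {
    cachedMs = source.getAsLong();
  }

  /** Cheap read used for every cache hit; deliberately imprecise. */
  long lastRefreshMs() {
    return cachedMs;
  }
}
```

All entries touched while building one payload end up with the same timestamp, which is coarse but still enough to order entries by payload for LRU purposes.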

Should be using adjHash not value.hashCode
- more explanatory comments
- more naming updates: local -> eden
- adding protections against storing large strings in cache
- fixed errant use of CacheEntry.utf8(String) instead of entry.utf8()
- removed unnecessary lookupTimeMs variable
Added tests to verify that big strings are not cached
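
The large-string protection listed in the commits above might be as simple as bypassing the cache above a length limit, as in the sketch below. The limit of 256 characters, the class name, and the use of a `HashMap` as a stand-in for the real cache are all assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: strings above a length limit are encoded but never stored. */
final class SketchSizeGuardedCache {
  static final int MAX_CACHED_LENGTH = 256; // hypothetical limit

  // A HashMap stands in for the real cache to keep the sketch short.
  private final Map<String, byte[]> cache = new HashMap<>();

  byte[] getUtf8(String value) {
    if (value.length() > MAX_CACHED_LENGTH) {
      // Too large to be worth retaining: encode directly and bypass the cache.
      return value.getBytes(StandardCharsets.UTF_8);
    }
    return cache.computeIfAbsent(value, v -> v.getBytes(StandardCharsets.UTF_8));
  }

  int size() {
    return cache.size();
  }
}
```

The guard bounds the memory a cache can pin: without it, a few very large tag values could hold far more heap than the cache saves in allocation.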
Contributor

@bric3 bric3 left a comment

Pre-approving

dougqh and others added 3 commits September 4, 2025 09:06
…/Utf8Benchmark.java

Co-authored-by: Brice Dutheil <brice.dutheil@gmail.com>
- added ability to configure cache size - for both tag names & values
- factored shared code into Caching static utility class
- added tests for Caching class & size determination logic
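
Since the bucket calculation relies on a power-of-two size, a configurable capacity presumably has to be normalized somewhere. The sketch below rounds a requested size up to the next power of two. This is a guess at what such size determination could look like, not the logic in the PR's Caching class; the class and method names are hypothetical.

```java
/** Illustrative only: normalizes a requested cache size to a power of two. */
final class SketchCacheSizing {
  private static final int MAX_SIZE = 1 << 30; // largest power of two that fits in an int

  static int powerOfTwoAtLeast(int requested) {
    if (requested < 1 || requested > MAX_SIZE) {
      throw new IllegalArgumentException("size out of range: " + requested);
    }
    int highest = Integer.highestOneBit(requested);
    // Already a power of two: keep it. Otherwise round up to the next one.
    return highest == requested ? requested : highest << 1;
  }
}
```

For example, a requested size of 100 becomes 128 and a requested size of 129 becomes 256, so the `size - 1` bitmask stays valid whatever value is configured.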
Contributor

@bric3 bric3 left a comment

LGTM, the configurable capacity is a nice touch !

@bric3 bric3 force-pushed the dougqh/utf8-caching branch from d6ceffb to bd17af9 Compare September 5, 2025 13:00
@dougqh dougqh enabled auto-merge (squash) September 5, 2025 13:46
@dougqh dougqh merged commit 4abe3ff into master Sep 5, 2025
503 checks passed
@dougqh dougqh deleted the dougqh/utf8-caching branch September 5, 2025 15:08
@github-actions github-actions bot added this to the 1.54.0 milestone Sep 5, 2025

Labels

comp: core Tracer core tag: performance Performance related changes type: enhancement Enhancements and improvements

Projects

None yet

6 participants