UTF8 caching for v0.4 #9434

dougqh · 2025-08-28T17:09:50Z

What Does This Do

This change adds UTF-8 encoding caching to optimize v0.4 payload construction. (Support for v0.5 will likely be done separately, since it requires more significant changes.)

String#getBytes is already highly optimized, so these caches actually perform worse throughput-wise than uncached encoding. However, the caches reduce the allocation caused by UTF-8 encoding, which leaves more headroom for the host application and in turn allows its throughput to improve.

For tag names, a "simple" cache is used. The simple cache is a single-level cache that uses hashing combined with linear probing. To avoid cache churn and unnecessary allocation of a CacheEntry, the simple cache uses a first-request marking scheme that typically avoids creating a CacheEntry for values that are requested only once. Eviction from the simple cache follows an LFU policy.
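The lookup path described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the class name `SimpleCacheSketch`, the per-bucket `marks` array, and the hash mixing are assumptions, and the sizes are simply borrowed from the constants quoted later in the review.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative sketch only; names and layout are assumptions, not the PR's SimpleUtf8Cache. */
final class SimpleCacheSketch {
  private static final int SIZE = 128;        // power of two, so MASK works as a cheap modulo
  private static final int MASK = SIZE - 1;
  private static final int MAX_PROBES = 4;    // length of the linear-probe window

  private static final class Entry {
    final String value;
    final byte[] utf8;
    int hits = 1;

    Entry(String value) {
      this.value = value;
      this.utf8 = value.getBytes(StandardCharsets.UTF_8);
    }
  }

  private final Entry[] entries = new Entry[SIZE];
  // Per-bucket mark: hash of the last uncached value requested at that bucket (0 = no mark).
  private final int[] marks = new int[SIZE];

  byte[] getUtf8(String value) {
    int hash = value.hashCode();
    hash ^= hash >>> 16;                      // mix high bits into the low bits used by MASK
    int start = hash & MASK;

    int victim = start;
    int victimHits = Integer.MAX_VALUE;
    for (int probe = 0; probe < MAX_PROBES; probe++) {
      int index = (start + probe) & MASK;
      Entry entry = entries[index];
      if (entry == null) {
        if (victimHits != 0) {                // remember the first empty slot
          victim = index;
          victimHits = 0;
        }
      } else if (entry.value.equals(value)) {
        entry.hits++;
        return entry.utf8;                    // hit: no allocation
      } else if (entry.hits < victimHits) {
        victim = index;                       // least frequently used entry seen so far
        victimHits = entry.hits;
      }
    }

    int mark = (hash != 0) ? hash : 1;        // keep 0 reserved for "no mark"
    if (marks[start] != mark) {
      marks[start] = mark;                    // first request: leave a mark, create no Entry
      return value.getBytes(StandardCharsets.UTF_8);
    }

    marks[start] = 0;
    Entry created = new Entry(value);         // repeat request: now worth caching
    entries[victim] = created;                // fills an empty slot or evicts the LFU entry
    return created.utf8;
  }
}
```

In this sketch a value requested once costs only an `int` write to `marks`; the `Entry` and its retained `byte[]` are created on the second request, and every later request returns the same array.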

For tag values, a more complicated generational cache is used. The generational cache combines the delayed CacheEntry creation logic of the simple cache with a 2nd level for resilience.

Frequently used entries are "promoted" to the higher-level cache. The 1st level of the generational cache uses an LFU eviction policy. The 2nd level of the generational cache uses an LRU eviction policy.
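A minimal sketch of the two-level idea, under stated assumptions: the class name `GenerationalCacheSketch`, the promotion threshold, and the table sizes are all hypothetical, and the delayed entry creation is left out to keep the example short. The names "eden" and "tenured" follow the commit notes in this PR; everything else is illustrative.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative two-level sketch; thresholds and sizes are assumptions, not the PR's values. */
final class GenerationalCacheSketch {
  private static final int SIZE = 64;            // power of two per level (assumed)
  private static final int MASK = SIZE - 1;
  private static final int MAX_PROBES = 4;       // assumed; same probe window for both levels
  private static final int PROMOTION_HITS = 4;   // assumed promotion threshold

  private static final class Entry {
    final String value;
    final byte[] utf8;
    int hits = 1;
    long lastUsedMs;

    Entry(String value, long nowMs) {
      this.value = value;
      this.utf8 = value.getBytes(StandardCharsets.UTF_8);
      this.lastUsedMs = nowMs;
    }
  }

  private final Entry[] eden = new Entry[SIZE];      // 1st level, LFU eviction
  private final Entry[] tenured = new Entry[SIZE];   // 2nd level, LRU eviction

  byte[] getUtf8(String value, long nowMs) {
    int start = bucket(value);

    int index = indexOf(tenured, start, value);      // check promoted entries first
    if (index >= 0) {
      tenured[index].lastUsedMs = nowMs;
      return tenured[index].utf8;
    }

    index = indexOf(eden, start, value);
    if (index >= 0) {
      Entry entry = eden[index];
      entry.hits++;
      entry.lastUsedMs = nowMs;
      if (entry.hits >= PROMOTION_HITS) {            // frequently used: promote
        eden[index] = null;
        tenured[victim(tenured, start, false)] = entry;
      }
      return entry.utf8;
    }

    Entry created = new Entry(value, nowMs);
    eden[victim(eden, start, true)] = created;
    return created.utf8;
  }

  boolean isTenured(String value) {
    return indexOf(tenured, bucket(value), value) >= 0;
  }

  private static int bucket(String value) {
    int hash = value.hashCode();
    return (hash ^ (hash >>> 16)) & MASK;
  }

  private static int indexOf(Entry[] table, int start, String value) {
    for (int probe = 0; probe < MAX_PROBES; probe++) {
      int index = (start + probe) & MASK;
      Entry entry = table[index];
      if (entry != null && entry.value.equals(value)) return index;
    }
    return -1;
  }

  // Picks an empty slot if the probe window has one, else the LFU (eden) or LRU (tenured) entry.
  private static int victim(Entry[] table, int start, boolean leastFrequent) {
    int victim = start;
    long best = Long.MAX_VALUE;
    for (int probe = 0; probe < MAX_PROBES; probe++) {
      int index = (start + probe) & MASK;
      Entry entry = table[index];
      if (entry == null) return index;
      long score = leastFrequent ? entry.hits : entry.lastUsedMs;
      if (score < best) {
        best = score;
        victim = index;
      }
    }
    return victim;
  }
}
```

The intent of the split, as far as the description goes, is that a burst of one-off values can only churn the first level, while entries that have proven themselves sit in the second level and are displaced only by other promoted entries.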

For the tag value use case, the generational policy provides a 2x increase in hit rate over the simple cache.

Motivation

This change reduces the memory allocation overhead caused by UTF-8 encoding.
In systems with ample heap, this change provides a 25-75% improvement in throughput by reducing GC cycles.
In systems with limited heap, this change is net neutral.


github-actions · 2025-08-28T17:09:59Z

Hi! 👋 Thanks for your pull request! 🎉

To help us review it, please make sure to:

Add at least one type, and one component or instrumentation label to the pull request

If you need help, please check our contributing guidelines.

datadog-datadog-prod-us1 · 2025-08-28T17:31:52Z

🎯 Code Coverage
• Patch Coverage: 76.85%
• Total Coverage: 57.65% (-0.06%)

View detailed report

This comment will be updated automatically if new data arrives.

🔗 Commit SHA: c923194

pr-commenter · 2025-08-28T18:19:49Z

Benchmarks

Startup

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	dougqh/utf8-caching
git_commit_date	1757081570	1757081757
git_commit_sha	`cb08250`	`c923194`
release_version	1.54.0-SNAPSHOT~cb08250ba1	1.53.0-SNAPSHOT~c923194dea

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1757083505	1757083505
ci_job_id	1115824912	1115824912
ci_pipeline_id	75663079	75663079
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-3-ey9gmeag 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-3-ey9gmeag 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module	Agent	Agent
parent	None	None

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 48 metrics, 11 unstable metrics.

Startup time reports for insecure-bank

gantt
    title insecure-bank - global startup overhead: candidate=1.53.0-SNAPSHOT~c923194dea, baseline=1.54.0-SNAPSHOT~cb08250ba1

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.049 s) : 0, 1049365
Total [baseline] (8.64 s) : 0, 8640041
Agent [candidate] (1.049 s) : 0, 1049146
Total [candidate] (8.622 s) : 0, 8622132
section iast
Agent [baseline] (1.181 s) : 0, 1180887
Total [baseline] (9.376 s) : 0, 9376388
Agent [candidate] (1.179 s) : 0, 1179162
Total [candidate] (9.345 s) : 0, 9344784

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.049 s	-
Agent	iast	1.181 s	131.522 ms (12.5%)
Total	tracing	8.64 s	-
Total	iast	9.376 s	736.347 ms (8.5%)

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.049 s	-
Agent	iast	1.179 s	130.016 ms (12.4%)
Total	tracing	8.622 s	-
Total	iast	9.345 s	722.652 ms (8.4%)

gantt
    title insecure-bank - break down per module: candidate=1.53.0-SNAPSHOT~c923194dea, baseline=1.54.0-SNAPSHOT~cb08250ba1

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.472 ms) : 0, 1472
crashtracking [candidate] (1.455 ms) : 0, 1455
BytebuddyAgent [baseline] (733.392 ms) : 0, 733392
BytebuddyAgent [candidate] (733.688 ms) : 0, 733688
GlobalTracer [baseline] (242.546 ms) : 0, 242546
GlobalTracer [candidate] (242.79 ms) : 0, 242790
AppSec [baseline] (30.168 ms) : 0, 30168
AppSec [candidate] (30.342 ms) : 0, 30342
Debugger [baseline] (6.12 ms) : 0, 6120
Debugger [candidate] (6.115 ms) : 0, 6115
Remote Config [baseline] (693.257 µs) : 0, 693
Remote Config [candidate] (683.187 µs) : 0, 683
Telemetry [baseline] (13.822 ms) : 0, 13822
Telemetry [candidate] (12.943 ms) : 0, 12943
section iast
crashtracking [baseline] (1.468 ms) : 0, 1468
crashtracking [candidate] (1.46 ms) : 0, 1460
BytebuddyAgent [baseline] (852.548 ms) : 0, 852548
BytebuddyAgent [candidate] (851.452 ms) : 0, 851452
GlobalTracer [baseline] (233.84 ms) : 0, 233840
GlobalTracer [candidate] (232.649 ms) : 0, 232649
AppSec [baseline] (26.277 ms) : 0, 26277
AppSec [candidate] (28.41 ms) : 0, 28410
Debugger [baseline] (7.589 ms) : 0, 7589
Debugger [candidate] (5.833 ms) : 0, 5833
Remote Config [baseline] (606.262 µs) : 0, 606
Remote Config [candidate] (601.351 µs) : 0, 601
Telemetry [baseline] (9.034 ms) : 0, 9034
Telemetry [candidate] (8.31 ms) : 0, 8310
IAST [baseline] (28.455 ms) : 0, 28455
IAST [candidate] (29.443 ms) : 0, 29443

Startup time reports for petclinic

gantt
    title petclinic - global startup overhead: candidate=1.53.0-SNAPSHOT~c923194dea, baseline=1.54.0-SNAPSHOT~cb08250ba1

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.053 s) : 0, 1052555
Total [baseline] (10.66 s) : 0, 10659586
Agent [candidate] (1.055 s) : 0, 1054881
Total [candidate] (10.779 s) : 0, 10779448
section appsec
Agent [baseline] (1.224 s) : 0, 1223817
Total [baseline] (10.825 s) : 0, 10824728
Agent [candidate] (1.226 s) : 0, 1225537
Total [candidate] (10.835 s) : 0, 10834796
section iast
Agent [baseline] (1.18 s) : 0, 1179517
Total [baseline] (10.926 s) : 0, 10925848
Agent [candidate] (1.197 s) : 0, 1196517
Total [candidate] (10.956 s) : 0, 10956493
section profiling
Agent [baseline] (1.2 s) : 0, 1200377
Total [baseline] (10.92 s) : 0, 10920058
Agent [candidate] (1.197 s) : 0, 1197289
Total [candidate] (10.957 s) : 0, 10956898

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.053 s	-
Agent	appsec	1.224 s	171.262 ms (16.3%)
Agent	iast	1.18 s	126.962 ms (12.1%)
Agent	profiling	1.2 s	147.822 ms (14.0%)
Total	tracing	10.66 s	-
Total	appsec	10.825 s	165.142 ms (1.5%)
Total	iast	10.926 s	266.262 ms (2.5%)
Total	profiling	10.92 s	260.472 ms (2.4%)

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.055 s	-
Agent	appsec	1.226 s	170.656 ms (16.2%)
Agent	iast	1.197 s	141.636 ms (13.4%)
Agent	profiling	1.197 s	142.408 ms (13.5%)
Total	tracing	10.779 s	-
Total	appsec	10.835 s	55.348 ms (0.5%)
Total	iast	10.956 s	177.045 ms (1.6%)
Total	profiling	10.957 s	177.45 ms (1.6%)

gantt
    title petclinic - break down per module: candidate=1.53.0-SNAPSHOT~c923194dea, baseline=1.54.0-SNAPSHOT~cb08250ba1

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.489 ms) : 0, 1489
crashtracking [candidate] (1.466 ms) : 0, 1466
BytebuddyAgent [baseline] (736.799 ms) : 0, 736799
BytebuddyAgent [candidate] (736.685 ms) : 0, 736685
GlobalTracer [baseline] (243.911 ms) : 0, 243911
GlobalTracer [candidate] (243.963 ms) : 0, 243963
AppSec [baseline] (30.513 ms) : 0, 30513
AppSec [candidate] (30.2 ms) : 0, 30200
Debugger [baseline] (6.091 ms) : 0, 6091
Debugger [candidate] (6.099 ms) : 0, 6099
Remote Config [baseline] (698.916 µs) : 0, 699
Remote Config [candidate] (684.464 µs) : 0, 684
Telemetry [baseline] (11.829 ms) : 0, 11829
Telemetry [candidate] (14.567 ms) : 0, 14567
section appsec
crashtracking [baseline] (1.478 ms) : 0, 1478
crashtracking [candidate] (1.46 ms) : 0, 1460
BytebuddyAgent [baseline] (755.039 ms) : 0, 755039
BytebuddyAgent [candidate] (756.641 ms) : 0, 756641
GlobalTracer [baseline] (235.291 ms) : 0, 235291
GlobalTracer [candidate] (235.871 ms) : 0, 235871
AppSec [baseline] (168.4 ms) : 0, 168400
AppSec [candidate] (170.346 ms) : 0, 170346
Debugger [baseline] (8.913 ms) : 0, 8913
Debugger [candidate] (5.788 ms) : 0, 5788
Remote Config [baseline] (627.226 µs) : 0, 627
Remote Config [candidate] (620.475 µs) : 0, 620
Telemetry [baseline] (9.29 ms) : 0, 9290
Telemetry [candidate] (10.031 ms) : 0, 10031
IAST [baseline] (23.645 ms) : 0, 23645
IAST [candidate] (23.588 ms) : 0, 23588
section iast
crashtracking [baseline] (1.472 ms) : 0, 1472
crashtracking [candidate] (1.478 ms) : 0, 1478
BytebuddyAgent [baseline] (851.0 ms) : 0, 851000
BytebuddyAgent [candidate] (865.571 ms) : 0, 865571
GlobalTracer [baseline] (233.08 ms) : 0, 233080
GlobalTracer [candidate] (235.541 ms) : 0, 235541
AppSec [baseline] (26.27 ms) : 0, 26270
AppSec [candidate] (27.605 ms) : 0, 27605
Debugger [baseline] (6.671 ms) : 0, 6671
Debugger [candidate] (6.67 ms) : 0, 6670
Remote Config [baseline] (608.66 µs) : 0, 609
Remote Config [candidate] (599.662 µs) : 0, 600
Telemetry [baseline] (8.345 ms) : 0, 8345
Telemetry [candidate] (8.351 ms) : 0, 8351
IAST [baseline] (30.96 ms) : 0, 30960
IAST [candidate] (29.387 ms) : 0, 29387
section profiling
crashtracking [baseline] (1.434 ms) : 0, 1434
crashtracking [candidate] (1.431 ms) : 0, 1431
BytebuddyAgent [baseline] (762.988 ms) : 0, 762988
BytebuddyAgent [candidate] (761.431 ms) : 0, 761431
GlobalTracer [baseline] (223.142 ms) : 0, 223142
GlobalTracer [candidate] (222.721 ms) : 0, 222721
AppSec [baseline] (30.765 ms) : 0, 30765
AppSec [candidate] (30.388 ms) : 0, 30388
Debugger [baseline] (6.232 ms) : 0, 6232
Debugger [candidate] (6.231 ms) : 0, 6231
Remote Config [baseline] (732.382 µs) : 0, 732
Remote Config [candidate] (690.823 µs) : 0, 691
Telemetry [baseline] (16.553 ms) : 0, 16553
Telemetry [candidate] (16.311 ms) : 0, 16311
ProfilingAgent [baseline] (108.002 ms) : 0, 108002
ProfilingAgent [candidate] (107.603 ms) : 0, 107603
Profiling [baseline] (108.665 ms) : 0, 108665
Profiling [candidate] (108.253 ms) : 0, 108253

Load

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	dougqh/utf8-caching
git_commit_date	1757081570	1757081757
git_commit_sha	`cb08250`	`c923194`
release_version	1.54.0-SNAPSHOT~cb08250ba1	1.53.0-SNAPSHOT~c923194dea

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1757083179	1757083179
ci_job_id	1115824913	1115824913
ci_pipeline_id	75663079	75663079
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-4-h9vumr0w 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-4-h9vumr0w 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 2 performance improvements and 1 performance regressions! Performance is the same for 9 metrics, 12 unstable metrics.

scenario	Δ mean http_req_duration	Δ mean throughput	candidate mean http_req_duration	candidate mean throughput	baseline mean http_req_duration	baseline mean throughput
scenario:load:insecure-bank:no_agent:high_load	worse [+291.436µs; +401.732µs] or [+7.070%; +9.746%]	unstable [-199.874op/s; +31.124op/s] or [-17.978%; +2.800%]	4.469ms	1027.406op/s	4.122ms	1111.781op/s
scenario:load:petclinic:appsec:high_load	better [-2.049ms; -1.145ms] or [-4.234%; -2.367%]	unstable [-3.597op/s; +10.172op/s] or [-3.719%; +10.518%]	46.792ms	100.000op/s	48.389ms	96.713op/s
scenario:load:petclinic:profiling:high_load	better [-1.927ms; -0.982ms] or [-3.988%; -2.032%]	unstable [-4.060op/s; +10.085op/s] or [-4.193%; +10.415%]	46.867ms	99.850op/s	48.322ms	96.838op/s

Request duration reports for petclinic

gantt
    title petclinic - request duration [CI 0.99] : candidate=1.53.0-SNAPSHOT~c923194dea, baseline=1.54.0-SNAPSHOT~cb08250ba1
    dateFormat X
    axisFormat %s
section baseline
no_agent (37.777 ms) : 37477, 38076
.   : milestone, 37777,
appsec (48.389 ms) : 47967, 48811
.   : milestone, 48389,
code_origins (44.699 ms) : 44316, 45081
.   : milestone, 44699,
iast (44.868 ms) : 44478, 45259
.   : milestone, 44868,
profiling (48.322 ms) : 47868, 48775
.   : milestone, 48322,
tracing (43.549 ms) : 43192, 43905
.   : milestone, 43549,
section candidate
no_agent (36.958 ms) : 36654, 37262
.   : milestone, 36958,
appsec (46.792 ms) : 46374, 47209
.   : milestone, 46792,
code_origins (45.682 ms) : 45273, 46091
.   : milestone, 45682,
iast (44.765 ms) : 44382, 45148
.   : milestone, 44765,
profiling (46.867 ms) : 46443, 47291
.   : milestone, 46867,
tracing (44.16 ms) : 43793, 44527
.   : milestone, 44160,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	37.777 ms [37.477 ms, 38.076 ms]	-
appsec	48.389 ms [47.967 ms, 48.811 ms]	10.612 ms (28.1%)
code_origins	44.699 ms [44.316 ms, 45.081 ms]	6.922 ms (18.3%)
iast	44.868 ms [44.478 ms, 45.259 ms]	7.092 ms (18.8%)
profiling	48.322 ms [47.868 ms, 48.775 ms]	10.545 ms (27.9%)
tracing	43.549 ms [43.192 ms, 43.905 ms]	5.772 ms (15.3%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	36.958 ms [36.654 ms, 37.262 ms]	-
appsec	46.792 ms [46.374 ms, 47.209 ms]	9.834 ms (26.6%)
code_origins	45.682 ms [45.273 ms, 46.091 ms]	8.724 ms (23.6%)
iast	44.765 ms [44.382 ms, 45.148 ms]	7.807 ms (21.1%)
profiling	46.867 ms [46.443 ms, 47.291 ms]	9.909 ms (26.8%)
tracing	44.16 ms [43.793 ms, 44.527 ms]	7.202 ms (19.5%)

Request duration reports for insecure-bank

gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.53.0-SNAPSHOT~c923194dea, baseline=1.54.0-SNAPSHOT~cb08250ba1
    dateFormat X
    axisFormat %s
section baseline
no_agent (4.122 ms) : 4071, 4173
.   : milestone, 4122,
iast (9.243 ms) : 9087, 9399
.   : milestone, 9243,
iast_FULL (13.887 ms) : 13614, 14161
.   : milestone, 13887,
iast_GLOBAL (10.557 ms) : 10371, 10744
.   : milestone, 10557,
profiling (8.769 ms) : 8628, 8911
.   : milestone, 8769,
tracing (7.515 ms) : 7407, 7623
.   : milestone, 7515,
section candidate
no_agent (4.469 ms) : 4417, 4521
.   : milestone, 4469,
iast (9.578 ms) : 9416, 9741
.   : milestone, 9578,
iast_FULL (14.351 ms) : 14064, 14638
.   : milestone, 14351,
iast_GLOBAL (10.321 ms) : 10132, 10510
.   : milestone, 10321,
profiling (8.893 ms) : 8747, 9038
.   : milestone, 8893,
tracing (7.672 ms) : 7562, 7781
.   : milestone, 7672,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	4.122 ms [4.071 ms, 4.173 ms]	-
iast	9.243 ms [9.087 ms, 9.399 ms]	5.121 ms (124.2%)
iast_FULL	13.887 ms [13.614 ms, 14.161 ms]	9.765 ms (236.9%)
iast_GLOBAL	10.557 ms [10.371 ms, 10.744 ms]	6.435 ms (156.1%)
profiling	8.769 ms [8.628 ms, 8.911 ms]	4.647 ms (112.7%)
tracing	7.515 ms [7.407 ms, 7.623 ms]	3.393 ms (82.3%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	4.469 ms [4.417 ms, 4.521 ms]	-
iast	9.578 ms [9.416 ms, 9.741 ms]	5.11 ms (114.3%)
iast_FULL	14.351 ms [14.064 ms, 14.638 ms]	9.883 ms (221.1%)
iast_GLOBAL	10.321 ms [10.132 ms, 10.51 ms]	5.852 ms (131.0%)
profiling	8.893 ms [8.747 ms, 9.038 ms]	4.424 ms (99.0%)
tracing	7.672 ms [7.562 ms, 7.781 ms]	3.203 ms (71.7%)

Dacapo

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	dougqh/utf8-caching
git_commit_date	1757081570	1757081757
git_commit_sha	`cb08250`	`c923194`
release_version	1.54.0-SNAPSHOT~cb08250ba1	1.53.0-SNAPSHOT~c923194dea

See matching parameters

	Baseline	Candidate
application	biojava	biojava
ci_job_date	1757083706	1757083706
ci_job_id	1115824914	1115824914
ci_pipeline_id	75663079	75663079
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-1-rjlz43am 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-1-rjlz43am 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.

Execution time for tomcat

gantt
    title tomcat - execution time [CI 0.99] : candidate=1.53.0-SNAPSHOT~c923194dea, baseline=1.54.0-SNAPSHOT~cb08250ba1
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.473 ms) : 1461, 1484
.   : milestone, 1473,
appsec (3.621 ms) : 3406, 3836
.   : milestone, 3621,
iast (2.2 ms) : 2137, 2264
.   : milestone, 2200,
iast_GLOBAL (2.242 ms) : 2178, 2306
.   : milestone, 2242,
profiling (2.054 ms) : 2003, 2106
.   : milestone, 2054,
tracing (2.037 ms) : 1987, 2087
.   : milestone, 2037,
section candidate
no_agent (1.472 ms) : 1460, 1484
.   : milestone, 1472,
appsec (3.608 ms) : 3395, 3821
.   : milestone, 3608,
iast (2.202 ms) : 2139, 2266
.   : milestone, 2202,
iast_GLOBAL (2.246 ms) : 2182, 2310
.   : milestone, 2246,
profiling (2.066 ms) : 2013, 2119
.   : milestone, 2066,
tracing (2.016 ms) : 1966, 2065
.   : milestone, 2016,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.473 ms [1.461 ms, 1.484 ms]	-
appsec	3.621 ms [3.406 ms, 3.836 ms]	2.148 ms (145.9%)
iast	2.2 ms [2.137 ms, 2.264 ms]	727.472 µs (49.4%)
iast_GLOBAL	2.242 ms [2.178 ms, 2.306 ms]	768.981 µs (52.2%)
profiling	2.054 ms [2.003 ms, 2.106 ms]	581.549 µs (39.5%)
tracing	2.037 ms [1.987 ms, 2.087 ms]	563.985 µs (38.3%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.472 ms [1.46 ms, 1.484 ms]	-
appsec	3.608 ms [3.395 ms, 3.821 ms]	2.136 ms (145.1%)
iast	2.202 ms [2.139 ms, 2.266 ms]	730.271 µs (49.6%)
iast_GLOBAL	2.246 ms [2.182 ms, 2.31 ms]	774.044 µs (52.6%)
profiling	2.066 ms [2.013 ms, 2.119 ms]	594.101 µs (40.4%)
tracing	2.016 ms [1.966 ms, 2.065 ms]	543.417 µs (36.9%)

Execution time for biojava

gantt
    title biojava - execution time [CI 0.99] : candidate=1.53.0-SNAPSHOT~c923194dea, baseline=1.54.0-SNAPSHOT~cb08250ba1
    dateFormat X
    axisFormat %s
section baseline
no_agent (14.95 s) : 14950000, 14950000
.   : milestone, 14950000,
appsec (14.913 s) : 14913000, 14913000
.   : milestone, 14913000,
iast (18.504 s) : 18504000, 18504000
.   : milestone, 18504000,
iast_GLOBAL (17.744 s) : 17744000, 17744000
.   : milestone, 17744000,
profiling (15.537 s) : 15537000, 15537000
.   : milestone, 15537000,
tracing (14.887 s) : 14887000, 14887000
.   : milestone, 14887000,
section candidate
no_agent (15.543 s) : 15543000, 15543000
.   : milestone, 15543000,
appsec (14.868 s) : 14868000, 14868000
.   : milestone, 14868000,
iast (18.8 s) : 18800000, 18800000
.   : milestone, 18800000,
iast_GLOBAL (17.985 s) : 17985000, 17985000
.   : milestone, 17985000,
profiling (15.896 s) : 15896000, 15896000
.   : milestone, 15896000,
tracing (14.774 s) : 14774000, 14774000
.   : milestone, 14774000,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	14.95 s [14.95 s, 14.95 s]	-
appsec	14.913 s [14.913 s, 14.913 s]	-37.0 ms (-0.2%)
iast	18.504 s [18.504 s, 18.504 s]	3.554 s (23.8%)
iast_GLOBAL	17.744 s [17.744 s, 17.744 s]	2.794 s (18.7%)
profiling	15.537 s [15.537 s, 15.537 s]	587.0 ms (3.9%)
tracing	14.887 s [14.887 s, 14.887 s]	-63.0 ms (-0.4%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	15.543 s [15.543 s, 15.543 s]	-
appsec	14.868 s [14.868 s, 14.868 s]	-675.0 ms (-4.3%)
iast	18.8 s [18.8 s, 18.8 s]	3.257 s (21.0%)
iast_GLOBAL	17.985 s [17.985 s, 17.985 s]	2.442 s (15.7%)
profiling	15.896 s [15.896 s, 15.896 s]	353.0 ms (2.3%)
tracing	14.774 s [14.774 s, 14.774 s]	-769.0 ms (-4.9%)

PerfectSlayer

I haven't looked at the cache implementation yet, but I'm posting some quick feedback first:

🎯 suggestion: I don't have the full context, but should we apply it to the dictionary mapper for V05 too?

dd-trace-core/src/main/java/datadog/trace/common/writer/ddagent/TraceMapperV0_4.java

dd-trace-core/src/jmh/java/datadog/trace/common/writer/ddagent/Utf8Benchmark.java

dd-trace-core/src/main/java/datadog/trace/common/writer/ddagent/GenerationalUtf8Cache.java

dd-trace-core/src/main/java/datadog/trace/common/writer/ddagent/SimpleUtf8Cache.java


dougqh · 2025-09-02T12:44:52Z

I haven't looked at the cache implementation yet, but I'm posting some quick feedback first:

🎯 suggestion: I don't have the full context, but should we apply it to the dictionary mapper for V05 too?

Yes, I think we should. To do that, I'm going to have to make some bigger changes to v0.5, so I might leave that for another PR.

bantonsson · 2025-09-02T13:01:31Z

Just a question out of the blue. Couldn't the simple cache reuse the FixedSizeCache with UTF8ByteString?

Yeah, I think that's a possibility. I honestly haven't quite determined if the protection against eagerly creating CacheEntry-s is essential for tag names.

I'd started with a different approach where I generated the UTF8 representations of the known tag names first, but I shelved that because it is a bit hard to incorporate / maintain.


bric3

First wave of comments, I haven't looked at GenerationalUtf8Cache yet

bric3 · 2025-09-03T08:59:02Z

dd-trace-core/src/main/java/datadog/trace/common/writer/ddagent/SimpleUtf8Cache.java

+public final class SimpleUtf8Cache implements EncodingCache {
+  private static final int MAX_PROBES = 4;
+
+  private final int SIZE = 128;


question: Why 128 in particular, and not 256?

IIUC, this needs to be a power of two so that it can be used as a bitmask for the modulo?

Yes, it is a power of 2 so that the bitmask calculation works for the bucket index.

As for why 128? I mostly just played with cache sizes, probe lengths, etc to come up with something that was good at multiple heap sizes. These values gave nice gains in throughput at higher heap sizes and were neutral on throughput at lower heap sizes.

Or to put it more succinctly, I usually aim to make the cache as small as I can without compromising the hit rate. Admittedly, there is a danger of overfitting to the benchmark load.

Also might be worth adding a comment pointing to where this modulo happens (initialBucketIndex methods)
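To make the power-of-two point concrete, here is a small sketch. The method name `initialBucketIndex` is taken from the comment above, but its signature and body here are assumptions, and `roundUpToPowerOfTwo` is a hypothetical helper, not something confirmed to exist in the PR.

```java
/** Illustrative sketch; signatures are assumptions, not the PR's code. */
final class BucketIndexSketch {
  // Equivalent to Math.floorMod(hash, size), but only when size is a power of two.
  static int initialBucketIndex(int hash, int size) {
    return hash & (size - 1);
  }

  // Hypothetical helper for a configurable capacity; assumes requested <= 2^30.
  static int roundUpToPowerOfTwo(int requested) {
    if (requested <= 1) return 1;
    return Integer.highestOneBit(requested - 1) << 1;
  }
}
```

With a size such as 100 the mask `size - 1` has gaps in its bit pattern, so some buckets would never be selected; rounding a requested capacity up to a power of two avoids that.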

dd-trace-core/src/main/java/datadog/trace/common/writer/ddagent/SimpleUtf8Cache.java

bric3 · 2025-09-03T12:21:48Z

dd-trace-core/src/jmh/java/datadog/trace/common/writer/ddagent/Utf8Benchmark.java

+      String tag = nextTag();
+      String value = nextValue(tag);


question: Out of curiosity, would it be better to generate a tag / value dataset outside the benchmark methods? That might allow datasets with a wider range of values.

I believe some customers have wide-character values (e.g. in Korean) in their tags. Would it be useful to have a benchmark for that? Could the gains be more pronounced in that case?

Yeah, probably. I need to experiment some more to figure out what's possible with JMH.
The x_baseline methods exist, so that I can do a comparison to the "same" logic without the encoding.

dd-trace-core/src/main/java/datadog/trace/common/writer/ddagent/GenerationalUtf8Cache.java

bric3 · 2025-09-03T12:43:59Z

dd-trace-core/src/main/java/datadog/trace/common/writer/ddagent/GenerationalUtf8Cache.java

+    newEntry.hit(lookupTimeMs);
+    newEntry.hit(lookupTimeMs);


question: Won't it amplify hits any time this entry is "accessed" after the first use (which is a mark)?

Yes, in a sense, this provides the incorrect first access time, but only the last access time is stored.
Also, I'm not precisely tracking time because I don't want to constantly call System#currentTimeMillis.
Instead I just update the access time once each time a payload is being constructed, so times are somewhat deliberately imprecise.
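The coarse timestamp approach described in this reply could look something like the sketch below. It is an assumption-laden illustration: the class `CoarseClockSketch` and the `touch` / `lastUsed` methods are hypothetical, a `HashMap` stands in for the real cache table, and although a `recalibrate` method is mentioned in the follow-up PR title, its shape here is a guess. The sketch is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch of a per-payload clock; not the PR's implementation. */
final class CoarseClockSketch {
  private long accessTimeMs;                          // shared, deliberately coarse timestamp
  private final Map<String, Long> lastUsedMs = new HashMap<>();

  // Called once per payload, so the system clock is read once rather than on every lookup.
  void recalibrate() {
    recalibrate(System.currentTimeMillis());
  }

  void recalibrate(long nowMs) {
    accessTimeMs = nowMs;
  }

  // Every lookup within the same payload records the same timestamp.
  void touch(String value) {
    lastUsedMs.put(value, accessTimeMs);
  }

  long lastUsed(String value) {
    return lastUsedMs.getOrDefault(value, -1L);
  }
}
```

LRU ordering then only distinguishes entries by the payload in which they were last used, which is the deliberate imprecision mentioned above.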

dd-trace-core/src/main/java/datadog/trace/common/writer/ddagent/GenerationalUtf8Cache.java

bric3 · 2025-09-03T13:01:56Z

dd-trace-core/src/main/java/datadog/trace/common/writer/ddagent/SimpleUtf8Cache.java

+public final class SimpleUtf8Cache implements EncodingCache {
+  private static final int MAX_PROBES = 4;
+
+  private final int SIZE = 128;


Also might be worth adding a comment pointing to where this modulo happens (initialBucketIndex methods)


bric3

Pre-approving

dd-trace-core/src/jmh/java/datadog/trace/common/writer/ddagent/Utf8Benchmark.java

dd-trace-core/src/main/java/datadog/trace/common/writer/ddagent/SimpleUtf8Cache.java


bric3

LGTM, the configurable capacity is a nice touch!

dd-trace-core/src/jmh/java/datadog/trace/common/writer/ddagent/Utf8Benchmark.java

dougqh added 2 commits August 28, 2025 13:05

spotless

68fdcb9

dougqh requested a review from a team as a code owner August 28, 2025 17:09

dougqh requested a review from ygree August 28, 2025 17:09

dougqh added the tag: performance Performance related changes label Aug 28, 2025

dougqh added the type: enhancement Enhancements and improvements label Aug 28, 2025

Tweaking comments

95767a6

dougqh added the comp: platform Platform label Aug 28, 2025

Tweaking comments

5270f9c

dougqh added 2 commits August 28, 2025 14:34

Comparing results with caching off

69c4983

Merge branch 'master' into dougqh/utf8-caching

d725543

PerfectSlayer added comp: core Tracer core and removed comp: platform Platform labels Aug 29, 2025

PerfectSlayer reviewed Aug 29, 2025

View reviewed changes

dd-trace-core/src/main/java/datadog/trace/common/writer/ddagent/TraceMapperV0_4.java Outdated Show resolved Hide resolved

AlexeyKuznetsov-DD reviewed Aug 29, 2025

View reviewed changes

dougqh added 6 commits August 29, 2025 10:29

Fixing silly oversight when cache is disabled

ebc3fb0

Adding comments about benchmark data being used

247bb02

Misc improvements

69c94d1

- implementing review feedback - experimenting with exact hash based marking scheme - fixed issue with not updating entry after hit in simple cache - re-enabling cache by default for benchmarking - spotless

Merge branch 'master' into dougqh/utf8-caching

d017b02

Tweaking the cache heuristics

01aa284

- altered marking strategy to use a bloom filter of previously requested values, once a new entry hits the filter the filter is reset to zero - tweaking cache sizes
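One way to read the marking strategy in this commit note is sketched below. It is a guess at the general shape rather than the committed code: the class `RequestMarkerSketch`, the single 64-bit filter, and the use of one hash bit per value are all assumptions.

```java
/** Illustrative sketch of a resettable single-hash filter; not the PR's code. */
final class RequestMarkerSketch {
  private long filter;                       // 64 one-bit marks for recently requested values

  /** Returns true when the value was probably requested before (false positives are possible). */
  boolean seenBefore(int hash) {
    long bit = 1L << (hash & 63);
    if ((filter & bit) != 0) {
      filter = 0L;                           // a hit resets the filter, as the note describes
      return true;
    }
    filter |= bit;                           // first sighting: mark only
    return false;
  }
}
```

A caller would create a cache entry only when `seenBefore` returns true, so values requested once leave nothing behind but a bit.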

Merge branch 'dougqh/utf8-caching' of github.com:DataDog/dd-trace-jav…

bde8118

…a into dougqh/utf8-caching

dougqh added 2 commits September 2, 2025 08:51

spotless

f15e1cc

Merge branch 'master' into dougqh/utf8-caching

947734a

dougqh added 2 commits September 2, 2025 14:30

Clean-up & tweaking

f509c0a

- clean-up based on review feedback - making naming consistent - some vestiges of prior names for second level cache updated - tweaked generational cache to check tenured entries first -

Merge branch 'dougqh/utf8-caching' of github.com:DataDog/dd-trace-jav…

6bfbf88

…a into dougqh/utf8-caching

dougqh added 2 commits September 2, 2025 16:30

Tweaking settings to be good at multiple memory levels

db82394

- switching generational cache to use different probe lengths for eden vs tenured generation - these settings are neutral or better throughput wise for petclinic for 64m, 80m, 96m, and 128m heaps

Merge branch 'dougqh/utf8-caching' of github.com:DataDog/dd-trace-jav…

ff6e0f8

…a into dougqh/utf8-caching

bric3 reviewed Sep 3, 2025

View reviewed changes

dougqh added 9 commits September 3, 2025 12:16

Fixing oversight from marking change

41d059d

Should be using adjHash not value.hashCode

Fixing bug introduced with different probes lengths for eden & tenured

3b69e62

More clean-up

4102a26

- more explanatory comments - more naming updates: local -> eden

Merge branch 'master' into dougqh/utf8-caching

6902e80

Misc fixes

9b78df7

- adding protections against storing large strings in cache - fixed errant use of CacheEntry.utf8(String) instead of entry.utf8() - removed unnecessary lookupTimeMs variable

Fixing benchmarks brought over from standalone prototype

3c33c38

Merge branch 'dougqh/utf8-caching' of github.com:DataDog/dd-trace-jav…

41af3df

…a into dougqh/utf8-caching

test & benchmark clean-up

0b9f0d0

Added tests to verify that big strings are not cached

Added some explanatory comments

bdc1859

bric3 approved these changes Sep 4, 2025

View reviewed changes

dd-trace-core/src/jmh/java/datadog/trace/common/writer/ddagent/Utf8Benchmark.java Outdated Show resolved Hide resolved

dd-trace-core/src/main/java/datadog/trace/common/writer/ddagent/SimpleUtf8Cache.java Outdated Show resolved Hide resolved

dougqh and others added 3 commits September 4, 2025 09:06

Update dd-trace-core/src/jmh/java/datadog/trace/common/writer/ddagent…

6ab19b0

…/Utf8Benchmark.java Co-authored-by: Brice Dutheil <brice.dutheil@gmail.com>

Making cache more configurable & clean-up

49100cb

- added ability to configure cache size - for both tag names & values - factored shared code into Caching static utility class - added tests for Caching class & size determination logic

Merge branch 'master' into dougqh/utf8-caching

75bff75

bric3 approved these changes Sep 4, 2025

View reviewed changes

AlexeyKuznetsov-DD approved these changes Sep 4, 2025

View reviewed changes

bric3 reviewed Sep 4, 2025

View reviewed changes

dd-trace-core/src/jmh/java/datadog/trace/common/writer/ddagent/Utf8Benchmark.java Outdated Show resolved Hide resolved

fix: small compilation fix

bd17af9

bric3 force-pushed the dougqh/utf8-caching branch from d6ceffb to bd17af9 Compare September 5, 2025 13:00

Merge branch 'master' into dougqh/utf8-caching

f53ed6e

dougqh enabled auto-merge (squash) September 5, 2025 13:46

dougqh added 2 commits September 5, 2025 10:08

Merge branch 'master' into dougqh/utf8-caching

6d035cc

Adding missing size parameters tp benchmark

c923194

dougqh merged commit 4abe3ff into master Sep 5, 2025
503 checks passed

dougqh deleted the dougqh/utf8-caching branch September 5, 2025 15:08

github-actions bot added this to the 1.54.0 milestone Sep 5, 2025

dougqh mentioned this pull request Sep 5, 2025

Fixing oversight of missing synchronized on long version of recalibrate #9480

Merged
