chore: change dsm hashing algorithm to match other tracers #4222

Open: wconti27 wants to merge 21 commits into master from conti/change-dsm-hashing-to-match-other-tracers

wconti27 (Contributor) commented Apr 5, 2024

What does this PR do?

Makes DSM (Data Streams Monitoring) hashing consistent with the tracers for other languages. This is not a breaking change. The hashing algorithm has been changed to FNV-1 (`fnv1`), and the hashing code is based on the Python implementation.
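
For readers unfamiliar with FNV-1, here is a minimal sketch of the 64-bit variant in Node.js. It is an illustration only, not the code in this PR: the function name `fnv1_64` and the choice of the 64-bit width are assumptions, and the tracer may use a different width or input encoding. The offset basis and prime are the standard published FNV 64-bit constants.

```javascript
// Standard FNV 64-bit constants (offset basis and prime).
const FNV_64_OFFSET_BASIS = 0xcbf29ce484222325n
const FNV_64_PRIME = 0x100000001b3n
// Mask used to emulate unsigned 64-bit overflow with BigInt.
const MASK_64 = (1n << 64n) - 1n

// Hypothetical helper (name is an assumption, not taken from the PR).
// Accepts a string (hashed as UTF-8) or a Buffer and returns a BigInt.
function fnv1_64 (input) {
  const bytes = Buffer.isBuffer(input) ? input : Buffer.from(input, 'utf8')
  let hash = FNV_64_OFFSET_BASIS
  for (const byte of bytes) {
    // FNV-1 multiplies first, then XORs the byte.
    // (FNV-1a reverses these two steps.)
    hash = (hash * FNV_64_PRIME) & MASK_64
    hash ^= BigInt(byte)
  }
  return hash
}

fnv1_64('a').toString(16) // 'af63bd4c8601b7be'
```

`BigInt` keeps the arithmetic exact at 64 bits, at some cost in speed compared with a hand-rolled pair of 32-bit integers. That trade-off is one way to read the overhead concern raised later in this thread.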

Motivation

Plugin Checklist

Additional Notes

github-actions bot commented Apr 5, 2024

Overall package size

Self size: 6.27 MB
Deduped: 60.77 MB
No deduping: 61.05 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| @datadog/native-iast-taint-tracking | 1.7.0 | 16.71 MB | 16.72 MB |
| @datadog/native-appsec | 7.1.1 | 14.39 MB | 14.4 MB |
| @datadog/pprof | 5.2.0 | 8.84 MB | 9.21 MB |
| protobufjs | 7.2.5 | 2.77 MB | 6.56 MB |
| @datadog/native-iast-rewriter | 2.3.0 | 2.15 MB | 2.24 MB |
| @opentelemetry/core | 1.14.0 | 872.87 kB | 1.47 MB |
| @datadog/native-metrics | 2.0.0 | 898.77 kB | 1.3 MB |
| @opentelemetry/api | 1.4.1 | 780.32 kB | 780.32 kB |
| import-in-the-middle | 1.7.3 | 67.62 kB | 731.01 kB |
| msgpack-lite | 0.1.26 | 201.16 kB | 281.59 kB |
| opentracing | 0.14.7 | 194.81 kB | 194.81 kB |
| semver | 7.5.4 | 93.4 kB | 123.8 kB |
| pprof-format | 2.1.0 | 111.69 kB | 111.69 kB |
| @datadog/sketches-js | 2.1.0 | 109.9 kB | 109.9 kB |
| lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB |
| lru-cache | 7.14.0 | 74.95 kB | 74.95 kB |
| ipaddr.js | 2.1.0 | 60.23 kB | 60.23 kB |
| ignore | 5.2.4 | 51.22 kB | 51.22 kB |
| int64-buffer | 0.1.10 | 49.18 kB | 49.18 kB |
| shell-quote | 1.8.1 | 44.96 kB | 44.96 kB |
| istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB |
| tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB |
| limiter | 1.1.5 | 23.17 kB | 23.17 kB |
| dc-polyfill | 0.1.4 | 23.1 kB | 23.1 kB |
| retry | 0.13.1 | 18.85 kB | 18.85 kB |
| node-abort-controller | 3.1.1 | 16.89 kB | 16.89 kB |
| jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB |
| crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB |
| path-to-regexp | 0.1.7 | 6.78 kB | 6.78 kB |
| koalas | 1.0.2 | 6.47 kB | 6.47 kB |
| methods | 1.1.2 | 5.29 kB | 5.29 kB |
| module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

codecov bot commented Apr 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.79%. Comparing base (de47a4b) to head (7db89db).
Report is 710 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4222      +/-   ##
==========================================
- Coverage   85.23%   78.79%   -6.44%     
==========================================
  Files         247       14     -233     
  Lines       10961     1066    -9895     
  Branches       33       33              
==========================================
- Hits         9343      840    -8503     
+ Misses       1618      226    -1392     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@wconti27 wconti27 changed the title from "chore: change encoding / decoding of hashes to match other tracers" to "chore: change encoding / decoding of dsm hashes to match other tracers" Apr 9, 2024
@wconti27 wconti27 changed the title from "chore: change encoding / decoding of dsm hashes to match other tracers" to "chore: change dsm hashing algorithm to match other tracers" Apr 9, 2024
pr-commenter bot commented Apr 9, 2024

Benchmarks

Benchmark execution time: 2024-04-09 19:54:04

Comparing candidate commit 7db89db in PR branch conti/change-dsm-hashing-to-match-other-tracers with baseline commit de47a4b in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 259 metrics, 7 unstable metrics.

@wconti27 wconti27 marked this pull request as ready for review May 2, 2024 14:16
@wconti27 wconti27 requested a review from a team as a code owner May 2, 2024 14:16
@wconti27 wconti27 requested review from a team as code owners May 2, 2024 14:16

rochdev (Member) left a comment

LGTM at a glance, but I have a few questions before approving.

First, is this propagated horizontally, and if yes, wouldn't that be a breaking change?

Second, I thought the algorithm didn't matter which is why we went with the one that was simpler for the language. Has that changed?

piochelepiotr (Contributor) commented

> LGTM at a glance, but I have a few questions before approving.
>
> First, is this propagated horizontally, and if yes, wouldn't that be a breaking change?
>
> Second, I thought the algorithm didn't matter which is why we went with the one that was simpler for the language. Has that changed?

The backend will handle that correctly, no matter which way the change is deployed.
It's true that the hashing algorithm is not critical, Node.js was the only one that was different though, so it's nice to have the consistency :)

This pull request has been marked as stale due to 90 days of inactivity.
If this is still relevant, please update or comment to keep it open.
If this should be kept open indefinitely, please apply the label keep-open.
Otherwise, it will be automatically closed after 14 days.

@github-actions github-actions bot added the stale label Feb 21, 2025
@BridgeAR BridgeAR requested a review from rochdev February 24, 2025 21:25
@github-actions github-actions bot removed the stale label Feb 25, 2025

rochdev (Member) commented Feb 26, 2025

> It's true that the hashing algorithm is not critical, Node.js was the only one that was different though, so it's nice to have the consistency :)

In a case like this I'd prioritize simpler code (so less overhead) instead of consistency if it doesn't otherwise matter.

BridgeAR (Collaborator) commented

While I generally believe it's good to align behavior when it deviates, I would not expect the hashing algorithm to be important for someone to handle the value. Any value should be handled in an identical way.

I believe these cases should be checked one by one. In this case, I agree with @rochdev and would just keep it as is.
@wconti27 how was the difference actually detected, if I may ask?

4 participants