Proposal: Expose bytes from TraceID #18


Merged: 11 commits merged into swift-otel:main on Sep 29, 2024

Conversation

@t089 (Contributor) commented Aug 12, 2024

I would suggest exposing the "fixed-width 16 byte array" directly on TraceID and SpanID for performance reasons. Otherwise, clients will too easily go through the var bytes: [UInt8] property, which causes an allocation on every access. Of course this is a breaking change to the API, but the performance win seems worthwhile.

For example, while implementing a TraceIdRatioBasedSampler here, I had to access the last 8 bytes of the TraceID to determine whether the sample should be dropped or recorded. Going through var bytes: [UInt8] causes an allocation on each check. With the approach presented here, there are 0 allocations.
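
To make the allocation difference concrete, here is a minimal standalone sketch of the two access patterns. The FixedWidthTraceID type and shouldSample function are hypothetical, not this library's code, and the real sampler's threshold comparison may differ; the point is only that the trailing 8 bytes can be read in place.

// Hypothetical 16-byte ID with fixed-width inline storage.
struct FixedWidthTraceID {
    var storage: (UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8,
                  UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8)

    // Allocating path: every access copies the bytes into a fresh heap Array.
    var bytes: [UInt8] {
        Swift.withUnsafeBytes(of: storage) { Array($0) }
    }

    // Allocation-free path: hand the caller a raw view of the inline storage.
    func withUnsafeBytes<T>(_ body: (UnsafeRawBufferPointer) throws -> T) rethrows -> T {
        try Swift.withUnsafeBytes(of: storage, body)
    }
}

// Ratio check over the trailing 8 bytes, read in place as a big-endian UInt64.
func shouldSample(_ traceID: FixedWidthTraceID, ratio: Double) -> Bool {
    precondition((0...1).contains(ratio), "ratio must be between 0 and 1")
    // Clamp at 1.0 so the Double-to-UInt64 conversion cannot overflow.
    let threshold = ratio >= 1.0 ? UInt64.max : UInt64(ratio * 0x1p64)
    let value = traceID.withUnsafeBytes { buffer in
        UInt64(bigEndian: buffer.loadUnaligned(fromByteOffset: 8, as: UInt64.self))
    }
    return value <= threshold
}

Calling shouldSample never touches the heap, whereas every read of bytes allocates a new Array.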

I also changed the conformance to Identifiable to use the type itself as the "id".

With Benchmark I was able to measure the difference:

Benchmark

// Requires the Benchmark package. The OTel module import is assumed here;
// the exact module name may differ depending on the package setup.
import Benchmark
import OTel

let benchmarks = {
    Benchmark("OTelTraceIdRatioBasedSampler") { benchmark in
        let sampler = OTelTraceIdRatioBasedSampler(ratio: 0.5)
        for _ in benchmark.scaledIterations {
            _ = sampler.samplingResult(
                operationName: "some-op",
                kind: .internal,
                traceID: .random(),
                attributes: [:],
                links: [],
                parentContext: .topLevel)
        }
    }
}

With [UInt8]

=====================================
OTelTraceIdRatioBasedSamplerBenchmark
=====================================

OTelTraceIdRatioBasedSampler
╒═══════════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                        │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞═══════════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Instructions *                │    1317 │    6647 │    6647 │    6647 │    6647 │    9623 │   15592 │   10000 │
├───────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Malloc (total) *              │       1 │       1 │       1 │       1 │       1 │       1 │       1 │   10000 │
├───────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Memory (resident peak) (K)    │    9126 │    9511 │    9511 │    9527 │    9552 │    9552 │    9552 │   10000 │
├───────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Throughput (# / s) (K)        │     631 │     600 │     586 │     585 │     522 │     304 │      69 │   10000 │
├───────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (total CPU) (ns) *       │    3874 │    4167 │    4211 │    4251 │    4583 │    7211 │   26916 │   10000 │
├───────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (wall clock) (ns) *      │    1584 │    1667 │    1708 │    1709 │    1917 │    3293 │   14500 │   10000 │
╘═══════════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

With unsafeBytes

=====================================
OTelTraceIdRatioBasedSamplerBenchmark
=====================================

OTelTraceIdRatioBasedSampler
╒═══════════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                        │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞═══════════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Instructions *                │    2275 │    2279 │    2279 │    2279 │    2279 │    5255 │   10018 │   10000 │
├───────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Malloc (total) *              │       0 │       0 │       0 │       0 │       0 │       0 │       0 │   10000 │
├───────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Memory (resident peak) (K)    │    9159 │    9527 │    9527 │    9544 │    9568 │    9568 │    9568 │   10000 │
├───────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Throughput (# / s) (K)        │    1333 │    1201 │    1201 │    1199 │    1091 │     586 │      58 │   10000 │
├───────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (total CPU) (ns) *       │    2917 │    3209 │    3209 │    3251 │    3541 │    6127 │   37125 │   10000 │
├───────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (wall clock) (ns) *      │     750 │     833 │     833 │     834 │     917 │    1667 │   17250 │   10000 │
╘═══════════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

The throughput roughly doubles, which I would consider a great win for a tracing library, since it is usually executed in hot code paths.

@slashmo (Member) left a comment

Thanks @t089 for the performance improvement and for providing benchmarks as well 🙏 The changes look solid to me, with a couple of nitpicks regarding public API, documentation, and formatting inline.

@codecov-commenter commented Sep 15, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.50%. Comparing base (aeeea07) to head (8f6ea50).
Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main      #18   +/-   ##
=======================================
  Coverage   99.50%   99.50%           
=======================================
  Files           6        6           
  Lines         402      402           
=======================================
  Hits          400      400           
  Misses          2        2           


@t089 (Contributor, Author) commented Sep 16, 2024

Hm, thinking about it again, I wonder if the nested type Bytes really pulls its weight.

Wouldn't it be enough to remove the current bytes: [UInt8] property, add withUnsafeBytes instead, and make the {Span|Trace}ID type itself conform to Collection while continuing to use the tuple as its backing?
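
Roughly something like this (a sketch only, with made-up names rather than the merged API): keep the fixed-width tuple as the sole storage, conform the ID type itself to RandomAccessCollection, and keep withUnsafeBytes for bulk access.

// Hypothetical type for illustration; not the merged API.
struct CollectionBackedTraceID: RandomAccessCollection {
    // 16 bytes of inline storage, no heap allocation.
    private var storage: (UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8,
                          UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8)

    init<Bytes: Collection>(_ bytes: Bytes) where Bytes.Element == UInt8 {
        precondition(bytes.count == 16, "trace IDs are exactly 16 bytes")
        storage = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
        Swift.withUnsafeMutableBytes(of: &storage) { $0.copyBytes(from: bytes) }
    }

    var startIndex: Int { 0 }
    var endIndex: Int { 16 }

    subscript(position: Int) -> UInt8 {
        precondition((0..<16).contains(position), "byte index out of range")
        return Swift.withUnsafeBytes(of: storage) { $0[position] }
    }

    func withUnsafeBytes<T>(_ body: (UnsafeRawBufferPointer) throws -> T) rethrows -> T {
        try Swift.withUnsafeBytes(of: storage, body)
    }
}

// The last 8 bytes are a slice over the inline storage; copying into an Array
// becomes an explicit opt-in rather than the default access path.
let id = CollectionBackedTraceID((0..<16).map { UInt8($0) })
let tail = id.suffix(8)
let copy = Array(id)
precondition(tail.count == 8 && copy.count == 16)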

@t089 (Contributor, Author) commented Sep 16, 2024

I went ahead with this idea in #19, and it results in a much simpler API. I would suggest reviewing #19 instead.

@t089 requested a review from slashmo on September 25, 2024 at 19:04
@t089 (Contributor, Author) commented Sep 28, 2024

@slashmo I've updated the API. The main idea: if you use "bytes" to create an ID, you should be able to get the bytes out again. Bytes is a fixed-size collection of bytes that internally maps to a tuple.
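
For illustration, the shape is roughly this (hypothetical names and a compact stand-in, not the merged source): Bytes is a fixed-size collection backed by a 16-byte tuple, and an ID created from Bytes hands the identical value back.

struct SketchID {
    // Stand-in for the described Bytes type: a fixed-size collection over
    // sixteen bytes of inline storage, no heap allocation.
    struct Bytes: RandomAccessCollection, Equatable {
        var storage: (UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8,
                      UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8)

        var startIndex: Int { 0 }
        var endIndex: Int { 16 }

        subscript(position: Int) -> UInt8 {
            precondition((0..<16).contains(position), "byte index out of range")
            return Swift.withUnsafeBytes(of: storage) { $0[position] }
        }

        static func == (lhs: Bytes, rhs: Bytes) -> Bool {
            Swift.withUnsafeBytes(of: lhs.storage) { l in
                Swift.withUnsafeBytes(of: rhs.storage) { r in l.elementsEqual(r) }
            }
        }
    }

    // Created from Bytes, and the same Bytes value comes back out.
    let bytes: Bytes
}

let bytes = SketchID.Bytes(storage: (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15))
let id = SketchID(bytes: bytes)
precondition(id.bytes == bytes)  // round-trip: what went in comes back out
precondition(id.bytes[15] == 15) // direct subscripting, no intermediate Array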

@slashmo (Member) left a comment

Great job! 👏 Do we still need to expose the unsafe APIs publicly now that we can subscript into the Span/TraceID.Bytes directly? I prefer to start with as little public API as possible where it makes sense.

@slashmo merged commit 3da4b79 into swift-otel:main on Sep 29, 2024. 6 checks passed.