Skip to content

Conversation

SungJin1212
Copy link
Member

This PR introduces a dedicated Cortex OTLP converter that directly converts OTLP to Cortex format. It can skip the intermediate conversion from Prometheus format to Cortex for performance.

The converter codes are ported from the prometheus(https://github.com/prometheus/prometheus/tree/main/storage/remote/otlptranslator/prometheusremotewrite)

The old benchmark result: #6977
The new benchmark result:

goos: darwin
goarch: arm64
pkg: github.com/cortexproject/cortex/pkg/util/push
cpu: Apple M1 Max
BenchmarkOTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:10,_numHistograms:1-10         	   31160	     38642 ns/op	   68348 B/op	     458 allocs/op
BenchmarkOTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:100,_numHistograms:1-10        	    7344	    159840 ns/op	  182211 B/op	    2562 allocs/op
BenchmarkOTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:1000,_numHistograms:1-10       	     914	   1304548 ns/op	 1385433 B/op	   23288 allocs/op
BenchmarkOTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:1,_numHistograms:10-10         	   12193	     98351 ns/op	  134651 B/op	    1488 allocs/op
BenchmarkOTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:1,_numHistograms:100-10        	    1425	    836347 ns/op	  961930 B/op	   13878 allocs/op
BenchmarkOTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:1,_numHistograms:1000-10       	     142	   8342703 ns/op	 9475827 B/op	  137301 allocs/op
BenchmarkOTLPWriteHandlerPush/numSeries:10,_samplesPerSeries:1,_numHistograms:1-10         	    9373	    128702 ns/op	  163355 B/op	    1913 allocs/op
BenchmarkOTLPWriteHandlerPush/numSeries:100,_samplesPerSeries:1,_numHistograms:1-10        	    1096	   1164873 ns/op	 1133418 B/op	   18203 allocs/op
BenchmarkOTLPWriteHandlerPush/numSeries:1000,_samplesPerSeries:1,_numHistograms:1-10       	     100	  11453710 ns/op	12128517 B/op	  180314 allocs/op
PASS
ok  	github.com/cortexproject/cortex/pkg/util/push	11.390s

The comparison result:

goos: darwin
goarch: arm64
pkg: github.com/cortexproject/cortex/pkg/util/push
cpu: Apple M1 Max
                                                                           │    old.txt    │                new.txt                │
                                                                           │    sec/op     │    sec/op     vs base                 │
OTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:10,_numHistograms:1-10      47.85µ ± ∞ ¹   38.64µ ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:100,_numHistograms:1-10     245.3µ ± ∞ ¹   159.8µ ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:1000,_numHistograms:1-10    1.450m ± ∞ ¹   1.305m ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:1,_numHistograms:10-10     166.30µ ± ∞ ¹   98.35µ ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:1,_numHistograms:100-10    1410.8µ ± ∞ ¹   836.3µ ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:1,_numHistograms:1000-10   15.026m ± ∞ ¹   8.343m ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:10,_samplesPerSeries:1,_numHistograms:1-10      156.6µ ± ∞ ¹   128.7µ ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:100,_samplesPerSeries:1,_numHistograms:1-10     1.326m ± ∞ ¹   1.165m ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:1000,_samplesPerSeries:1,_numHistograms:1-10    13.20m ± ∞ ¹   11.45m ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                                                        818.3µ         596.0µ        -27.16%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                                                                           │    old.txt     │                new.txt                 │
                                                                           │      B/op      │     B/op       vs base                 │
OTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:10,_numHistograms:1-10      86.59Ki ± ∞ ¹   66.75Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:100,_numHistograms:1-10     228.7Ki ± ∞ ¹   177.9Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:1000,_numHistograms:1-10    1.631Mi ± ∞ ¹   1.321Mi ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:1,_numHistograms:10-10      275.9Ki ± ∞ ¹   131.5Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:1,_numHistograms:100-10    2354.5Ki ± ∞ ¹   939.4Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:1,_numHistograms:1000-10   22.802Mi ± ∞ ¹   9.037Mi ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:10,_samplesPerSeries:1,_numHistograms:1-10      228.9Ki ± ∞ ¹   159.5Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:100,_samplesPerSeries:1,_numHistograms:1-10     1.656Mi ± ∞ ¹   1.081Mi ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:1000,_samplesPerSeries:1,_numHistograms:1-10    16.78Mi ± ∞ ¹   11.57Mi ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                                                        1.117Mi         696.2Ki        -39.13%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                                                                           │   old.txt    │                new.txt                │
                                                                           │  allocs/op   │  allocs/op    vs base                 │
OTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:10,_numHistograms:1-10      546.0 ± ∞ ¹    458.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:100,_numHistograms:1-10    2.643k ± ∞ ¹   2.562k ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:1000,_numHistograms:1-10   23.38k ± ∞ ¹   23.29k ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:1,_numHistograms:10-10     2.233k ± ∞ ¹   1.488k ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:1,_numHistograms:100-10    21.20k ± ∞ ¹   13.88k ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:1,_samplesPerSeries:1,_numHistograms:1000-10   210.3k ± ∞ ¹   137.3k ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:10,_samplesPerSeries:1,_numHistograms:1-10     2.091k ± ∞ ¹   1.913k ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:100,_samplesPerSeries:1,_numHistograms:1-10    19.29k ± ∞ ¹   18.20k ± ∞ ¹        ~ (p=1.000 n=1) ²
OTLPWriteHandlerPush/numSeries:1000,_samplesPerSeries:1,_numHistograms:1-10   190.4k ± ∞ ¹   180.3k ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                                                       11.11k         9.231k        -16.92%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

Which issue(s) this PR fixes:
Fixes #7002

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@dosubot dosubot bot added the type/feature label Sep 5, 2025
@SungJin1212 SungJin1212 force-pushed the Add-cortex-otlp-converter branch 4 times, most recently from cce44e2 to a33b4a8 Compare September 5, 2025 10:39
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
@SungJin1212 SungJin1212 force-pushed the Add-cortex-otlp-converter branch from a33b4a8 to efb1c0d Compare September 10, 2025 07:25
Copy link
Contributor

@yeya24 yeya24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work. The benchmark results seem promising.

However, I am a bit concerned about all the code we have to copy and vendor ourselves. It seems not easy to maintain and keep it sync with upstream. And there is no tests associated.

I feel there are better ways to allow customizing the OTLP converter for Cortex structs. Maybe we can work with upstream community to expose certain common functions we need to avoid duplicating almost 100% of the upstream OTLP converter logic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit. I would remove cortex prefix for the package. We can just have a dedicated otlp package and have a converter subpackage or the same package

errs = multierr.Append(errs, fmt.Errorf("empty data points. %s is dropped", metric.Name()))
break
}
if settings.ConvertHistogramsToNHCB {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't support NHCB today via PRWv1. Maybe I am wrong. At least there is no test today to verify its support in Cortex.

For PRWv2 we can support NHCB I guess. But do we have test cases for that?


// map ensures no duplicate label names.
l := make(map[string]string, maxLabelCount)
labelNamer := otlptranslator.LabelNamer{UTF8Allowed: settings.AllowUTF8}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cortex still doesn't support UTF-8 names.


type Settings struct {
Namespace string
ExternalLabels map[string]string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the usecase of Namespace and ExternalLabels? Do we have those configs before in Cortex? If there is no such config before, I recommend removing them if there is no such usecase

ExternalLabels map[string]string
DisableTargetInfo bool
AddMetricSuffixes bool
AllowUTF8 bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cortex doesn't support UTF-8 today

LookbackDelta time.Duration
// PromoteScopeMetadata controls whether to promote OTel scope metadata to metric labels.
PromoteScopeMetadata bool
EnableTypeAndUnitLabels bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should clean up the settings a little bit. If there are options that are not in Cortex for now, let's not include them in this PR

@SungJin1212
Copy link
Member Author

SungJin1212 commented Sep 16, 2025

@yeya24
I think we should track this upstream PR (prometheus/prometheus#16951). I'll see if I can implement it.

The upstream exposes a combinedAppender, I think it would be good to implement Cortex's combinedAppender and put it in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a dedicated OTLP converter
2 participants