Proposal Amendment: UTF-8 migration #30
Conversation
TODO: need to clarify expansion method, otherwise first draft ready to go
I added @roidelapluie and @bwplotka as reviewers of the original UTF-8 design doc, and @gouthamve, as he was involved in discussing the approach.
Nice, this is quite a smart way to get better UX on those often already-ingested OTel yolo-underscore metrics. Great work!
The collisions are indeed the tricky case here. Hard to say whether there will be cases that surprise users; we don't have specific examples that make semantic sense. Given we cannot rule out surprises completely, I would vote for dropping that storage trick. It adds quite a lot of complexity, yet is still prone to surprising situations. Additionally, if we have a block with mixed client versions (prone to collisions) and it gets compacted into a 2-day or 2-week block (e.g. Thanos), that means the whole 2 weeks is prone to collisions, right? Or do we change the compaction code too?
I would say let's make unsupported input for this migration feature explicitly unsupported, not "sometimes" unsupported. I wonder if we can simply rely on the query API parameter (and feature flag) to allow some debuggability/quick opt-out, without the storage deduction, and be explicit in the docs etc.
Which brings up an interesting question: would this feature be part of the official PromQL spec (i.e. require every PromQL-exposing backend to implement the "UTF-8 compatibility broad" lookup in order to be compatible), or is it an optional feature?
Having explicit regex makes sense.
Quick suggestions:
- Move the storage trick to alternatives, as a possible future improvement if we learn about a specific collision case that makes sense.
- Propose query API param (?)
- Clarify if this is part of the official PromQL spec or just a migration friendly Prometheus feature
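To make the "broad lookup" idea being discussed concrete, here is a hypothetical Python sketch. The function names and the exact set of legacy spellings are assumptions for illustration, not part of the proposal:

```python
def underscore_escape(name: str) -> str:
    # Replace every character that is not legal in a legacy Prometheus
    # metric name with "_" (the historical "replace with underscore"
    # behaviour many clients apply today). Simplified: ignores the
    # rule that legacy names may not start with a digit.
    return "".join(
        c if (c.isascii() and (c.isalnum() or c in "_:")) else "_"
        for c in name
    )

def lookup_candidates(name: str) -> set[str]:
    # A "broad" lookup tries the raw UTF-8 name itself plus the legacy
    # spellings that older clients may have written for the same metric.
    return {name, underscore_escape(name)}
```

Under this sketch, a query for `http.server.duration` would also match series stored as `http_server_duration`.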
### Proposed Solution

To help alleviate this confusion we first propose to bump the version number in the tsdb meta.json file. On a per-block basis, the query code can check the version number and know if the data was written with an old version of the database code. This helps distinguish the first case.
Technically we don't need to bump as the existence of the new entry would tell us if new or old code was used, right? 🤔
The version number bump indicates whether the block was written by a UTF-8-capable Prometheus, but it does not indicate whether the clients sent UTF-8 metrics or pre-escaped metrics (or whether the block contains a combination of both).
Thinking more, I think the implementation could get complicated.
Right now, the TSDB interface has no concept of "ingestion-version", and I would be uncomfortable plumbing it through for just this use-case.
Instead, could we do the following:
- `-promql.utf8_migration.enabled=true`
- `-promql.utf8_migration.until=<date-time>` (optional)
The migration would then become:
- Set the first flag
- Move everything over to UTF-8
- Wait a bit
- Set the second flag
- Once retention is passed, unset the first flag
This might work tbh. The challenge is likely to be that step 2 will take a long time, as most folks don't have control over the clients.
We must consider edge cases in which the blocks database has persisted metrics or labels that may have been written by different client versions. There are multiple ways this can (and will) happen:

* A newer client persists names to an older database version. In this case, names would be escaped with the `U__` syntax. If the database is upgraded, newer blocks will be written in UTF-8.
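For illustration, the `U__` escaping could be sketched as follows. This is a simplified Python sketch of the scheme described in the UTF-8 proposal; the canonical implementation lives in `prometheus/common` and may differ in details (for example, handling of names starting with digits):

```python
def u_escape(name: str) -> str:
    # Characters that are valid in a legacy (pre-UTF-8) metric name.
    def legacy_valid(c: str) -> bool:
        return c.isascii() and (c.isalnum() or c in "_:")

    if all(legacy_valid(c) for c in name):
        return name  # already a legal legacy name, nothing to do
    out = ["U__"]
    for c in name:
        if c == "_":
            out.append("__")             # underscores are doubled
        elif legacy_valid(c):
            out.append(c)                # legal characters pass through
        else:
            out.append(f"_{ord(c):x}_")  # others become the hex code point
    return "".join(out)
```

So `my.metric` would be stored as `U__my_2e_metric` by a newer client writing to an older database.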
What do you mean by database upgrade? Do you mean a new Prometheus version?
Yes, that's what I meant here; I can fix the language.
fixed
Yea, I like that simple logic @gouthamve proposed.
Is the idea that 100% of queries would get replacement-expansion, rather than trying to figure out which blocks are which?
Or does the date-time determine for what periods the expansion will take place?
Both (: If the
Yup, what Bartek explained.
Addressed notes
Epic, thanks! Final suggestions 🤗 (I hope), otherwise LGTM
### Proposed Solution

For queries to return correct data we must differentiate the three cases above, and to do that we first propose to bump the version number in the tsdb meta.json file.
On a per-block basis, the query code can check the version number and know if the data was written with an old version of the Prometheus code.
I am not sure how this is useful. What will the query do if it sees an old block? When UTF-8 is queried, I think we still need to do this "migration broad search" for those, no? As we don't know what escape method clients used.
Or is it an optimization to immediately return no result on those blocks for a non-escaped UTF-8 lookup?
Yes I suppose the only real benefit is removing the UTF-8 query from the list of possibilities. The other escapings may all be possible. Do you think that maybe we don't need a version number bump at all and can just use the flag/date-based logic?
Yea, I am fine with both.
It feels like a small change to bump the version, and it might unlock some optimizations, so I am fine keeping this; I just wanted to clarify.
So yeah, I guess the only optimization is that we never have to run the original UTF-8 query on blocks with an older version. Not sure if that is worth the effort of versioning the blocks. But maybe, as @bwplotka said, the effort is actually quite low, and we might see additional uses of the version number later.
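The optimization being weighed here might look like the following hypothetical sketch (the helper name and the boolean block-metadata check are assumptions): on blocks written by pre-UTF-8 Prometheus, the raw UTF-8 spelling can never be stored, so it can be pruned from the broad lookup, while the escaped spellings remain possible in both cases.

```python
def candidate_names(name: str, block_written_by_utf8_capable: bool) -> list[str]:
    # Simplified underscore-replacement spelling; real code would
    # consider every escaping scheme clients may have used.
    underscored = "".join(
        c if (c.isascii() and (c.isalnum() or c in "_:")) else "_"
        for c in name
    )
    candidates = [underscored]
    # Only UTF-8-capable blocks (detected via the meta.json version
    # bump) can contain the raw UTF-8 name.
    if block_written_by_utf8_capable and name != underscored:
        candidates.append(name)
    return candidates
```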
This proposal expands upon the existing UTF-8 proposal, providing more detail into how we plan to handle the migration of metrics data, ensuring that it remains queryable during the transition period.
Signed-off-by: Owen Williams <owen.williams@grafana.com>
Signed-off-by: Owen Williams <owen.williams@grafana.com>
Change approach, use flags instead of trying to auto-detect mixed blocks. Signed-off-by: Owen Williams <owen.williams@grafana.com>
Signed-off-by: Owen Williams <owen.williams@grafana.com>
Signed-off-by: Owen Williams <owen.williams@grafana.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> Signed-off-by: Owen Williams <owen-github@ywwg.com> Signed-off-by: Owen Williams <owen.williams@grafana.com>
Signed-off-by: Owen Williams <owen.williams@grafana.com>
had to fix some commit signing, hopefully I didn't break anything
LGTM, thanks! 💪🏽
I assume the answer is NO, it's an optional backend feature and not part of the PromQL spec (:
Signed-off-by: Owen Williams <owen.williams@grafana.com>
I agree, I think this is an optional feature and not a requirement.
Thank you very much. I think this is substantially good to go. My comments are mostly about wordsmithing and clarifications.
All of these situations can be summarized as follows:

1. **Old Data** -- Data written with old Prometheus code: all names are guaranteed not to be UTF-8.
This sounds like it includes the case where an old Prometheus has ingested from new producers (and names might include escaped names).
Or is this referring to old Prometheus and old producers, so even escaping in names can be ruled out?
Could you clarify?
After having read through all the below, I would say what's meant here is "if at all, we will have escaped names, so we would never query for UTF-8 names, but only try out the specified escaping schemas".
Still, I think the difference to the mixed data case should be made clearer.
In principle, there might be a scenario where the user knows for sure that they won't have any escaping at all, for example if they had a pure Prometheus stack so far (no OTel etc.), but they would like to use UTF-8 names, so they still have a period of mixed versions deployed (new and old producers, new and old ingesters). I guess it's fine to not implement a specific optimization for that use case (which would be that we don't need a broad search for the old blocks at all), but it would be good if that case is described and that we disregard it as an informed decision.
Signed-off-by: Owen Williams <owen.williams@grafana.com>
all notes addressed, please let me know if I missed anything
merge ping?
LGTM. I assume all other reviewers are also happy at this point. Please follow up if not.
This proposal expands upon the existing UTF-8 proposal, providing more detail into how we plan to handle the migration of metrics data, ensuring that it remains queryable during the transition period. This is critical to a smooth migration of OTel data.