Replace regex with semver crate#1108
Conversation
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
| } | ||
| } | ||
| // Try to infer the manifest from the registry path by detecting a | ||
| // valid semantic version segment (e.g., v1.26.0) in the path. |
There was a problem hiding this comment.
I've changed the comment here since semver accepts more than just v\d+\.\d+\.\d+, e.g. release candidates
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1108 +/- ##
=====================================
Coverage 79.1% 79.1%
=====================================
Files 102 102
Lines 7943 7963 +20
=====================================
+ Hits 6284 6306 +22
+ Misses 1659 1657 -2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| /// | ||
| /// * `Some(String)` - The version string including the 'v' prefix (e.g., "v1.26.0") | ||
| /// * `None` - If no valid semantic version is found in the path | ||
| fn extract_semconv_version(path: &str) -> Option<String> { |
There was a problem hiding this comment.
My only concern here (and also one with the previous regex) is that the v is not required, see: https://github.com/open-telemetry/opentelemetry-specification/tree/main/specification/schemas#schema-url
There was a problem hiding this comment.
Hmmmm, gotcha. Reading the link you shared here, it also says that version is always the last part of the URL, so if I'm understanding correctly, I don't need to try to parse every segment and check if it's a valid semver... just the last segment is enough, right?
There was a problem hiding this comment.
I don't think the documentation on schema_url is relevant here, because in this context Weaver does not derive a version from a schema_url, but from a semantic conventions registry (local, Git, or archive). For recent registries, Weaver retrieves the version directly from the manifest file, but for older registries this manifest does not exist, and in that case the version is derived from the archive path, for example. See the code of the from_semconv_specs.
There was a problem hiding this comment.
I agree we need to be flexible. What I'm saying is that eventually I'd like for SchemaURL to work. Right now it's forcing the v as required, I think we should allow it as optional.
There was a problem hiding this comment.
I'm a bit lost here if in this PR we want to solve v being optional or not 😅
There was a problem hiding this comment.
If it wasn't optional before, you can leave it as is for now, I'll clean that up in a follow up.
For context - I'm working on new resolution specification where we can resolve "resolved schema" directly from a Schema URL - so I'll need to either merge in your PR and fix in my branch or have you fix it so my local fix doesn't get blown away.
I'd rather merge yours in and fix later if you're nervous about the amount of rust.
lquerel
left a comment
There was a problem hiding this comment.
We should change the tests to validate with the real archive URLs.
| /// | ||
| /// * `Some(String)` - The version string including the 'v' prefix (e.g., "v1.26.0") | ||
| /// * `None` - If no valid semantic version is found in the path | ||
| fn extract_semconv_version(path: &str) -> Option<String> { |
There was a problem hiding this comment.
I don't think the documentation on schema_url is relevant here, because in this context Weaver does not derive a version from a schema_url, but from a semantic conventions registry (local, Git, or archive). For recent registries, Weaver retrieves the version directly from the manifest file, but for older registries this manifest does not exist, and in that case the version is derived from the archive path, for example. See the code of the from_semconv_specs.
| // Test with GitHub release URL pattern | ||
| assert_eq!( | ||
| extract_semconv_version( | ||
| "https://github.com/open-telemetry/semantic-conventions/releases/download/v1.26.0/semantic-conventions.zip" |
There was a problem hiding this comment.
The SemConv archive URLs have the following format. We should test with them.
https://github.com/open-telemetry/semantic-conventions/archive/refs/tags/v1.38.0.zip
https://github.com/open-telemetry/semantic-conventions/archive/refs/tags/v1.38.0.tar.gz
There was a problem hiding this comment.
Do we have a list of file format suffixes that Weaver allows? Semver won't be able to parse those suffixes, so I believe I'll have to manually remove this suffixes before parsing the version
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
Fixes #579
Trying to learn Rust with some good first issues :)