🐛Fix "literal not terminated" error for markers with unpaired backticks in Pattern #1376
🐛Fix "literal not terminated" error for markers with unpaired backticks in Pattern #1376dongjiang1989 wants to merge 3 commits into
Conversation
Signed-off-by: dongjiang <dongjiang1989@126.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: dongjiang1989 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
| // report "literal not terminated". To avoid that error for annotations that may | ||
| // include unpaired backticks, only enable ScanRawStrings when the number of | ||
| // backticks is even (i.e. balanced). | ||
| if strings.Count(raw, "`")%2 == 0 { |
There was a problem hiding this comment.
But doesn't that mean we will always allow unpaired backticks, even in situations where that is incorrect?
There was a problem hiding this comment.
Thanks @alvaroaleman
This is how the current PR is implemented.
Should we support such unpaired backticks, or throw an error directly?
Personally, I think scenarios with unpaired backticks do exist. WDYT?
camilamacedo86
left a comment
There was a problem hiding this comment.
Thanks for working on this !! Really cool 🎉
I have a concern about the backtick counting approach. I don’t think it’s very reliable, because counting even/odd backticks doesn’t actually tell us whether they are properly paired.
And I added some comments I think we can refine this one a little better to ensure that we properly address the scenario.
| @@ -812,7 +812,20 @@ func (d *Definition) loadFields() error { | |||
| func parserScanner(raw string, err func(*sc.Scanner, string)) *sc.Scanner { | |||
There was a problem hiding this comment.
I have some concerns about this approach, see:
This has two backticks (even), so the heuristic enables ScanRawStrings, but these are just literal characters inside the regex — not a real raw string. This can lead to incorrect parsing or different errors. The main issue is that the heuristic can't distinguish between:
- Actual raw strings:
^[a-z]+$(backticks as delimiters) - Literal backticks: ^[
a-z]+$ (backticks as regex characters)
Example:
```go
// Pattern: ^[`a-zA-Z`]+$
// Count: 2 backticks (even) the ScanRawStrings enabled
// Scanner tries to parse: `a-zA-Z` as a raw string
So, It seems that we are not solving the issue right?
(Proposed Solution) Therefore, WDYT:
Instead of guessing intent by counting, we can use error recovery to let the
scanner tell us what works:
- Try parsing with
ScanRawStringsenabled - If we get a “literal not terminated” error, retry without it
Something like
```go
func (d *Definition) Parse(rawMarker string) (any, error) {
name, anonName, fields := splitMarker(rawMarker)
// Try with ScanRawStrings enabled (supports raw strings like `pattern`)
result, err := d.parseWithScanner(name, anonName, fields, true)
// Retry without ScanRawStrings if we get "literal not terminated"
if err != nil && strings.Contains(err.Error(), "literal not terminated") {
result, err = d.parseWithScanner(name, anonName, fields, false)
}
return result, err
}
This way we rely on the scanner itself and then I think we have here all addressed:
- Handles all cases correctly (paired, unpaired, quoted, etc.)
- It might make easier to understand how it works ( improve readability )
- I think it will be probably more robust than a heuristic
There was a problem hiding this comment.
Thanks @camilamacedo86
Please re-check it
Change:
- parserScanner — replaced the backtick-counting heuristic with a simple rawStrings bool parameter
- Parse + parse — extracted parsing logic into parse(rawMarker, rawStrings) helper; Parse now tries with rawStrings=true first, then retries with false if "literal not
terminated" error occurs - hasTerminatedErr — new helper to detect the specific error that triggers fallback
- guessType — updated two internal parserScanner calls to pass true (preserves original behavior)
Signed-off-by: dongjiang <dongjiang1989@126.com>
Signed-off-by: dongjiang <dongjiang1989@126.com>
|
Thanks @camilamacedo86 Please re-check it |
camilamacedo86
left a comment
There was a problem hiding this comment.
That is amazing 🎉
Great job.
I just add a few nit comments
After those I am happy to LGTM / approve
| @@ -0,0 +1,150 @@ | |||
| /* | |||
There was a problem hiding this comment.
I think we could add a test like
It("should handle full marker parsing with backtick retry", func() {
// Test the complete Parse() flow, not just parserScanner
registry := &Registry{}
def := Must(MakeDefinition("kubebuilder:validation:Pattern", DescribesField, ""))
registry.Register(def)
// This should trigger the retry mechanism internally
result, err := def.Parse("+kubebuilder:validation:Pattern=^[`a-zA-Z]+$")
Expect(err).NotTo(HaveOccurred())
Expect(result).NotTo(BeNil())
})
That would check both
Could you please supplement?
| @@ -826,6 +830,25 @@ type markerParser interface { | |||
| // raw marker in the form `+a:b:c=arg,d=arg` into an output object of the | |||
| // type specified in the definition. | |||
| func (d *Definition) Parse(rawMarker string) (any, error) { | |||
There was a problem hiding this comment.
I think we need improve the description of it now, such as:
// Parse uses the type information in this Definition to parse the given
// raw marker in the form `+a:b:c=arg,d=arg` into an output object of the
// type specified in the definition.
//
// Parse attempts to support both raw strings (backtick-delimited) and literal
// backtick characters by retrying without raw string support if parsing fails
// with "literal not terminated" error.
| // may be literal characters (e.g. inside a regex character class), not raw string | ||
| // delimiters. | ||
| if hasTerminatedErr(errs) { | ||
| result, errs = d.parse(rawMarker, false) |
There was a problem hiding this comment.
What about we return the error here when fails, such as
if hasTerminatedErr(errs) {
result, retryErr := d.parse(rawMarker, false)
if retryErr != nil {
return result, fmt.Errorf("parse failed (tried with and without raw strings): %w",
retryErr)
}
return result, nil
}
| if err == nil { | ||
| return false | ||
| } | ||
| return strings.Contains(err.Error(), "literal not terminated") |
There was a problem hiding this comment.
I think we can improve here such as
// Check for the specific scanner error, not just string matching
errStr := err.Error()
return strings.Contains(errStr, "literal not terminated") ||
strings.Contains(errStr, "string not terminated")
Summary
Fixes the scanner error "literal not terminated" when parsing markers that contain an unpaired backtick in their pattern values, such as:
// +kubebuilder:validation:items:Pattern=^[`a-zA-Z]+$Closes: #1084
Changes
pkg/markers/parse.go- Modified parserScanner function to:-
parserScanner— replaced the backtick-counting heuristic with a simple rawStrings bool parameter-
Parse + parsefunction — extracted parsing logic into parse(rawMarker, rawStrings) helper; Parse now tries with rawStrings=true first, then retries with false if "literal notterminated" error occurs
- hasTerminatedErr — new helper to detect the specific error that triggers fallback
- guessType — updated two internal parserScanner calls to pass true (preserves original behavior)
pkg/markers/parse_internal_test.go( add new file) - Added test cases covering:- Updated all parserScanner calls to include the new rawStrings parameter
- Fixed type mismatch in assertions (uint vs int)
- Added test case for two literal backticks inside regex character class (addresses camilamacedo86's concern)
- Removed misleading "balanced" terminology from comments