Skip to content

🐛Fix "literal not terminated" error for markers with unpaired backticks in Pattern #1376

Open
dongjiang1989 wants to merge 3 commits into
kubernetes-sigs:mainfrom
dongjiang1989:update-pattern
Open

🐛Fix "literal not terminated" error for markers with unpaired backticks in Pattern #1376
dongjiang1989 wants to merge 3 commits into
kubernetes-sigs:mainfrom
dongjiang1989:update-pattern

Conversation

@dongjiang1989
Copy link
Copy Markdown
Member

@dongjiang1989 dongjiang1989 commented Apr 9, 2026

Summary

Fixes the scanner error "literal not terminated" when parsing markers that contain an unpaired backtick in their pattern values, such as:

  // +kubebuilder:validation:items:Pattern=^[`a-zA-Z]+$                                                                                                      

Closes: #1084

Changes

  1. pkg/markers/parse.go - Modified parserScanner function to:
    - parserScanner — replaced the backtick-counting heuristic with a simple rawStrings bool parameter
    - Parse + parse function — extracted parsing logic into parse(rawMarker, rawStrings) helper; Parse now tries with rawStrings=true first, then retries with false if "literal not
    terminated" error occurs
    - hasTerminatedErr — new helper to detect the specific error that triggers fallback
    - guessType — updated two internal parserScanner calls to pass true (preserves original behavior)

  2. pkg/markers/parse_internal_test.go ( add new file) - Added test cases covering:
    - Updated all parserScanner calls to include the new rawStrings parameter
    - Fixed type mismatch in assertions (uint vs int)
    - Added test case for two literal backticks inside regex character class (addresses camilamacedo86's concern)
    - Removed misleading "balanced" terminology from comments

Signed-off-by: dongjiang <dongjiang1989@126.com>
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: dongjiang1989
Once this PR has been reviewed and has the lgtm label, please assign joelspeed for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Apr 9, 2026
@dongjiang1989 dongjiang1989 changed the title Fix "literal not terminated" error for markers with unpaired backticks in Pattern 🐛Fix "literal not terminated" error for markers with unpaired backticks in Pattern Apr 9, 2026
Comment thread pkg/markers/parse.go Outdated
// report "literal not terminated". To avoid that error for annotations that may
// include unpaired backticks, only enable ScanRawStrings when the number of
// backticks is even (i.e. balanced).
if strings.Count(raw, "`")%2 == 0 {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But doesn't that mean we will always allow unpaired backticks, even in situations where that is incorrect?

Copy link
Copy Markdown
Member Author

@dongjiang1989 dongjiang1989 Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @alvaroaleman

This is how the current PR is implemented.
Should we support such unpaired backticks, or throw an error directly?
Personally, I think scenarios with unpaired backticks do exist. WDYT?

Copy link
Copy Markdown
Member

@camilamacedo86 camilamacedo86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this !! Really cool 🎉

I have a concern about the backtick counting approach. I don’t think it’s very reliable, because counting even/odd backticks doesn’t actually tell us whether they are properly paired.

And I added some comments I think we can refine this one a little better to ensure that we properly address the scenario.

Comment thread pkg/markers/parse.go Outdated
Comment thread pkg/markers/parse_internal_test.go Outdated
Comment thread pkg/markers/parse_internal_test.go Outdated
Comment thread pkg/markers/parse_internal_test.go Outdated
Comment thread pkg/markers/parse_internal_test.go
Comment thread pkg/markers/parse.go Outdated
@@ -812,7 +812,20 @@ func (d *Definition) loadFields() error {
func parserScanner(raw string, err func(*sc.Scanner, string)) *sc.Scanner {
Copy link
Copy Markdown
Member

@camilamacedo86 camilamacedo86 Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some concerns about this approach, see:

This has two backticks (even), so the heuristic enables ScanRawStrings, but these are just literal characters inside the regex — not a real raw string. This can lead to incorrect parsing or different errors. The main issue is that the heuristic can't distinguish between:

  • Actual raw strings: ^[a-z]+$ (backticks as delimiters)
  • Literal backticks: ^[a-z]+$ (backticks as regex characters)

Example:

  ```go                                                                         
  // Pattern: ^[`a-zA-Z`]+$                                 
  // Count: 2 backticks (even) the ScanRawStrings enabled                         
  // Scanner tries to parse: `a-zA-Z` as a raw string 

So, It seems that we are not solving the issue right?

(Proposed Solution) Therefore, WDYT:

Instead of guessing intent by counting, we can use error recovery to let the
scanner tell us what works:

  1. Try parsing with ScanRawStrings enabled
  2. If we get a “literal not terminated” error, retry without it

Something like

```go
  func (d *Definition) Parse(rawMarker string) (any, error) {
      name, anonName, fields := splitMarker(rawMarker)                          
                                                                                
      // Try with ScanRawStrings enabled (supports raw strings like `pattern`)  
      result, err := d.parseWithScanner(name, anonName, fields, true)           
                                                                                
      // Retry without ScanRawStrings if we get "literal not terminated"
      if err != nil && strings.Contains(err.Error(), "literal not terminated") {
          result, err = d.parseWithScanner(name, anonName, fields, false)       
      }
                                                                                
      return result, err                                    
  }

This way we rely on the scanner itself and then I think we have here all addressed:

  • Handles all cases correctly (paired, unpaired, quoted, etc.)
  • It might make easier to understand how it works ( improve readability )
  • I think it will be probably more robust than a heuristic

Copy link
Copy Markdown
Member Author

@dongjiang1989 dongjiang1989 Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @camilamacedo86

Please re-check it

Change:

  1. parserScanner — replaced the backtick-counting heuristic with a simple rawStrings bool parameter
  2. Parse + parse — extracted parsing logic into parse(rawMarker, rawStrings) helper; Parse now tries with rawStrings=true first, then retries with false if "literal not
    terminated" error occurs
  3. hasTerminatedErr — new helper to detect the specific error that triggers fallback
  4. guessType — updated two internal parserScanner calls to pass true (preserves original behavior)

Signed-off-by: dongjiang <dongjiang1989@126.com>
Signed-off-by: dongjiang <dongjiang1989@126.com>
@dongjiang1989
Copy link
Copy Markdown
Member Author

Thanks @camilamacedo86 Please re-check it

Copy link
Copy Markdown
Member

@camilamacedo86 camilamacedo86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is amazing 🎉
Great job.

I just add a few nit comments
After those I am happy to LGTM / approve

@@ -0,0 +1,150 @@
/*
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could add a test like

It("should handle full marker parsing with backtick retry", func() {
      // Test the complete Parse() flow, not just parserScanner                                 
      registry := &Registry{}                                                                   
      def := Must(MakeDefinition("kubebuilder:validation:Pattern", DescribesField, ""))         
      registry.Register(def)                                                                    
                                                                                                
      // This should trigger the retry mechanism internally                                     
      result, err := def.Parse("+kubebuilder:validation:Pattern=^[`a-zA-Z]+$")                  
      Expect(err).NotTo(HaveOccurred())                                                         
      Expect(result).NotTo(BeNil())                                                             
  })              

That would check both
Could you please supplement?

Comment thread pkg/markers/parse.go
@@ -826,6 +830,25 @@ type markerParser interface {
// raw marker in the form `+a:b:c=arg,d=arg` into an output object of the
// type specified in the definition.
func (d *Definition) Parse(rawMarker string) (any, error) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need improve the description of it now, such as:

  // Parse uses the type information in this Definition to parse the given
  // raw marker in the form `+a:b:c=arg,d=arg` into an output object of the                     
  // type specified in the definition.                                                          
  //                                                                                            
  // Parse attempts to support both raw strings (backtick-delimited) and literal                
  // backtick characters by retrying without raw string support if parsing fails                
  // with "literal not terminated" error.

Comment thread pkg/markers/parse.go
// may be literal characters (e.g. inside a regex character class), not raw string
// delimiters.
if hasTerminatedErr(errs) {
result, errs = d.parse(rawMarker, false)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about we return the error here when fails, such as

  if hasTerminatedErr(errs) {                               
      result, retryErr := d.parse(rawMarker, false)                                             
      if retryErr != nil {                                                                      
          return result, fmt.Errorf("parse failed (tried with and without raw strings): %w",    
  retryErr)                                                                                     
      }                                                     
      return result, nil                                                                        
  } 

Comment thread pkg/markers/parse.go
if err == nil {
return false
}
return strings.Contains(err.Error(), "literal not terminated")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can improve here such as

// Check for the specific scanner error, not just string matching
      errStr := err.Error()                                                                     
      return strings.Contains(errStr, "literal not terminated") ||
             strings.Contains(errStr, "string not terminated") 

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Generating manifest with backtick in pattern

4 participants