Using ANTLR4 to report parsing errors #1207

bugarela · 2023-10-05T18:01:13Z

bugarela
Oct 5, 2023
Maintainer

I learned some new things about ANTLR today, and I'm disappointed with some of its limitations on error reporting. I'm therefore inclined to stop trying to use ANTLR for error reporting and, instead, detect errors at the ToIrListener level and report them with a more flexible infrastructure. I'd like to hear your opinions on this, especially @shonfeder's, who is often a defender of keeping parsing responsibilities in the parser.

I found 2 problems:

Problem 1: error range

The offendingSymbol of an ANTLR error is always a single Token, so I cannot highlight the full extent of an error that spans more than one token. For example, this is what I'm trying to produce:

/home/gabriela/projects/quint/local/test.qnt:49:16 - error: QNT009: missing parameters in definition to foo. If there are no parameters, you should omit the parentheses.
49:   pure def foo(): bool = {
                  ^^

But the best I can get is:

49:   pure def foo(): bool = {
                   ^

This is the rule to match and report this error:

LPAREN RPAREN (':' type)? {
  const m = "QNT009: missing parameters in definition to " + $normalCallName.text + ". If there are no parameters, you should omit the parentheses."
  this.notifyErrorListeners(m, $RPAREN, undefined)
}

I tried assigning LPAREN RPAREN to a rule called emptyParenthesis, but notifyErrorListeners accepts only a token, not a rule. I spent some time googling and asking ChatGPT for solutions, and found nothing better than fixing the range afterwards, in the listener itself. For example, in this case we could do "if the message starts with QNT009, highlight one more character" - but that sucks.

I really appreciate proper highlights for errors, as I personally try to guess the error from the highlight even before reading the message.

Even in the existing errors we have, this problem is present. See the example in #1206 (comment)
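
For comparison, here is a minimal sketch of the range computation that becomes possible once both tokens are available, as they would be in a listener-level check. The `TokenLike` and `ErrorRange` interfaces and both function names are hypothetical, introduced only for illustration; the field names `line` and `charPositionInLine` mirror what ANTLR tokens expose, but the real runtime types are not used here.

```typescript
// Minimal token shape for this sketch. This is a local stand-in (an assumption),
// not the ANTLR runtime's Token type.
interface TokenLike {
  line: number // 1-based line number
  charPositionInLine: number // 0-based column of the token's first character
  text: string
}

// Hypothetical range type; `end` is inclusive.
interface ErrorRange {
  start: { line: number; col: number }
  end: { line: number; col: number }
}

// Span from the first character of `first` to the last character of `last`.
// Assumes neither token contains a line break.
function rangeOfTokens(first: TokenLike, last: TokenLike): ErrorRange {
  const lastWidth = Math.max(last.text.length, 1)
  return {
    start: { line: first.line, col: first.charPositionInLine },
    end: { line: last.line, col: last.charPositionInLine + lastWidth - 1 },
  }
}

// Render the caret line for a range that sits on a single source line.
function underline(range: ErrorRange): string {
  const width = range.end.col - range.start.col + 1
  return ' '.repeat(range.start.col) + '^'.repeat(width)
}

// The two parentheses from the example: `  pure def foo(): bool = {`
const lparen: TokenLike = { line: 49, charPositionInLine: 14, text: '(' }
const rparen: TokenLike = { line: 49, charPositionInLine: 15, text: ')' }
const parenRange = rangeOfTokens(lparen, rparen)

console.log('  pure def foo(): bool = {')
console.log(underline(parenRange))
```

With only `$RPAREN` available, as in the grammar action above, the same helper can produce just the single-caret highlight.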

Problem 2: lack of metadata

I also couldn't find a way to attach an extra payload to the error, beyond the message string. I wanted a payload so we could carry fix information. For example, alongside the "you should omit the parentheses" tip, we could include a fix (replace "()" with "") to be offered as a quick fix in VS Code, like we do with modes:

quint/quint/src/effects/modeChecker.ts

Lines 75 to 84 in 7df9ffa

    
this.errors.set(def.id, {
  code: 'QNT200',
  message: `${qualifierToString(def.qualifier)} operators ${modeConstraint(def.qualifier)}, but operator \`${
    def.name
  }\` ${explanation}. Use ${qualifierToString(mode)} instead.`,
  reference: def.id,
  data: {
    fix: { kind: 'replace', original: qualifierToString(def.qualifier), replacement: qualifierToString(mode) },
  },
})

If we want to offer nice features like this, we'd have to extract all the information from the message string.
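
To make that cost concrete, here is a rough sketch of what recovering structure from the message string might look like. Everything in it is hypothetical: the `StructuredError` shape, the `fixesByCode` table, and the `fromAntlrMessage` function are illustrative names, and the only parts taken from this post are the `QNT009` message format and the `fix` payload shape from the modeChecker.ts excerpt.

```typescript
// Fix payload, shaped like the `data.fix` object in the modeChecker.ts excerpt.
interface Fix {
  kind: 'replace'
  original: string
  replacement: string
}

// Hypothetical structured error; `code` is absent when the message has no QNT prefix.
interface StructuredError {
  code?: string
  message: string
  data?: { fix: Fix }
}

// Hypothetical lookup table. Because the parser can only hand over a string,
// any fix has to be re-attached afterwards by matching on the error code.
const fixesByCode = new Map<string, Fix>([
  ['QNT009', { kind: 'replace', original: '()', replacement: '' }],
])

// Split a "QNTnnn: text" message back into its parts and attach a known fix, if any.
function fromAntlrMessage(raw: string): StructuredError {
  const sep = raw.indexOf(': ')
  const code = sep >= 0 ? raw.slice(0, sep) : ''
  if (!/^QNT\d+$/.test(code)) {
    // Not one of our coded messages: pass the text through untouched.
    return { message: raw }
  }
  const message = raw.slice(sep + 2)
  const fix = fixesByCode.get(code)
  return fix === undefined ? { code, message } : { code, message, data: { fix } }
}

const parsed = fromAntlrMessage(
  'QNT009: missing parameters in definition to foo. If there are no parameters, you should omit the parentheses.'
)
console.log(parsed.code, parsed.data)
```

Note that the fix here is a fixed entry per error code. A fix that depends on the offending source text, like the mode fix above, would need even more information squeezed into and parsed out of the message.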

Discussion question

Should we give up on improving error messages inside ANTLR and, instead, do that in the ToIrListener where we can provide all the data and ranges we need?

konnov · 2023-10-10T18:44:34Z

konnov
Oct 10, 2023

It's true that error reporting is not easy in ANTLR4. On the other hand, it is too late to detect some of the syntax errors in ToIrListener. If we were able to detect some errors there, that would mean the parser was too permissive, that is, it would admit more token sequences than it should.


konnov · 2023-10-10T19:00:30Z

konnov
Oct 10, 2023

Looking at "The Definitive ANTLR 4 Reference" (ch. 9), there is an option to define your own error listener by extending BaseErrorListener. It seems to be quite powerful, e.g., you could find the previous occurrence of { there.
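
As a rough illustration of that idea, the sketch below shows a listener that walks back through the token stream to the previous `{` and starts the reported range there. It is not written against the real runtime: `Tok`, `TokStream`, and `CollectingErrorListener` are hypothetical stand-ins, and the `syntaxError` method here takes only the offending token and the message, whereas the actual ANTLR callback also receives the recognizer, line, column, and exception. The book describes the Java runtime, so the TypeScript equivalent may differ in detail.

```typescript
// Local stand-ins for the runtime's token and token-stream types (assumptions).
interface Tok {
  tokenIndex: number // position of the token in the stream
  line: number // 1-based
  charPositionInLine: number // 0-based
  text: string
}

interface TokStream {
  get(i: number): Tok | undefined
}

interface CollectedError {
  message: string
  start: { line: number; col: number }
  end: { line: number; col: number } // inclusive
}

// Hypothetical listener that records errors instead of printing them.
class CollectingErrorListener {
  readonly errors: CollectedError[] = []
  private readonly tokens: TokStream

  constructor(tokens: TokStream) {
    this.tokens = tokens
  }

  // Simplified version of the listener callback: offending token plus message.
  syntaxError(offending: Tok, msg: string): void {
    // Start the range at the nearest preceding `{`, if there is one.
    const opener = this.previousToken(offending.tokenIndex, '{')
    const startTok = opener === undefined ? offending : opener
    const width = Math.max(offending.text.length, 1)
    this.errors.push({
      message: msg,
      start: { line: startTok.line, col: startTok.charPositionInLine },
      end: { line: offending.line, col: offending.charPositionInLine + width - 1 },
    })
  }

  // Scan backwards from `from` for a token with the given text.
  private previousToken(from: number, text: string): Tok | undefined {
    for (let i = from - 1; i >= 0; i--) {
      const t = this.tokens.get(i)
      if (t !== undefined && t.text === text) return t
    }
    return undefined
  }
}

// Tokens for the input `{ x )`, where `)` is the offending token.
const demoTokens: Tok[] = [
  { tokenIndex: 0, line: 1, charPositionInLine: 0, text: '{' },
  { tokenIndex: 1, line: 1, charPositionInLine: 2, text: 'x' },
  { tokenIndex: 2, line: 1, charPositionInLine: 4, text: ')' },
]
const demoListener = new CollectingErrorListener({ get: (i) => demoTokens[i] })
const offendingTok = demoTokens[2]
if (offendingTok !== undefined) {
  demoListener.syntaxError(offendingTok, "extraneous input ')'")
}
console.log(demoListener.errors)
```

This widens the range, but the error still carries only a message string, so it addresses Problem 1 rather than Problem 2.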
