Ambiguities and Priorities in 1.2.1 earley #1443

MaxOstrowski · 2024-08-13T16:50:02Z

Since lark version 1.2.1, (more?) ambiguities seem to be produced for our syntax.
The syntax is ambiguous, but we used priorities to resolve this before.

%import common.INT
%import common.FLOAT
?start: (condition"."?)?

...

unsigned_integer.1: INT
unsigned_float    : FLOAT

So whenever a condition ends with a number, like:
x = 5.
there are two interpretations: either the dot comes from the start rule and 5 is an INT, or the dot belongs to the FLOAT 5. and the optional dot from the start rule is omitted.
The priority in unsigned_integer.1: INT fixed this before.
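To make the two readings concrete, here is a small library-free sketch. It does not use lark and is not how Earley works internally; the regexes are simplified stand-ins for common.INT and common.FLOAT, and `readings`, `preferred` and `PRIORITY` are hypothetical names introduced only for this illustration.

```python
import re

# Simplified stand-ins for lark's common.INT and common.FLOAT (assumption:
# only the "digits, dot, optional digits" shape of FLOAT matters here).
RULES = (
    ("unsigned_integer", re.compile(r"\d+")),
    ("unsigned_float", re.compile(r"\d+\.\d*")),
)

# Mirrors the grammar: unsigned_integer has priority 1, unsigned_float the default 0.
PRIORITY = {"unsigned_integer": 1, "unsigned_float": 0}


def readings(tail: str) -> list[tuple[str, str, bool]]:
    """List every way to read `tail` as a number plus an optional final '.'.

    Each reading is (rule_name, number_text, uses_start_rule_dot).
    """
    found = []
    for end in range(1, len(tail) + 1):
        number, rest = tail[:end], tail[end:]
        if rest not in ("", "."):
            continue  # anything other than an optional trailing dot is not a parse
        for rule, pattern in RULES:
            if pattern.fullmatch(number):
                found.append((rule, number, rest == "."))
    return found


def preferred(tail: str) -> tuple[str, str, bool]:
    """Pick the reading whose rule has the highest priority."""
    return max(readings(tail), key=lambda reading: PRIORITY[reading[0]])


if __name__ == "__main__":
    for reading in readings("5."):
        print(reading)
    print("preferred:", preferred("5."))
```

For the input `5.` the sketch finds both readings, and the priority selects the one where 5 is an integer and the dot belongs to the start rule.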

Is it normal that we still get 2 different parse trees (earley parser)?
We implemented the following rule in the transformer to simply select the preferred AST:

    def _ambig(self, args: list[Any]) -> Any:
        """Take most preferred version."""
        return args[0]

but it feels like a big waste of performance that all possible ambiguities are produced.
Is there a way to enforce only producing the most preferred ast?

Might be related to #1441.


erezsh · 2024-08-13T16:55:41Z

Hi @MaxOstrowski ,

What arguments are you giving to Lark? Specifically as the "ambiguity" parameter?

MaxOstrowski · 2024-08-13T17:11:12Z

We tried all of them; none made any difference, so we thought we might have been using it wrong. For example:
Lark.open(file, ambiguity="resolve")

erezsh · 2024-08-13T17:41:28Z

Thank you for reporting this bug.

Looks like the issue is caused by a bugfix I included in v1.2.1.

I just created a PR with a fix: #1444

If you can test it, let me know if it fixed the issue.

I will probably make a new release later today.

erezsh · 2024-08-13T19:53:28Z

Okay, new version released - https://github.com/lark-parser/lark/releases/tag/1.2.2

It should be fixed now. Let me know otherwise.

Sorry for the inconvenience!

erezsh · 2024-08-13T20:04:16Z

Actually I will re-open. Perhaps your problem isn't entirely solved.

P.S. regarding the efficiency concerns - Earley must be aware of all the ambiguities in order to parse the text correctly. Lark stores the ambiguities in the SPPF structure, which is fairly efficient.

I think the final step can be rewritten a bit more efficiently, because we currently produce the trees even for the root solutions that we don't return. It just takes a bit of reshuffling, I'll try and do it tomorrow.

MaxOstrowski · 2024-08-14T11:34:42Z

Regarding our problem, with the newest version (1.2.2) we do not get any ambiguous solutions anymore. Thanks a lot!


MaxOstrowski added the question label Aug 13, 2024

erezsh mentioned this issue Aug 13, 2024

Bugfix: Earley now respects ambiguity='resolve' again. Bug was introd… #1444

Merged

erezsh closed this as completed Aug 13, 2024

erezsh reopened this Aug 13, 2024

erezsh mentioned this issue Aug 15, 2024

Bugfix Earley: only transform the solutions we yield #1447

Closed

mihailefter added a commit to mutalyzer/hgvs-parser that referenced this issue Aug 27, 2024

Pin lark version due to extra ambiguities in 1.2.1 and 1.2.2

f615827

May be related: - lark-parser/lark#1443 - lark-parser/lark#1451 - lark-parser/lark#1456
