-
-
Notifications
You must be signed in to change notification settings - Fork 414
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix #696 now providing the correct amount of placeholders #922
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a lot shorter of a PR than I expected. Good job!
lark/load_grammar.py
Outdated
@@ -231,12 +231,11 @@ def will_not_get_removed(sym): | |||
return not sym.name.startswith('_') | |||
if isinstance(sym, Terminal): | |||
return keep_all_tokens or not sym.filter_out | |||
assert False | |||
if isinstance(sym, Symbol): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be needed. There are only two Symbols subclasses: NonTerminal
and Terminal
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now _EMPTY
is a Symbol, and isn't a terminal or nonterminal.
But it probably makes more sense to test is _EMPTY
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea testing for is _EMPTY
would make more sense. i went with isinstance(sym, Symbol)
because Symbol
is only ever called directly when defining _EMPTY
, but it's less clear and less precise than testing for is _EMPTY
.
@@ -231,12 +231,11 @@ def will_not_get_removed(sym): | |||
return not sym.name.startswith('_') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to think about changing this behavior. But I am not sure if there is a better solution...
@MegaIng I'm surprised you aren't concerned with breaking backwards-compatibility. I think I'll have to add a new value for this behavior. Maybe |
@erezsh True. I didn't consider it. This is obviously breaking. We can add a new option, or push it back till 1.0 . (on the other hand, I find the current system so confusing, we might as well break in the next 0.x.1 release.) |
Yes, definitely. I think |
@ThatXliner Can you explain your reasoning? The issue is that a match and a mismatch may provide a different amount of arguments. |
Well the docs said that
Yeah I saw. You originally thought of it being the right behavior, right? The point is, I agree that we should change it to |
@ThatXliner Our current plan is to add this change with a few other in a very breaking version 1.0 . |
On the issue of changing behavior, you might want to replace the |
Oh ok. In that case, you might want to update the docs, as well. |
That's a great idea! thanks for the suggestion. |
P.S. the question is, should we even support the old behavior in 1.x, or just keep the new one? |
Ok, so I here's an example of using the old behavior for conciseness: https://github.com/ThatXliner/tylaireum/blob/4ae6772a0660cb75c307edc7d2bdff94667b8cdc/tylaireum/grammar.lark#L59 If I wanted to be compatible with the new behavior, I'd need to make another rule. (the |
@ThatXliner Maybe you misunderstood. |
Oh right. Do terminals starting with |
i've encountered a usecase where my fix doesn't suffice. p = Lark("""!start: ["a"] | ["b"] ["c"] """, maybe_placeholders=True)
p.parse("").children should return i've taken a look at what's causing this, and from what i can tell this looks like the fix is gonna be much more complex than my previous one. i haven't had the time to dive into it, and it's likely i'm gonna have some trouble with it. |
I disagree. While 2 children is the maximal account that can occur, it can also return just one (just This is also especially none-bad since a simple rewrite of the rule to |
(i edited my previous comment, the outer |
@ornariece I'm inclined to rebase this PR over v1.0 instead, and then merge it. Any objections? If you want it to get to the current master, we can try the |
fine by me! |
another example where the current fix doesn't suffice, this time the grammar should be minimal: one workaround would be to add redundancy to the rule: |
@ornariece I would say that's the intended behavior. In a rule like Perhaps one could argue that this balancing should happen automatically by adding Nones or some other way. But that's a whole separate issue in Lark, and not related to this specific fix. |
hmmm true, i didn't think of it as a separate behavior. one could argue |
The feel free to make the argument, because I think it's not too hard to implement. |
well the thing is you made it so that only items enclosed in |
If you want |
the only assumption i'm expecting is "if the user set |
ok no this is not the only assumption, since you'd still want |
If is the user responsibility, make them express it in the grammar. Also, what about |
i agree with that
good point. the same issue arises for |
Thanks, and sorry it took so long! |
@ornariece No. What if |
well same issue as with |
now returns
[None, None, None]
instead of[None]
same for
!start: ["a" ["b" "c"]]
.