Interegular integration, allowing checks for intersections between regexes. #715

MegaIng · 2020-10-07T19:21:58Z

Fixes #76.

This is a first attempt.

You can install interegular via pip: `pip install interegular`. If you want, you can also take a look at the interegular source code.

erezsh · 2020-10-09T07:30:41Z

A few notes:

1. Basically, this is the real functional code (which resides in interegular):

        for a, b in combinations(keys, 2):
            if not self.isdisjoint(a, b):
                yield a, b

   I see you already cache the FSMs, so all that's required to prevent duplicate work is to call `memoized_disjoint` instead.

2. Why does lark have to call `mark` and `is_marked`? That seems unnecessary. If you want to avoid duplicate warnings, just keep a set.
3. Why compare the regexps (expensive) and only afterwards check `a.priority == b.priority`? This should be the first test; otherwise, comparing them with interegular is pointless.
4. `skip_validation` means you never use the comparator. So why even create it in the first place?
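
To make the ordering in these notes concrete, here is a minimal sketch and not the PR's actual code. The function name `find_collisions`, the `(name, pattern, priority)` tuple shape, and the `overlaps` callback are all hypothetical; `overlaps` stands in for whatever expensive regex-intersection test is used. It returns early when validation is skipped, tests priorities before patterns, and keeps a plain set of already-reported pairs.

```python
from itertools import combinations


def find_collisions(terminals, overlaps, skip_validation=False, reported=None):
    """Yield each colliding pair of terminal names at most once.

    terminals: list of (name, pattern, priority) tuples (hypothetical shape).
    overlaps:  callable(pattern_a, pattern_b) -> bool; a stand-in for the
               expensive regex-intersection test.
    reported:  optional set shared between calls to suppress repeat warnings.
    """
    if skip_validation:
        return  # validation is off: do no comparison work at all
    if reported is None:
        reported = set()
    for (name_a, pat_a, prio_a), (name_b, pat_b, prio_b) in combinations(terminals, 2):
        if prio_a != prio_b:
            continue  # cheap check first: skip pairs with different priorities
        pair = frozenset((name_a, name_b))
        if pair in reported:
            continue  # already warned about this pair
        if overlaps(pat_a, pat_b):  # expensive check last
            reported.add(pair)
            yield name_a, name_b


# Toy predicate: two "patterns" collide if they share a first character.
terminals = [("A", "ab", 0), ("B", "ac", 0), ("C", "ad", 1)]
print(list(find_collisions(terminals, lambda p, q: p[0] == q[0])))
# → [('A', 'B')]
```

Passing the same `reported` set to later calls is one way to avoid repeating a warning without marking the terminals themselves.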

erezsh · 2020-10-10T16:41:17Z

P.S. re point 3, you can do something like `classify(regexps, lambda r: r.priority)` to get all the subgroups that should be tested together.
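
As a rough illustration of that grouping idea, the sketch below defines its own small `classify` helper rather than importing one, so its signature is an assumption and may differ from the helper referred to above. `Terminal` and `pairs_to_test` are hypothetical names used only for this example.

```python
from collections import namedtuple
from itertools import combinations

Terminal = namedtuple("Terminal", "name pattern priority")  # hypothetical shape


def classify(seq, key):
    """Group the items of seq into a dict of lists, keyed by key(item)."""
    groups = {}
    for item in seq:
        groups.setdefault(key(item), []).append(item)
    return groups


def pairs_to_test(terminals):
    """Yield only the pairs of terminals that share a priority."""
    for group in classify(terminals, lambda t: t.priority).values():
        yield from combinations(group, 2)
```

Only pairs drawn from the same priority group reach the expensive comparison, so the priority test never has to run per pair.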

The previous version used `_testlist_comp`, which allowed for either one `test` or two or more `test_or_star_expr`. This version allows a list with one `star_expr`, which is valid both in Python and in the official Python grammar. Moreover, it merges the rules used in set and list (since those terminals differ only in one thing: a set literal cannot be empty).

Updated Python grammar list literal to support `[*x]`
MegaIng · 2023-03-02T16:18:08Z

Well, this is now a messed up git history.

erezsh and others added 28 commits October 19, 2021 12:08

A bit of cleanup, improve test coverage

4c1cfb2

Refactored Discard() exception into an object.

6bb9e8a

Discard now a value. Fix docs

586c9e0

More cleanup

5471f88

Discard: Use instance singleton instead of object()

9a33716

Merge pull request lark-parser#1020 from lark-parser/cleanup_oct2021

0953036

Fix indentation

2ba5831

Merge pull request lark-parser#1021 from lark-parser/return_discard

c5581d8

Create CNAME

30bfedb

Delete CNAME

0667d0a

Docs: Changed IDE url to lark-parser.org

7c17739

Create python.lark

553eb41

Create test_python_grammar.py

518bd3a

Update __main__.py

Merge branch '0dminnimda-Fix-python.number'

2ec9636

Version bump: 1.0

4181e47

Fixed docs ib maybe_placeholders

04ce64b

Just a minor change

Merge pull request lark-parser#1034 from ThatXliner/patch-4

e453803

Added pytest.ini

f2e2413

Removed a few deprecation warnings

98cd022

Merge pull request lark-parser#1035 from lark-parser/issue968

6f65325

Fix for Transformer.__default__ not called in tree-less LALR mode (Issue lark-parser#1029)

413616b

Fix for previous commit

75ba6b7

Update json_tutorial.md

099c9a6

Added lark syntax highlighting and a few tiny changes.

Remove "Python 2 & 3 compatible"

ab3f77e

Since 1.0 isn't Python 2 compatible (according to Reddit post) which makes "& 3" redundant too :)

Merge pull request lark-parser#1039 from Nightblade/patch-1

6c13384

Updated README.md

538eda7

Update Features page

5f52781

Fix typo: instanciated -> instantiated

57d6460

jmishra01 and others added 27 commits December 5, 2022 07:16

Use generator instead of list expand or add method

0b4280e

Generator is memory efficient approach.

Change some "yield"'s into "yield from"

d76af7c

Also improve performance of "iter_subtrees_topdown" Performance of "iter_subtrees_topdown" method reduce as size of tree increases. Using instance of list method improve the performance.

Use f-string

652b92c

Merge pull request lark-parser#1224 from lark-parser/dec3_ip_last_token

b962bb6

Fix EOF line information in InteractiveParser.resume_parse()

Merge pull request lark-parser#1225 from jmishra01/memory-efficient

c3b2996

Use generator instead of list expand or add method

Fix examples: Remove extend-python example (outdated); other fixes.

031cadd

Version bump (1.1.5)

7d9cfa6

[M:lark.lark & M:load_grammar.py] support for Python-style comments in grammars

b51726c

Improve logic and performance

2a0bfe3

use fromkeys

Merge pull request lark-parser#1228 from jmishra01/lark-utils

ed2bd92

Improve logic and performance

Merge pull request lark-parser#1232 from evtn/patch-1

f3d7904

Updated Python grammar list literal to support `[*x]`

Fix typos

bf03721

Found via `codespell -L nd,iif,ot,datas`

Merge pull request lark-parser#1242 from kianmeng/fix-typos

a194a50

Fix typos

Examples: Update version for PyQt5

386544f

Merge pull request lark-parser#1243 from lark-parser/jan29

6af71e2

Examples: Update version for PyQt5

Merge pull request lark-parser#1230 from vincent-hugot/master

223af36

Support for Python-style comments in Lark grammar

[M:grammar.md] doc: added Python-style comments.

6aaa01a

[M:grammar.md] mention version

30b6982

Merge pull request lark-parser#1245 from vincent-hugot/doc_comments

c9b697e

[M:grammar.md] doc: added Python-style comments.

Add .raw to be serialized in PatternRE and PatternSTR so that error messages can still be displayed

e153825

Merge branch 'lark-parser:master' into fix-1154

0c06327

Added test for error messages in caching

f845b2e

Merge pull request lark-parser#1252 from MegaIng/fix-1154

2564232

Fix 1154

Added interegular integration, allowing checks for intersection between regexes.

0b330b7

Fix standalone parser

31b34a8

Merge remote-tracking branch 'origin/interegular-integration' into interegular-integration

56ec90d

Conflicts: lark/lexer.py, lark/load_grammar.py, setup.py

MegaIng closed this Mar 2, 2023

MegaIng mentioned this pull request Mar 2, 2023

Added interegular support #1258

Merged
