Merged
123 commits
4ceeb24
feat: add preprocessing based parser
jnoortheen Apr 14, 2023
bf41839
test: update sample test
jnoortheen Apr 14, 2023
159c018
test: rename basic sanity tests
jnoortheen Apr 15, 2023
bd3553a
test: add test files from xonsh repo
jnoortheen Apr 15, 2023
6546aef
feat: add tests from xonsh repo
jnoortheen Apr 19, 2023
44d6df3
feat: support to writing to Python lr-tables
jnoortheen Mar 30, 2024
6650e3e
feat: exp-1 initial sizes
jnoortheen Mar 30, 2024
03d6cbc
refactor: merge overridden actions to base class
jnoortheen Mar 30, 2024
2e6c7e8
feat: add benchmark for different type of tables
jnoortheen Mar 30, 2024
fb9b34f
feat: able to load multiple format lr-tables
jnoortheen Mar 30, 2024
d004ba3
feat: add mypyc compiled data format
jnoortheen Mar 30, 2024
dbc6d3c
feat: add mypyc pickle
jnoortheen Mar 30, 2024
d115a2a
refactor: accept str path to load parser
jnoortheen Mar 30, 2024
55b6959
test: add pre-processor based tests
jnoortheen Mar 30, 2024
38782dc
chore: upgrade pre-commit plugins
jnoortheen Apr 24, 2024
aefb4a3
feat: update tokenizer changes from xonsh v0.16.0
jnoortheen Apr 24, 2024
7130d85
feat: add peg_parser from parser
jnoortheen Apr 24, 2024
cd8ac15
chore: update ruff settings
jnoortheen Apr 24, 2024
41c2e74
feat: add pegen project files
jnoortheen Apr 24, 2024
91ed338
chore: remove taskfiles
jnoortheen Apr 24, 2024
5cb8e5f
refactor: overwrite header
jnoortheen Apr 24, 2024
86a5611
feat: add tests from pegen site
jnoortheen Apr 24, 2024
8bbb01f
fix: update tests of older versions than py39
jnoortheen Apr 24, 2024
bab0990
feat: move towards custom tokenizer
jnoortheen Apr 24, 2024
e490856
refactor: ruff style
jnoortheen Apr 24, 2024
fae6185
feat: add xonsh tokenize
jnoortheen Apr 24, 2024
d0aa01b
refactor: move tokens
jnoortheen Apr 24, 2024
c3bd695
refactor: copy tokenize from python stdlib v310
jnoortheen Apr 24, 2024
e5c580b
feat: use tokenizer from package
jnoortheen Apr 24, 2024
95a706a
feat: include tests from xonsh
jnoortheen Apr 25, 2024
1172502
test: update tests to use own tokenizer
jnoortheen Apr 25, 2024
ca0a2ef
refactor: update tokenizer code
jnoortheen Apr 25, 2024
d9b3529
refactor: move ply tests
jnoortheen Apr 25, 2024
aa8a379
feat: simplify tokenize.py
jnoortheen Apr 27, 2024
22f6ecc
chore: update mypy config
jnoortheen Apr 27, 2024
20f23ea
feat: implement path literals
jnoortheen Apr 27, 2024
b098721
feat: handle env names in tokens
jnoortheen Apr 28, 2024
2dff67e
feat: add pegen from CPython/Tools
jnoortheen Apr 28, 2024
601f642
feat: pass custom token set to PythonParserGenerator
jnoortheen Apr 28, 2024
1b61f10
test: rerun parse if previous failed
jnoortheen Apr 29, 2024
058ff64
refactor: move parse methods to class
jnoortheen Apr 29, 2024
f591f0c
chore: add flask deps for pegen-web module
jnoortheen Apr 29, 2024
5408af3
refactor: adding from we-like-parsers/pegen
jnoortheen Apr 29, 2024
1b03f61
refactor: adding from we-like-parsers/pegen
jnoortheen Apr 29, 2024
020c163
chore: update tasks
jnoortheen Apr 29, 2024
75f3f59
feat: use taskfile.yml with source watch
jnoortheen Apr 29, 2024
131a6b8
feat: handle loading ply parser
jnoortheen Apr 29, 2024
703edf1
refactor: make parser py39+ and optimize imports
jnoortheen Apr 29, 2024
8139f2f
chore: add ipython
jnoortheen Apr 30, 2024
251db14
feat: handle tokens separately in generator
jnoortheen Apr 30, 2024
b7525ba
fix: deprecation warning ast.Str
jnoortheen Apr 30, 2024
fc67b5a
fix: deprecation warning ast.Str
jnoortheen Apr 30, 2024
1e28cfb
feat: make parser py39+
jnoortheen Apr 30, 2024
686e0c5
feat: implement parsing $env vars
jnoortheen Apr 30, 2024
1ab7f76
chore: use single xonsh.gram
jnoortheen Apr 30, 2024
f4cdc72
fix: deprecation warning
jnoortheen Apr 30, 2024
c1a4df6
test: update tests and fix mypy errors
jnoortheen Apr 30, 2024
06ed906
test: fix test errors/fails of missing fixtures
jnoortheen Apr 30, 2024
e49f229
feat: add fstring tokens from py3.12
jnoortheen Apr 30, 2024
1953343
feat: add tokenize code for untokenizer from py312 stdlib
jnoortheen Apr 30, 2024
71b55bd
feat: support py311 & py312
jnoortheen May 1, 2024
f11c8d6
chore: update tests
jnoortheen May 1, 2024
b77e5c7
refactor: remove ply based parser dir
jnoortheen May 1, 2024
8254611
refactor: move untokenize to its own module
jnoortheen May 1, 2024
d29841d
refactor: simplify tokenize.py with states
jnoortheen May 1, 2024
091c0e0
refactor: simplify tokenize.py further
jnoortheen May 1, 2024
2918296
test: update tests to mark xfail xonsh tokens
jnoortheen May 1, 2024
4c48178
fix: implement parenthesis level for xonsh tokens
jnoortheen May 2, 2024
12c4cdb
refactor: update xonsh token names
jnoortheen May 2, 2024
3f84a11
feat: tokenize xonsh operators separately
jnoortheen May 2, 2024
cb4a1aa
feat: implement env names and env expressions
jnoortheen May 2, 2024
1b8e7ed
test: update tests for the tokenizer
jnoortheen May 2, 2024
ea1b091
refactor: move tests out of package
jnoortheen May 2, 2024
a36b2b4
fix: store env variable case
jnoortheen May 2, 2024
b03ea15
chore: add task to test
jnoortheen May 2, 2024
0acc295
feat: add $() handling simple cases
jnoortheen May 2, 2024
6a08262
feat: tokenize search-path
jnoortheen May 2, 2024
3585a7e
refactor: update ${..} handling
jnoortheen May 3, 2024
ac7ff32
refactor: enable more ruff plugins
jnoortheen May 4, 2024
1066b7c
test: remove pure python tests
jnoortheen May 4, 2024
d310061
chore: add pytest-testmondata
jnoortheen May 4, 2024
72b959c
test: move test cases to files
jnoortheen May 4, 2024
ac8d8f1
test: update parser tests
jnoortheen May 4, 2024
804b744
chore: update tasks
jnoortheen May 4, 2024
481f1a1
feat: implement splitting by WS/NL
jnoortheen May 4, 2024
7dd5ebf
feat: implement !(), ![], $[] operators
jnoortheen May 4, 2024
2a798ec
feat: implement @$() - subproc_injection
jnoortheen May 4, 2024
59ef258
test: fix test data
jnoortheen May 4, 2024
0560593
feat: implement @() - python-expr operator
jnoortheen May 4, 2024
d6f606a
feat: handle adjacent replacement and pass as *cmds
jnoortheen May 4, 2024
5540f1d
docs: add todo items
jnoortheen May 5, 2024
4032d03
feat: implement help? syntax
jnoortheen May 5, 2024
466534e
test: tidy test cases
jnoortheen May 5, 2024
7792937
test: organize tests
jnoortheen May 5, 2024
b88f11e
refactor: update tokenizer
jnoortheen May 5, 2024
0fd8d38
feat: implement path-search regexes
jnoortheen May 5, 2024
e779dd4
test: organize tests data
jnoortheen May 5, 2024
80bfca6
docs: mark pegen checking done
jnoortheen May 5, 2024
9d25f1c
feat: implement `&&`, `||` combinators
jnoortheen Apr 14, 2023
df63dea
feat: make whole test suit pass or xfail
jnoortheen May 7, 2024
2985625
feat: implement macros basic level
jnoortheen May 7, 2024
608661c
refactor: cleanup tokenizer
jnoortheen May 11, 2024
ba8d009
test: post verbose parser output upon first 3 fails
jnoortheen May 11, 2024
f13de65
feat: ability to accept hard keywords in macros
jnoortheen May 11, 2024
a2b6632
feat: tokenize whitespaces/Operators as their own Tokens instead of OP
jnoortheen May 13, 2024
a10bef9
feat: handle macro parameters with whitespace
jnoortheen May 13, 2024
edf37bc
feat: handle parenthesis inside macros
jnoortheen May 13, 2024
7281482
refactor: code cleanup
jnoortheen May 13, 2024
99c22fc
test: now test_invalid works
jnoortheen May 15, 2024
da1b2e2
test: changes in lexer tests operator token handling change
jnoortheen May 15, 2024
a321ff4
refactor: handle OP tokens separately
jnoortheen May 15, 2024
c306501
docs: update todos
jnoortheen May 15, 2024
b0620c1
fix: import Target for del tests
jnoortheen May 15, 2024
322fe72
fix: handle sub procs regression fails
jnoortheen May 17, 2024
73c898d
test: more passing tests
jnoortheen May 17, 2024
07c5c9f
test: update tests
jnoortheen May 18, 2024
ef5a2a8
feat: enable handling subproc macros
jnoortheen May 18, 2024
fe1ed90
refactor: cleanup functions
jnoortheen May 19, 2024
2165c3b
feat: implement with-macros single indent
jnoortheen May 19, 2024
c0fc11b
feat: implement with macro multi indents
jnoortheen May 19, 2024
70d2ffd
fix: clash between proc and with macros
jnoortheen May 19, 2024
1945b05
chore: setup github actions
jnoortheen May 20, 2024
e9576d7
chore: use latest pip in CI
jnoortheen May 20, 2024
41 changes: 27 additions & 14 deletions .asv/results/benchmarks.json
@@ -4,20 +4,26 @@
"name": "benchmarks.MemSuite.mem_parser_init",
"param_names": [],
"params": [],
"timeout": 60.0,
"type": "memory",
"unit": "bytes",
"version": "fef8294dd8f9528bab51728ace954f992f353699e8c20ddba9cb190744896302"
},
"benchmarks.PeakMemSuite.peakmem_parser_init": {
"code": "class PeakMemSuite:\n def peakmem_parser_init(self):\n from xonsh_parser.parser import get_parser_cls\n \n parser = get_parser_cls()()\n parser.parse(\"ls -alh\")\n\n def setup(self):\n from xonsh_parser.parser import write_parser_table\n write_parser_table()",
"name": "benchmarks.PeakMemSuite.peakmem_parser_init",
"param_names": [],
"params": [],
"timeout": 60.0,
"benchmarks.PeakMemSuite.peakmem_parser_init_": {
"code": "class PeakMemSuite:\n def peakmem_parser_init_(self, f):\n \n from xonsh_parser.parser import get_parser_cls\n \n parser = get_parser_cls()(parser_table=Path(f))\n parser.parse(\"ls -alh\")\n\n def setup(self, f):\n from xonsh_parser.parser import write_parser_table\n write_parser_table(output_path=f)",
"name": "benchmarks.PeakMemSuite.peakmem_parser_init_",
"param_names": [
"param1"
],
"params": [
[
"'/tmp/xonsh-lr-table.pickle'",
"'/tmp/xonsh-lr-table.py'",
"'/tmp/xonsh-lr-table.jsonl'"
]
],
"type": "peakmemory",
"unit": "bytes",
"version": "5395f5577f2ddc095088b4765b106fbc215e28fda9ae9433dcb13f718c8e3272"
"version": "d76e635d01b229185dffca78bc73e779e1a383322bf2cb3d30fbe427bc289db4"
},
"benchmarks.TimeSuite.time_parser_init": {
"code": "class TimeSuite:\n def time_parser_init(self):\n from xonsh_parser.parser import get_parser_cls\n \n parser = get_parser_cls()()\n parser.parse(\"ls -alh\")\n\n def setup(self):\n from xonsh_parser.parser import write_parser_table\n write_parser_table()",
@@ -29,21 +35,28 @@
"repeat": 0,
"rounds": 2,
"sample_time": 0.01,
"timeout": 60.0,
"type": "time",
"unit": "seconds",
"version": "e35e8735a755146c49df0237cdf07c88a943702d6e364e50d29f2aa842bb1990",
"warmup_time": -1
},
"benchmarks.TrackLrParserSize.track_lr_parser_size": {
"code": "class TrackLrParserSize:\n def track_lr_parser_size(self):\n from pympler import asizeof\n \n from xonsh_parser.parser import get_parser_cls\n \n parser = get_parser_cls()()\n return asizeof.asizeof(parser.parser)\n\n def setup(self):\n from xonsh_parser.parser import write_parser_table\n write_parser_table()",
"code": "class TrackLrParserSize:\n def track_lr_parser_size(self, f):\n from pympler import asizeof\n \n from xonsh_parser.parser import get_parser_cls\n \n parser = get_parser_cls()(parser_table=Path(f))\n return asizeof.asizeof(parser.parser)\n\n def setup(self, f):\n from xonsh_parser.parser import write_parser_table\n write_parser_table(output_path=f)",
"name": "benchmarks.TrackLrParserSize.track_lr_parser_size",
"param_names": [],
"params": [],
"timeout": 60.0,
"param_names": [
"param1"
],
"params": [
[
"'/tmp/xonsh-lr-table.pickle'",
"'/tmp/xonsh-lr-table.py'",
"'/tmp/xonsh-lr-table.jsonl'",
"'/tmp/xonsh-lr-table.cpickle'"
]
],
"type": "track",
"unit": "bytes",
"version": "5c2ee1d6c02c954f783f168723747a697bd1226feb95ad3c24dfa9eed58372e7"
"version": "dcdc63d51d9781e39b68aaff072ff921d29bcf2b78281dccc3826b520ca78a2d"
},
"version": 2
}
43 changes: 28 additions & 15 deletions .github/workflows/build.yml
@@ -1,33 +1,46 @@
name: Build

on: [push, pull_request]
on:
push:
branches:
- main
pull_request:
branches:
- main

permissions:
contents: read

jobs:
test:

runs-on: ubuntu-latest
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
python_version: ['3.9']

os:
- ubuntu-latest
- macOS-latest
- windows-latest
python-version:
- "3.10"
- "3.11"
- "3.12"
- "3.13-dev"
name: Test Python ${{ matrix.python-version }} ${{ matrix.os }}
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python_version }}
cache: 'pip'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install hatch
hatch env create
- name: Lint and typecheck
run: |
hatch run lint-check
pip install -U pip
pip install -e ".[test]"
- name: Test
run: |
hatch run test-cov-xml
- uses: codecov/codecov-action@v3
run: pytest --cov=peg_parser/parser tests/ --cov-report=term-missing --cov-report=xml
- uses: codecov/codecov-action@v4.0.1
with:
token: ${{ secrets.CODECOV_TOKEN }}
fail_ci_if_error: true
3 changes: 2 additions & 1 deletion .gitignore
@@ -125,5 +125,6 @@ pdm.lock
# ply parser debug logs
*.out
.pdm-python
.task/

.history/
.testmondata*
24 changes: 9 additions & 15 deletions .pre-commit-config.yaml
@@ -1,29 +1,23 @@
repos:
- repo: https://github.com/charliermarsh/ruff-pre-commit
rev: "v0.0.263"
rev: "v0.4.2"
hooks:
- id: ruff
args: ["--fix"]

- repo: https://github.com/ambv/black
rev: "23.3.0"
hooks:
- id: black
args: [ ., --fix, --exit-non-zero-on-fix ]
pass_filenames: false
- id: ruff-format
args: [.]
pass_filenames: false
args:
- xonsh_parser
- tests
language_version: python3.8


- repo: https://github.com/pre-commit/mirrors-mypy
rev: "v1.2.0" # Use the sha / tag you want to point at
rev: "v1.10.0" # Use the sha / tag you want to point at
hooks:
- id: mypy
pass_filenames: false
args: [ "--config-file=pyproject.toml"]

- repo: https://github.com/pre-commit/pre-commit-hooks
rev: "v4.4.0"
rev: "v4.6.0"
hooks:
- id: trailing-whitespace
- id: check-yaml
@@ -33,7 +27,7 @@ repos:
- id: check-added-large-files

- repo: https://github.com/compilerla/conventional-pre-commit
rev: "v2.2.0"
rev: "v3.2.0"
hooks:
- id: conventional-pre-commit
stages: [commit-msg]
5 changes: 4 additions & 1 deletion README.md
@@ -27,7 +27,10 @@ We use [Hatch](https://hatch.pypa.io/latest/install/) to manage the development
You can run all the tests with:

```bash
hatch run test
task test

# to watch for changes and run tests
task test --watch -- -x --ff
```

### Format the code
41 changes: 30 additions & 11 deletions Taskfile.yml
@@ -1,18 +1,37 @@
# https://taskfile.dev

version: '3'
tasks:
generate:
cmds:
- python3 peg_parser/tasks/generate_parser.py
generates:
- peg_parser/parser/parser.py
sources:
- peg_parser/parser/xonsh.gram
- peg_parser/parser/*.py
- pegen/*.py

profile:
cmds:
- python peg_parser/tasks/profile_mem.py | tee "logs/xonsh-parser-$(date "+%Y%m%d-%H%M%S").log"

vars:
GREETING: Hello, World!
ply-add:
cmds:
- git subtree add --prefix=ply --squash

tasks:
add-ply-subtree:
pegen-add:
cmds:
- git subtree add --prefix=ply --squash https://github.com/dabeaz/ply.git master
add-pegen-subtree:
- git fetch https://github.com/we-like-parsers/pegen.git main:tmp-pegen-main --no-tags --depth 1
- git show tmp-pegen-main:data/python.gram > .local/python.gram
- git read-tree --prefix=pegen -u tmp-pegen-main:src/pegen

test:
deps:
- generate
cmds:
- git subtree add --prefix=pegen --squash
- python -m pytest {{.CLI_ARGS}}
sources:
- '**/*.py'

asv:
wtest:
cmds:
- asv run
- watchexec -e py,gram --clear -- task test -- --ff -x -vv --testmon
21 changes: 13 additions & 8 deletions benchmarks/benchmarks.py
@@ -1,5 +1,6 @@
 # Write the benchmarking functions here.
 # See "Writing benchmarks" in the asv docs for more information.
+from pathlib import Path


 class TimeSuite:
@@ -28,28 +29,32 @@ def mem_parser_init(self):


 class PeakMemSuite:
-    def setup(self):
+    params = ["/tmp/xonsh-lr-table.pickle", "/tmp/xonsh-lr-table.py", "/tmp/xonsh-lr-table.jsonl"]
+    def setup(self, f):
         from xonsh_parser.parser import write_parser_table
-        write_parser_table()
+        write_parser_table(output_path=f)

+    def peakmem_parser_init_(self, f):
+
-    def peakmem_parser_init(self):
         from xonsh_parser.parser import get_parser_cls

-        parser = get_parser_cls()()
+        parser = get_parser_cls()(parser_table=Path(f))
         parser.parse("ls -alh")


 class TrackLrParserSize:
     unit = "bytes"
+    params = ["/tmp/xonsh-lr-table.pickle", "/tmp/xonsh-lr-table.py", "/tmp/xonsh-lr-table.jsonl",
+              "/tmp/xonsh-lr-table.cpickle"]

-    def setup(self):
+    def setup(self, f):
         from xonsh_parser.parser import write_parser_table
-        write_parser_table()
+        write_parser_table(output_path=f)

-    def track_lr_parser_size(self):
+    def track_lr_parser_size(self, f):
         from pympler import asizeof

         from xonsh_parser.parser import get_parser_cls

-        parser = get_parser_cls()()
+        parser = get_parser_cls()(parser_table=Path(f))
         return asizeof.asizeof(parser.parser)
43 changes: 43 additions & 0 deletions experiments.md
@@ -53,3 +53,46 @@ PosixPath('/tmp/v1-bytes.pickle')
# tried using marshal

- there was not much difference with pickle
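
One way to reproduce this kind of comparison is sketched below. It is only an illustration: `toy_table` is a hypothetical stand-in for an LR action table (state number mapped to a token-to-action dict), not the real xonsh tables, so the byte counts it prints say nothing about the actual parser.

```python
import marshal
import pickle

# Hypothetical stand-in for an LR action table: state -> {token name: action id}.
toy_table = {
    state: {f"TOK{tok}": state * 10 + tok for tok in range(20)}
    for state in range(500)
}

# Serialize the same object with both stdlib formats.
pickle_bytes = pickle.dumps(toy_table, protocol=pickle.HIGHEST_PROTOCOL)
marshal_bytes = marshal.dumps(toy_table)

# Both formats must round-trip to an equal object before sizes are compared.
assert pickle.loads(pickle_bytes) == toy_table
assert marshal.loads(marshal_bytes) == toy_table

print(f"pickle:  {len(pickle_bytes)} bytes")
print(f"marshal: {len(marshal_bytes)} bytes")
```

Note that `marshal` only handles core built-in types and its format is tied to the Python version, which may matter more than the size difference when choosing a table format.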

# fixing reduce/reduce conflicts

1. initial sizes

```
asizeof.asizeof(productions)=260KiB
asizeof.asizeof(actions)=6.18MiB
asizeof.asizeof(gotos)=560KiB
```

| type | file-size |
|--------|-----------|
| pickle | 975.55KiB |
| py | 1.61 MiB |
| jsonl | 1.61 MiB |

2. merge overridden actions to base class - no change in sizes

```
_object_size(productions)='260.96 KiB'
_object_size(actions)='6.18 MiB'
_object_size(gotos)='560.97 KiB'
```

3. peak memory usage using lr-table.py type

benchmarks.PeakMemSuite.peakmem_parser_init_                                 ok
[75.00%] ··· ============================ =======
                        param1
             ---------------------------- -------
              /tmp/xonsh-lr-table.pickle   31.5M
              /tmp/xonsh-lr-table.py        215M
              /tmp/xonsh-lr-table.jsonl    33.5M
             ============================ =======
benchmarks.TrackLrParserSize.track_lr_parser_size                    1/4 failed
[100.00%] ··· ============================= ========
                         param1
              ----------------------------- --------
               /tmp/xonsh-lr-table.pickle     7.54M
               /tmp/xonsh-lr-table.py          4.8M
               /tmp/xonsh-lr-table.jsonl      7.53M
               /tmp/xonsh-lr-table.cpickle    7.54M
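
The gap between loading a pickled table and loading the same table as Python source can be explored in isolation with a small sketch like the one below. It rests on several assumptions: `toy_table`, `peak_bytes`, `load_pickle` and `load_source` are hypothetical names introduced here, the table is a toy one, and `tracemalloc` only counts allocations made through Python's allocators, whereas asv's `peakmem_` benchmarks report the peak resident memory of the whole process. The numbers are therefore not comparable to the table above; the sketch only shows how the two loading paths can be measured side by side.

```python
import pickle
import tracemalloc


def peak_bytes(loader) -> int:
    """Return the peak number of bytes tracemalloc saw while running `loader`."""
    tracemalloc.start()
    try:
        loader()
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


# Hypothetical stand-in for an LR table, not the real xonsh tables.
toy_table = {
    f"s{state}": {f"TOK{tok}": state + tok for tok in range(20)}
    for state in range(500)
}
pickled = pickle.dumps(toy_table, protocol=pickle.HIGHEST_PROTOCOL)
source = f"TABLE = {toy_table!r}\n"


def load_pickle():
    # Deserialize the table straight from bytes.
    return pickle.loads(pickled)


def load_source():
    # Compile and execute the table as a module-level literal,
    # which is roughly what importing a generated .py table does.
    namespace = {}
    exec(compile(source, "<lr-table>", "exec"), namespace)
    return namespace["TABLE"]


print(f"pickle peak: {peak_bytes(load_pickle)} bytes")
print(f"source peak: {peak_bytes(load_source)} bytes")
```

Compiling a large literal has to go through parsing and bytecode generation before the final dict exists, which is one plausible reason a `.py` table could show a much higher peak than its final in-memory size.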
File renamed without changes.
2 changes: 2 additions & 0 deletions peg_parser/parser/.gitignore
@@ -0,0 +1,2 @@
# generated files
parser.py
Empty file added peg_parser/parser/__init__.py
Empty file.