jqjq

jq implementation of jq

Why? It started while researching how to write decoders in jq for fq, which ended up involving some AST rewriting and walking, and then it escalated a bit.

It's also a great way to show that jq is a very expressive, capable and neat language!

You can play around with it using this jqplay snippet.

Use via `jqjq` wrapper

$ ./jqjq -n 'def f: 1,8; [f,f] | map(.+105) | implode'
"jqjq"

$ ./jqjq '.+. | map(.+105) | implode' <<< '[1,8]'
"jqjq"

# eval example above using jqjq in jqjq. will take some time.
# eval the concatenation of jqjq.jq as a string and the example
$ ./jqjq "eval($(jq -Rs . jqjq.jq)+.)" <<< '"eval(\"def f: 1,8; [f,f] | map(.+105) | implode\")"'
"jqjq"

# jqjq has a REPL
$ ./jqjq --repl
> 1,2,3 | .*2
2
4
6
> "jqjq" | explode | map(.-32) | implode
"JQJQ"
> "jqjq" | [eval("explode[] | .-32")] | implode
"JQJQ"
> ^D

# run 01mf02's adaptation of itchyny's bf.jq running fib.bf
$ ./jqjq -n "\"$(cat fib.bf)\" | $(cat bf.jq)"
"1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233"

$ ./jqjq -h
jqjq - jq implementation of jq
Usage: jqjq [OPTIONS] [--] [EXPR]
  --jq PATH        jq implementation to run with
  --lex            Lex EXPR
  --no-builtins    No builtins
  --null-input,-n  Null input
  --parse          Lex and parse EXPR
  --repl           REPL
  --run-tests      Run jq tests from stdin
  --slurp,-s       Slurp inputs into an array

Use with `jq`

$ jq -n -L . 'include "jqjq"; eval("def f: 1,8; [f,f] | map(.+105) | implode")'
"jqjq"

$ jq -L . 'include "jqjq"; eval("(.+.) | map(.+105) | implode")' <<< '[1,8]'
"jqjq"

Run tests

make test

Note that the tests are meant to be used with jq 1.7.1.

Progress

jq's test suite

$ ./jqjq --run-tests < ../jq/tests/jq.test | grep passed
310 of 449 tests passed

Note that expected test values are based on stedolan's jq. If you run with a different jq implementation like gojq, some tests might fail because of different error messages, support for arbitrary precision integers, etc.

Design overview

jqjq has the common lex, parse, eval design.

Lex

Lexer gets a string and chews off parts from left to right, producing an array of tokens [{<name>: ...}, ...]. Each chew is done by testing regexes in priority order so that longer prefixes are matched first, e.g. += is matched before +. On a match a lambda is evaluated, usually {<token-name>: .}, but in some cases, such as quoted strings, it is a bit more complicated.
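The chewing loop can be sketched in a few lines. This is a minimal Python illustration of the idea, not jqjq's actual lexer (which is written in jq); the token names and patterns in `RULES` are assumptions made up for the example.

```python
import re

# Hypothetical token names and patterns, tried top to bottom. Longer operators
# are listed before their prefixes so that "+=" wins over "+".
RULES = [
    ("whitespace", re.compile(r"\s+")),
    ("number", re.compile(r"[0-9]+(?:\.[0-9]+)?")),
    ("ident", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("plus_equal", re.compile(r"\+=")),
    ("plus", re.compile(r"\+")),
    ("dot", re.compile(r"\.")),
]


def lex(source):
    """Chew tokens off the front of source until nothing is left."""
    tokens = []
    pos = 0
    while pos < len(source):
        for name, pattern in RULES:
            match = pattern.match(source, pos)
            if match:
                if name != "whitespace":  # whitespace is matched but dropped
                    tokens.append({name: match.group()})
                pos = match.end()
                break
        else:
            # No rule matched at this position.
            raise ValueError(f"unexpected input at offset {pos}: {source[pos:]!r}")
    return tokens


print(lex(".a += 1"))
```

Swapping the order of the `plus_equal` and `plus` rules would lex `+=` as a `plus` token followed by an error on `=`, which is why priority order matters.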

The lexer also keeps a stack to track the balance of seen (, ) and \( so that it knows how to chop a string with interpolation into tokens. E.g. is ) a right parenthesis or the continuation of a string, as in "abc \(123) def"?
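To make the role of the stack concrete, here is a small Python sketch that only answers the question above for each ) in an expression. It is an illustration under simplifying assumptions (no comments, no token output), and `classify_right_parens` is a hypothetical helper, not something jqjq provides.

```python
def classify_right_parens(source):
    """Say, for each ")" in source, whether it closes a "(" or a "\\(" interpolation."""
    stack = []  # openers seen so far: "(" or "\\("
    in_string = False
    kinds = []
    i = 0
    while i < len(source):
        if in_string:
            if source.startswith("\\(", i):
                stack.append("\\(")  # interpolation starts, back to expression mode
                in_string = False
                i += 2
            elif source[i] == "\\":
                i += 2  # skip any other escape sequence, e.g. \"
            else:
                if source[i] == '"':
                    in_string = False  # string literal ends
                i += 1
        else:
            char = source[i]
            if char == '"':
                in_string = True
            elif char == "(":
                stack.append("(")
            elif char == ")":
                if not stack:
                    raise ValueError(f"unbalanced ) at offset {i}")
                if stack.pop() == "\\(":
                    kinds.append("string continues")
                    in_string = True  # resume the interrupted string
                else:
                    kinds.append("right parenthesis")
            i += 1
    return kinds


print(classify_right_parens('"abc \\(123) def"'))
print(classify_right_parens('"a \\((1)) b"'))
```

The first call reports one ) that continues the string; the second reports a right parenthesis followed by a string continuation.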

You can use ./jqjq --lex '...' to lex and see the tokens.

Parse

Parser takes an array of tokens and parses it left to right with backtracking, in combination with precedence climbing for infix operators so that a left-recursive rule (e.g. E -> E + E) does not end up in an infinite loop. Backtracking is done by outputting empty for a non-match and using // to try the next rule, e.g. a // b // ... // error, where a and b are functions that try to match a rule. When a rule matches it returns an array with the pair [<tokens left>, <ast>]. <ast> uses the same AST design as gojq.
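As a rough illustration of how rule functions, "try the next rule" and precedence climbing fit together, here is a Python sketch over a toy arithmetic grammar. It is not jqjq's parser: tokens are plain strings, the `PRECEDENCE` table is an assumption, and the AST shape is invented for the example rather than being gojq's.

```python
# Hypothetical operator table: higher number binds tighter, all left associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def parse_number(tokens):
    """Rule: a number token. Returns (tokens_left, ast), or None for no match."""
    if tokens and tokens[0].isdigit():
        return tokens[1:], {"number": int(tokens[0])}
    return None


def parse_group(tokens):
    """Rule: "(" expression ")". Returns None if the rule does not match."""
    if not tokens or tokens[0] != "(":
        return None
    rest, ast = parse_expr(tokens[1:])
    if not rest or rest[0] != ")":
        return None
    return rest[1:], ast


def parse_term(tokens):
    """Try each rule in turn, in the spirit of a // b // error."""
    result = parse_number(tokens) or parse_group(tokens)
    if result is None:
        raise SyntaxError(f"unexpected tokens: {tokens!r}")
    return result


def parse_expr(tokens, min_precedence=0):
    """Precedence climbing: a loop over operators instead of a left-recursive rule."""
    rest, left = parse_term(tokens)
    while rest and rest[0] in PRECEDENCE and PRECEDENCE[rest[0]] >= min_precedence:
        operator = rest[0]
        # The right side may only contain operators that bind tighter,
        # which makes equal-precedence operators group to the left.
        rest, right = parse_expr(rest[1:], PRECEDENCE[operator] + 1)
        left = {"op": operator, "left": left, "right": right}
    return rest, left


print(parse_expr(["1", "+", "2", "*", "3"]))
```

Each rule returns the pair of remaining tokens and AST on a match, mirroring the [<tokens left>, <ast>] convention, and `None` plays the role that empty plays in jq.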

You can use ./jqjq --parse '...' to lex and parse and see the AST.

Eval

Eval traverses the AST, evaluating each AST node while keeping track of the current path and environment.

Path is used in jq to keep track of where you are in the input. The tracking can be done as long as you only operate on the input and don't create and output new values. E.g. path(.a[1], .b) outputs ["a",1] and ["b"]. This is also used to implement assignment and some other operators.

Environment is an object with the current functions and bindings. Functions have the key name <name>/<arity> and the value is a function AST. Bindings use the key name $<name>/0 and the value is {value: <value>}, where <value> is a normal jq value.
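The key scheme can be pictured with a plain dictionary. The following Python sketch only mirrors the naming convention described above; the helper names (`function_key`, `binding_key`, `define`, `bind`) and the placeholder function AST are assumptions for illustration, not jqjq internals.

```python
def function_key(name, arity):
    """Functions are stored under <name>/<arity>."""
    return f"{name}/{arity}"


def binding_key(name):
    """Bindings are stored under $<name>/0."""
    return f"${name}/0"


def define(env, name, arity, function_ast):
    """Return a new environment with a function added."""
    return {**env, function_key(name, arity): function_ast}


def bind(env, name, value):
    """Return a new environment with $name bound, leaving the old one untouched."""
    return {**env, binding_key(name): {"value": value}}


def lookup_binding(env, name):
    return env[binding_key(name)]["value"]


# Placeholder standing in for the parsed AST of `def f: 1,8;`.
env = define({}, "f", 0, {"placeholder": "function AST"})
env = bind(env, "x", 123)
inner = bind(env, "x", "shadowed")

print(sorted(env))
print(lookup_binding(env, "x"), lookup_binding(inner, "x"))
```

Copying the dictionary on every change mimics jq's immutable values: an inner scope can shadow `$x` without affecting the environment of the caller.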

When evaluating the AST, the eval function gets the current AST node, path and environment and outputs zero, one or more arrays with the pair [<path>, <value>]. Path can be [null] if the evaluation produced a "new" value, etc., so that path tracking is not possible.
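A generator is a natural way to model "zero, one or more" outputs. The Python sketch below evaluates a tiny hypothetical AST and yields (path, value) pairs; the node types, the `get_child` helper, and the rule that a [null] path stays [null] after indexing are assumptions for illustration and may differ from what jqjq does.

```python
def get_child(container, key):
    """Look up an object key or array index, giving None (jq null) if absent."""
    if isinstance(container, dict) and isinstance(key, str):
        return container.get(key)
    if isinstance(container, list) and isinstance(key, int):
        return container[key] if 0 <= key < len(container) else None
    return None


def eval_ast(ast, path, value, env):
    """Yield zero or more (path, value) pairs for one AST node."""
    kind = ast["type"]
    if kind == "identity":  # .
        yield path, value
    elif kind == "index":  # <target>.key or <target>[n]
        for target_path, target in eval_ast(ast["target"], path, value, env):
            # Assumption: once a path is untrackable it stays untrackable.
            if target_path == [None]:
                new_path = [None]
            else:
                new_path = target_path + [ast["key"]]
            yield new_path, get_child(target, ast["key"])
    elif kind == "comma":  # left, right
        yield from eval_ast(ast["left"], path, value, env)
        yield from eval_ast(ast["right"], path, value, env)
    elif kind == "literal":  # a "new" value has no path into the input
        yield [None], ast["value"]
    elif kind == "binding":  # $name, looked up under the $<name>/0 key
        yield [None], env[f"${ast['name']}/0"]["value"]
    else:
        raise ValueError(f"unknown node type: {kind}")


dot = {"type": "identity"}
# Hypothetical AST for: .a[1], .b
query = {
    "type": "comma",
    "left": {"type": "index", "target": {"type": "index", "target": dot, "key": "a"}, "key": 1},
    "right": {"type": "index", "target": dot, "key": "b"},
}
for result_path, result in eval_ast(query, [], {"a": [10, 20], "b": 30}, {}):
    print(result_path, result)
```

Collecting only the paths from this run gives ["a", 1] and ["b"], which is the information a path(...) call needs; a literal such as 1 would instead come back with the [None] sentinel path.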

Problems, issues and unknowns

Better error messages.
The "environment" pass-around is not very efficient, and it also makes supporting recursion a bit awkward (the called function is injected in the env at call time).
The "," operator in jq (and gojq) is left associative, but the way jqjq parses, it creates the correct parse tree when it's right associative. Don't know why.
Suffix with multiple [] outputs values in wrong order.
String literal using interpolation that has more than one generator outputs in wrong order. Ex: "\(1,2) \(3,4)".
Non-associative operators like == should fail, ex: 1 == 2 == 3.
Objects are parsed differently compared to gojq. gojq has a list of pipe queries, jqjq will only have one that might be pipe op.
Less "passthrough" piggyback on jq features:
- reduce/foreach via recursive function? similar to if or {}-literal?
- try/catch via some backtrack return value? change [path, value] to include an error somehow?
How to support label/break?
How to support delpaths (used by del etc)? Have to keep paths the same while deleting a group of paths? Use a sentinel value? Work with paths instead?
Rewrite AST before eval, currently if and some others do rewrite (optional parts etc) while evaluating.
Rethink invalid path handling; currently [null] is used as a sentinel value.
{a:123} | .a |= empty should remove the key.

Useful references

Tools and tricks

`jq -n --debug-dump-disasm '...'` show jq bytecode
`jq -n --debug-trace=all '...'` show jq bytecode run trace
`jq -n '{a: "hello"} | debug' 2> >(jq -R 'gsub("\u001b\\[.*?m";"") | fromjson' >&2)` pretty print debug messages
`GOJQ_DEBUG=1 go run -tags gojq_debug cmd/gojq/main.go -n '...'` run gojq in debug mode
`fq -n '".a.b" | _query_fromstring'` gojq parse tree for string
`fq -n '{...} | _query_tostring'` jq expression string for gojq parse tree
For a convenient jq development experience:

Thanks to

stedolan for jq and for getting me interested in generator/backtracking based languages.
pkoppstein for writing about jq and PEG parsing.
itchyny for jqjq fixes and for gojq, from which I learned a lot and from which most of jqjq's AST design comes. Sharing the AST design made it easier to compare parser output (e.g. via fq's _query_fromstring). gojq also fixes some confusing jq bugs and has better error messages, which saves a lot of time.
Michael Färber @01mf02 for jaq, which is also where I learned about precedence climbing.
Thalia Archibald @thaliaarchi for correctness fixes, builtins, more CLI arguments, fancy and colorful output and more.

License

jqjq is distributed under the terms of the MIT License.

See the LICENSE file for license details.
