-
Notifications
You must be signed in to change notification settings - Fork 4
filter: support parsing numeric index #24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
f67338d
to
20e50c6
Compare
src/filter/parser.rs
Outdated
use super::*; | ||
|
||
#[test] | ||
fn name_filter() -> Result<(), Box<dyn Error>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the return type as the tests were either panic-ing due to assert_eq!()
or returning Ok(())
, so this should be equivalent with a bit less boilerplate for test cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
totally reasonable
src/filter/grammar.pest
Outdated
quoted_name = { quoted_char+ } | ||
quoted_char = _{ !(quote) ~ ANY } | ||
prop = _{ "." ~ name } | ||
name = { !ASCII_DIGIT+ ~ id_char+ } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
!ASCII_DIGIT+
is a bit too permissive, this means you can start names with anything except digits.
I think what you meant to do here is ASCII_ALPHA ~ id_char+
, though I wonder if starting with _
is legal? (edit: the answer is that it's much more complicated) in which case it'd be a little more complicated:
name = { start_char ~ id_char+ }
start_char = _{ ASCII_ALPHA | "_" }
id_char = _{ ASCII_ALPHANUMERIC | "_" }
note that in both cases there is no +
because we are only enforcing what the name starts with
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just noticed your disclaimer "This should not be released as a new version yet" -- we can do this "first-character-of-id" enforcement in a separate PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
!ASCII_DIGIT+ is a bit too permissive, this means you can start names with anything except digits.
I think what you meant to do here is ASCII_ALPHA ~ id_char+
This makes sense. Will update.
though I wonder if starting with _ is legal
It seems like it may be legal from the docs. Let me update this.
src/filter/grammar.pest
Outdated
label = { label_char+ } | ||
label_char = _{ ASCII_ALPHANUMERIC | " " | "_" | "-" } | ||
filter = _{ SOI ~ field ~ (field)* ~ EOI } | ||
field = { (index | prop) ~ (labels | index)? } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't pretend to be an expert at writing a grammar, but I think this dual appearance of index
in this field
rule is getting a bit messy. I'm kinda curious if your entire change to the grammar could have been something like:
index = _{ "[" ~ (string_index | numeric_index) ~ "]" }
string_index = _{ quote ~ quoted_name ~ quote }
numeric_index = { ASCII_DIGIT+ }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also my first time working with grammars 😅
Let me mess around with this and see if there's an alternative that is more simple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For field
, what would a case of index ~ labels
look like? Would it be something like .attr["index"]{"label"}
? I'm wondering if we can drop the initial index
all together and just have
field = { (index | prop) ~ (labels | index)? } | |
field = { prop ~ (labels | index)? } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
long winded explanation...
the field
here is the same as the Field
type, it has a name
and labels
that struct even has a block comment on it:
/// one segment of a filter ([`parse_filter`] returns `Vec<Field>`)
///
/// e.g. for the filter `'.foo{"bar"}.baz'` there are two segments:
///
/// * the name "foo" and the label "bar"
/// * the name "baz"
a name
gets set one of two ways:
- as a "prop" - it's a dot followed by a bare word identifier like
.foo
- as an "index" - it's got square brackets and quotes, like
["foo"]
so if you have some HCL like:
thing "labeled" {
with_var = 3
}
then you can query a few different ways that are all valid and yield the same result:
hq '.thing.with_var'
hq '.thing{"labeled"}.with_var'
hq '["thing"]{"labeled"}.with_var'
hq '["thing"]{"labeled"}["with_var"]
for all of these there are exactly two Field
s: the one with name: "thing"
and the one with name: "with_var"
(and the first field sometimes has labels: vec!["labeled"]
)
but to get back to your question...
I'm wondering if we can drop the initial
index
all together
I think the answer is yes, because the "prop" vs "index" distinction is mostly there to support map keys, which can't be at the top level of an HCL document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the answer is yes
that said, do you mind changing that in its own separate PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by the way... I don't think changing the field
rule to this:
field = { prop ~ (labels | index)? }
is exactly the same as what you proposed or I agreed to.
I think what we're saying is that the first field is:
first_field = { prop ~ labels? }
and all other fields that follow are:
field = { (index | prop) ~ labels? }
so the filter
rule should be:
filter = _{ SOI ~ first_field ~ (field)* ~ EOI }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I can open up a separate PR for that. If we move the ability for a field to start with a label, would this be fine?
field = { (index | prop) ~ (labels | index)? } | |
field = { prop ~ (labels | index)? } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we move the ability for a field to start with a label
I don't know what you mean...
(and sorry, time's up tonight. I'll look tomorrow)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will make the requested changes. Going to mark this as a draft for now until I can address the comments later today.
src/filter/grammar.pest
Outdated
label = { label_char+ } | ||
label_char = _{ ASCII_ALPHANUMERIC | " " | "_" | "-" } | ||
filter = _{ SOI ~ field ~ (field)* ~ EOI } | ||
field = { (index | prop) ~ (labels | index)? } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also my first time working with grammars 😅
Let me mess around with this and see if there's an alternative that is more simple.
src/filter/grammar.pest
Outdated
quoted_name = { quoted_char+ } | ||
quoted_char = _{ !(quote) ~ ANY } | ||
prop = _{ "." ~ name } | ||
name = { !ASCII_DIGIT+ ~ id_char+ } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
!ASCII_DIGIT+ is a bit too permissive, this means you can start names with anything except digits.
I think what you meant to do here is ASCII_ALPHA ~ id_char+
This makes sense. Will update.
though I wonder if starting with _ is legal
It seems like it may be legal from the docs. Let me update this.
no rush! I like the exploration happening here 🙂 |
sorry, I've given your branch conflicts... |
No worries! |
As discussed here[^1], fields should not be able to start with labels. Fields should specifically start with a prop which is optionally followed by a label. [^1]: miller-time#24 (comment)
As discussed here[^1], fields should not be able to start with labels. Fields should specifically start with a prop which is optionally followed by a label. [^1]: miller-time#24 (comment)
The intention of this is to eventually allow folks to query and delete specific indicies within a list type. Additionally, this adds further restrictions to prevent parsing properties and names that start with digits which is not allowed in HCL. This should not be released as a new version yet as it is not used in any of the subcommands. I mainly opened this PR separate from the implementation to get feedback before I spend time on the implementation.
20e50c6
to
538058e
Compare
As discussed here[^1], fields should not be able to start with labels. Fields should specifically start with a prop which is optionally followed by a label. [^1]: miller-time#24 (comment)
The intention of this is to eventually allow folks to query and delete
specific indicies within a list type. Additionally, this adds further
restrictions to prevent parsing properties and names that start with
digits which is not allowed in HCL.
This should not be released as a new version yet as it is not used
in any of the subcommands. I mainly opened this PR separate from
the implementation to get feedback before I spend time on the
implementation.