-
-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restructured how commands like list
, help
, quit
are parsed to implement save
#555
Conversation
…ible parsing (eg tolerant of arbitrary whitespace) Commands report `ParseError`s if the syntax is invalid (eg too many or not enough args) This required making `Resolver::add_code_source` public and adding `Context::resolver_mut` to track source code positions of commands
Added some helper functions for commands
Thank you very much for your contribution — great changeset!
Thanks, this is something I had planned but never got around to it 😍. One huge benefit of your approach is that command parsing moves from the application (numbat-cli) to the library (numbat). This opens the possibility to offer command parsing / handling to other "frontends". For example, I think we could simplify this JS code as well: numbat/numbat-wasm/www/index.js Lines 41 to 76 in 8c4d563
In order to reduce code duplication even more (among frontends), we could potentially even move part of the command handling to the backend (library) as well. The whole "list xyz" => call ctx.print_xyz() logic, for example. One complication is the following: not all frontends support the same set of commands. I'm not suggesting that this should happen in this PR, by the way. Only if you are interested.
Very nice
I agree.
I think that's completely fine. And maybe even a better UX than asking (interactively) whether the file should be overwritten or not. |
numbat/src/command.rs
Outdated
/// or kind of arguments, e.g. `list foobar` | ||
pub fn parse_command<'a>( | ||
input: &'a str, | ||
resolver: &mut Resolver, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of passing in a Resolver
reference, could we also pass in a code_source_id?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, definitely. That is much cleaner.
I realize however that there is a bug in this PR, which is that for non-command inputs we end up running resolver.add_code_source
twice, which results in line numbers being off by a factor of ~2 for erroneous non-command inputs. (The precise factor depends on how many commands we've run, since those only trigger resolve.add_code_source
once.) I'm running into a bit of a chicken and egg problem: to report errors in commands requires a code source up front, but we don't know if we're even parsing a command until we parse at all. An obvious solution is to double-parse — check if we have a command at all (probably quite cheaply, using split_whitespace
to just grab the first word) and then if we do, fall back to a full parse with a code_source_id
. This way we only call resolve.add_code_source
at most once per input. Does this sound good?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize however that there is a bug in this PR, which is that for non-command inputs we end up running
resolver.add_code_source
twice, which results in line numbers being off by a factor of ~2 for erroneous non-command inputs. (The precise factor depends on how many commands we've run, since those only triggerresolve.add_code_source
once.)
Minor: it's not really the line number that is wrong, but the "code source ID", right? The number in <input:…>
angle brackets.
I'm running into a bit of a chicken and egg problem: to report errors in commands requires a code source up front, but we don't know if we're even parsing a command until we parse at all. An obvious solution is to double-parse — check if we have a command at all (probably quite cheaply, using
split_whitespace
to just grab the first word) and then if we do, fall back to a full parse with acode_source_id
. This way we only callresolve.add_code_source
at most once per input. Does this sound good?
Hm. That would be a good reason to keep your old solution maybe? Or could you pass in a get_code_source_id
callable that would only be executed in the error case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: it's not really the line number that is wrong, but the "code source ID", right? The number in input:… angle brackets.
Yes, that was the issue.
Hm. That would be a good reason to keep your old solution maybe? Or could you pass in a get_code_source_id callable that would only be executed in the error case?
My old solution never actually worked. What I've done in my latest commit is split the parser into a struct SourcelessCommandParser
containing just the input, and then the CommandParser
wraps that and a code_source_id
. SourcelessCommandParser
has a fallible initializer that only succeeds if we are in fact looking at a command line. Only if it succeeds do we go ahead and generate the code_source_id
, and then we have everything we need to parse the command. The key is that you can't construct a CommandParser
in the first place unless it's been proven that we have a valid command line (meaning it starts with a command word, not that all the arguments are ok), so we never unnecessarily generate a code_source_id
.
In addition, now that I've stuck everything in CommandParser
(thanks for that advice), initialization of the parser is split from parsing, which makes for an overall cleaner API (returning Result
instead of Option<Result>
)
numbat/src/command.rs
Outdated
|
||
/// For tracking spans. Contains `(start, start+len)` for each (whitespace-separated) | ||
/// word in the input | ||
fn get_word_boundaries(input: &str) -> Vec<(u32, u32)> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be implemented using str::split_whitespace
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think split_whitespace
keeps indices, which is what I need. I thought of using char_indices
and splitting on a predicate, but that would require two Vec
s which seemed like a needless allocation.
Note that there is a bug where non-command inputs run `add_course_source` twice, which leads to incorrect line numbers
…s of it (much cleaner)
…and input How: split `CommandParser` into a second struct, `SourcelessCommandParser`, that contains just the input, no `code_source_id` `SourcelessCommandParser` has a fallible initializer that only returns `Some(Self)` if the line is indeed a command (ie starts with a command word) Then we simply check if this initializer succeeded; if so then make a new `code_source_id` and construct the full `CommandParser`, then parse the command
…chable!`) Made "list" report incorrect number of args before reporting invalid arg, in the event that both errors were present
Replaced `ensure_zero_args!` macro with function, now that this is possible (since we can use `format!` instead of `concat!)
…t them from being removed when removing trailing whitespace on save
I will think about how to implement frontend-specific code for a separate PR. Where is the frontend we're using stored at runtime? |
Also, regarding this:
I was also considering having |
In a future PR I'd also like to implement better discoverability of commands. Either |
I think I would vote for keeping things easy and just using
Sounds good, but note that the commands depend on the frontend. Speaking of, we should also update the documentation: https://numbat.dev/doc/cli-usage.html#commands see the corresponding markdown files |
Oh, I just found out about this after trying it. I think this is something we should implement before we merge this. Saving the entire history ever (1000 lines for me) is not very helpful.
I don't think we need to do that for the global history. But for the local one (for the save command), that would be useful. We do something similar in the web frontend: numbat/numbat-wasm/www/index.js Lines 72 to 75 in b175c31
|
Shouldn't be hard to implement a per-session history. Two questions:
|
Sorry if that wasn't clear. I think ideally, the history as we have it right now on master should stay. This is helpful for Ctrl-R searching for commands from previous sessions. But the save command should only store the commands of the current session. In order to turn a interactive session into a Numbat script that can be fine tuned in a text editor. |
… `run_repl` Therefore changed `parse_and_evalute` to return a struct of `control_flow, errored` to let session history know whether the command in question errored Added tests to `session_history.rs` Code seems overall not terrible now
…>` and renamed some items accordingly
return Err(self.err_through_end_from(2, "`list` takes at most one argument")); | ||
} | ||
|
||
let items = match items { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any value to making this case insensitive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know. I'd leave it case sensitive for now.
Improved save logic: allowed for 0 arguments, in which case session history will be saved to history.nbt Printed message on successful save
…cy with CLI Updated web docs
…orphic but more self-descriptive)
Thank you very much! Great work. |
Closes #187
To implement the
save
command, I implemented a simple command parser. This handles the following commands:help
info
clear
quit
/exit
list
/ls
save
(new)These were formerly implemented as a bespoke command matcher, matching on e.g.,
"list functions" | "ls functions"
. Now they are fully parsed, which both enables more flexibility such as additional whitespace between arguments, and error reporting in the event that a command is misused. For instance, before,list foo
would result inNow it results in
And
is now
Similar error reporting is implemented for the other commands.
But all of this was merely done to provide an easy way to add the
save
command — I didn't just want to continue down the path of ad-hoc string parsing. That said, currently the argument tosave
must have no whitespace characters in it, or else it will parse as multiple arguments. Nor does it process backslash-escape characters. I was debating whether to use Tokenizer to parse the argument the same way a String is parsed, but I figured I'd keep this PR small. That can be added later.The
save
command now looks likeThis will unconditionally overwrite the specified file if it exits. If writing fails, the error looks like
Some improvements that might be desirable in the future: