Restructured how commands like `list`, `help`, `quit` are parsed to implement `save` #555

rben01 · 2024-09-07T19:19:27Z

Closes #187

To implement the save command, I implemented a simple command parser. This handles the following commands:

help
info
clear
quit / exit
list / ls
save (new)

These were formerly implemented as a bespoke command matcher, matching on e.g., "list functions" | "ls functions". Now they are fully parsed, which both enables more flexibility such as additional whitespace between arguments, and error reporting in the event that a command is misused. For instance, before, list foo would result in

>>> list foo
error: while type checking
  ┌─ <input:2>:1:1
  │
1 │ list foo
  │ ^^^^ unknown identifier
  │
  = Did you mean 'pint'?

Now it results in

>>> list foo
error: while parsing
  ┌─ <input:1>:1:6
  │
1 │ list foo
  │      ^^^ Invalid command: if provided, the argument to `list` or `ls` must be one of: functions, dimensions, variables, units

And

>>> help sin
error: while type checking
  ┌─ <input:4>:1:1
  │
1 │ help sin
  │ ^^^^ unknown identifier
  │
  = Did you mean 'hex'?

is now

>>> help sin
error: while parsing
  ┌─ <input:6>:1:6
  │
1 │ help sin
  │      ^^^ Invalid command: `help` takes 0 arguments; use `info <item>` for information about an item

Similar error reporting is implemented for the other commands.

But all of this was merely done to provide an easy way to add the save command — I didn't just want to continue down the path of ad-hoc string parsing. That said, currently the argument to save must have no whitespace characters in it, or else it will parse as multiple arguments. Nor does it process backslash-escape characters. I was debating whether to use Tokenizer to parse the argument the same way a String is parsed, but I figured I'd keep this PR small. That can be added later.

The save command now looks like

save hist.nbt

This will unconditionally overwrite the specified file if it exits. If writing fails, the error looks like

>>> save .
error: runtime error
 = Could not write to file: "."

Some improvements that might be desirable in the future:

Filenames with spaces, presumably using ordinary string syntax (which would also handle backslash escape sequences)
- Interpolation? Not sure how much utility this would have but it would be natural if using strings as an argument
Confirmation when overwriting an existing file
Omitting lines that errored. IIUC this would require some changes to how history is stored, since it only stores strings and not results
Save per-session history, not entire history over all sessions ever

…ible parsing (eg tolerant of arbitrary whitespace) Commands report `ParseError`s if the syntax is invalid (eg too many or not enough args) This required making `Resolver::add_code_source` public and adding `Context::resolver_mut` to track source code positions of commands

Added some helper functions for commands

…ands

sharkdp · 2024-09-08T10:46:32Z

Thank you very much for your contribution — great changeset!

I implemented a simple command parser

Thanks, this is something I had planned but never got around to it 😍. One huge benefit of your approach is that command parsing moves from the application (numbat-cli) to the library (numbat). This opens the possibility to offer command parsing / handling to other "frontends". For example, I think we could simplify this JS code as well:

numbat/numbat-wasm/www/index.js

Lines 41 to 76 in 8c4d563

    
           if (input_trimmed == "clear") { 
        
               this.clear(); 
        
               var output = ""; 
        
           } else if (input_trimmed == "reset") { 
        
               numbat = create_numbat_instance(); 
        
               numbat.interpret("use units::currencies"); 
        
               combined_input = ""; 
        
               updateUrlQuery(null); 
        
               this.clear(); 
        
           } else if (input_trimmed == "list" || input_trimmed == "ls") { 
        
               output = numbat.print_environment(); 
        
           } else if (input_trimmed == "list functions" || input_trimmed == "ls functions") { 
        
               output = numbat.print_functions(); 
        
           } else if (input_trimmed == "list dimensions" || input_trimmed == "ls dimensions") { 
        
               output = numbat.print_dimensions(); 
        
           } else if (input_trimmed == "list variables" || input_trimmed == "ls variables") { 
        
               output = numbat.print_variables(); 
        
           } else if (input_trimmed == "list units" || input_trimmed == "ls units") { 
        
               output = numbat.print_units(); 
        
           } else if (input_trimmed == "help" || input_trimmed == "?") { 
        
               output = numbat.help(); 
        
           } else { 
        
               var result = { is_error: false }; 
        
               if (input_trimmed.startsWith("info ")) { 
        
                   var keyword = input_trimmed.substring(4).trim(); 
        
                   output = numbat.print_info(keyword); 
        
               } else { 
        
                   result = numbat.interpret(input); 
        
                   output = result.output; 
        
               } 
        
               if (!result.is_error) { 
        
                   combined_input += input.trim() + "⏎"; 
        
                   updateUrlQuery(combined_input); 
        
               } 
        
           }

In order to reduce code duplication even more (among frontends), we could potentially even move part of the command handling to the backend (library) as well. The whole "list xyz" => call ctx.print_xyz() logic, for example.

One complication is the following: not all frontends support the same set of commands. save would not be supported by the Web frontend, for example. And some commands definitely need frontend-specific code. Like a future copy command (#394).

I'm not suggesting that this should happen in this PR, by the way. Only if you are interested.

Now they are fully parsed, which both enables more flexibility such as additional whitespace between arguments, and error reporting in the event that a command is misused.

Very nice

That said, currently the argument to save must have no whitespace characters in it, or else it will parse as multiple arguments. Nor does it process backslash-escape characters. I was debating whether to use Tokenizer to parse the argument the same way a String is parsed, but I figured I'd keep this PR small. That can be added later.

I agree.

This will unconditionally overwrite the specified file if it exits.

I think that's completely fine. And maybe even a better UX than asking (interactively) whether the file should be overwritten or not.

sharkdp · 2024-09-08T10:53:27Z

numbat/src/command.rs

+///   or kind of arguments, e.g. `list foobar`
+pub fn parse_command<'a>(
+    input: &'a str,
+    resolver: &mut Resolver,


Instead of passing in a Resolver reference, could we also pass in a code_source_id?

Yes, definitely. That is much cleaner.

I realize however that there is a bug in this PR, which is that for non-command inputs we end up running resolver.add_code_source twice, which results in line numbers being off by a factor of ~2 for erroneous non-command inputs. (The precise factor depends on how many commands we've run, since those only trigger resolve.add_code_source once.) I'm running into a bit of a chicken and egg problem: to report errors in commands requires a code source up front, but we don't know if we're even parsing a command until we parse at all. An obvious solution is to double-parse — check if we have a command at all (probably quite cheaply, using split_whitespace to just grab the first word) and then if we do, fall back to a full parse with a code_source_id. This way we only call resolve.add_code_source at most once per input. Does this sound good?

I realize however that there is a bug in this PR, which is that for non-command inputs we end up running resolver.add_code_source twice, which results in line numbers being off by a factor of ~2 for erroneous non-command inputs. (The precise factor depends on how many commands we've run, since those only trigger resolve.add_code_source once.)

Minor: it's not really the line number that is wrong, but the "code source ID", right? The number in <input:…> angle brackets.

I'm running into a bit of a chicken and egg problem: to report errors in commands requires a code source up front, but we don't know if we're even parsing a command until we parse at all. An obvious solution is to double-parse — check if we have a command at all (probably quite cheaply, using split_whitespace to just grab the first word) and then if we do, fall back to a full parse with a code_source_id. This way we only call resolve.add_code_source at most once per input. Does this sound good?

Hm. That would be a good reason to keep your old solution maybe? Or could you pass in a get_code_source_id callable that would only be executed in the error case?

Minor: it's not really the line number that is wrong, but the "code source ID", right? The number in input:… angle brackets.

Yes, that was the issue.

Hm. That would be a good reason to keep your old solution maybe? Or could you pass in a get_code_source_id callable that would only be executed in the error case?

My old solution never actually worked. What I've done in my latest commit is split the parser into a struct SourcelessCommandParser containing just the input, and then the CommandParser wraps that and a code_source_id. SourcelessCommandParser has a fallible initializer that only succeeds if we are in fact looking at a command line. Only if it succeeds do we go ahead and generate the code_source_id, and then we have everything we need to parse the command. The key is that you can't construct a CommandParser in the first place unless it's been proven that we have a valid command line (meaning it starts with a command word, not that all the arguments are ok), so we never unnecessarily generate a code_source_id.

In addition, now that I've stuck everything in CommandParser (thanks for that advice), initialization of the parser is split from parsing, which makes for an overall cleaner API (returning Result instead of Option<Result>)

sharkdp · 2024-09-08T10:57:13Z

numbat/src/command.rs

+
+/// For tracking spans. Contains `(start, start+len)` for each (whitespace-separated)
+/// word in the input
+fn get_word_boundaries(input: &str) -> Vec<(u32, u32)> {


Could this be implemented using str::split_whitespace?

I don't think split_whitespace keeps indices, which is what I need. I thought of using char_indices and splitting on a predicate, but that would require two Vecs which seemed like a needless allocation.

numbat/src/command.rs

Note that there is a bug where non-command inputs run `add_course_source` twice, which leads to incorrect line numbers

…s of it (much cleaner)

…and input How: split `CommandParser` into a second struct, `SourcelessCommandParser`, that contains just the input, no `code_source_id` `SourcelessCommandParser` has a fallible initializer that only returns `Some(Self)` if the line is indeed a command (ie starts with a command word) Then we simply check if this initializer succeeded; if so then make a new `code_source_id` and construct the full `CommandParser`, then parse the command

…chable!`) Made "list" report incorrect number of args before reporting invalid arg, in the event that both errors were present

numbat/src/command.rs

Replaced `ensure_zero_args!` macro with function, now that this is possible (since we can use `format!` instead of `concat!)

…t them from being removed when removing trailing whitespace on save

rben01 · 2024-09-08T19:34:52Z

One complication is the following: not all frontends support the same set of commands. save would not be supported by the Web frontend, for example. And some commands definitely need frontend-specific code. Like a future copy command (#394).

I will think about how to implement frontend-specific code for a separate PR. Where is the frontend we're using stored at runtime?

rben01 · 2024-09-11T15:22:28Z

Also, regarding this:

This will unconditionally overwrite the specified file if it exits.

I think that's completely fine. And maybe even a better UX than asking (interactively) whether the file should be overwritten or not.

I was also considering having save and save!, the former asking for confirmation if overwriting and the latter overwriting without confirmation. If desirable, then I should probably change save to save! in this PR and the come back and implement the save command in the future. This might be a desirable change anyway just to indicate that we're going to overwrite (exclamation points indicate convey that something scary is happening) — users may exercise a tad more caution if it's made clear that they aren't getting a second chance.

rben01 · 2024-09-11T15:29:41Z

In a future PR I'd also like to implement better discoverability of commands. Either help commands or list commands would be a good addition to actually list them all similar to what you'd get from cmd --help on the command line, and shouldn't be hard to do if we're willing to hard-code them, which seems perfectly reasonable. I would lean toward help commands just because list seems to be for runtime-specific values, whereas commands live in a totally separate namespace. I also think it'd make sense to list commands, or at least how to get a list of them, in help’s output. But again, future PR.

sharkdp · 2024-09-11T17:54:56Z

I was also considering having save and save!, the former asking for confirmation if overwriting and the latter overwriting without confirmation. If desirable, then I should probably change save to save! in this PR and the come back and implement the save command in the future. This might be a desirable change anyway just to indicate that we're going to overwrite (exclamation points indicate convey that something scary is happening) — users may exercise a tad more caution if it's made clear that they aren't getting a second chance.

I think I would vote for keeping things easy and just using save with a overwrite-default.

In a future PR I'd also like to implement better discoverability of commands. Either help commands or list commands would be a good addition to actually list them all similar to what you'd get from cmd --help on the command line, and shouldn't be hard to do if we're willing to hard-code them, which seems perfectly reasonable. I would lean toward help commands just because list seems to be for runtime-specific values, whereas commands live in a totally separate namespace. I also think it'd make sense to list commands, or at least how to get a list of them, in help’s output. But again, future PR.

Sounds good, but note that the commands depend on the frontend. Speaking of, we should also update the documentation:

https://numbat.dev/doc/cli-usage.html#commands
https://numbat.dev/doc/web-usage.html#commands

see the corresponding markdown files

sharkdp · 2024-09-13T20:21:49Z

Save per-session history, not entire history over all sessions ever

Oh, I just found out about this after trying it. I think this is something we should implement before we merge this. Saving the entire history ever (1000 lines for me) is not very helpful.

Omitting lines that errored. IIUC this would require some changes to how history is stored, since it only stores strings and not results

I don't think we need to do that for the global history. But for the local one (for the save command), that would be useful. We do something similar in the web frontend:

numbat/numbat-wasm/www/index.js

Lines 72 to 75 in b175c31

    
           if (!result.is_error) { 
        
               combined_input += input.trim() + "⏎"; 
        
               updateUrlQuery(combined_input); 
        
           }

rben01 · 2024-09-14T22:25:01Z

Oh, I just found out about this after trying it. I think this is something we should implement before we merge this. Saving the entire history ever (1000 lines for me) is not very helpful.

Shouldn't be hard to implement a per-session history. Two questions:

Should per-session history be a separate PR (to land before this one) or part of this one? This PR is already quite large, I'm thinking separate.
Should the implementation go in numbat or numbat-cli?

sharkdp · 2024-09-15T05:21:31Z

Sorry if that wasn't clear. I think ideally, the history as we have it right now on master should stay. This is helpful for Ctrl-R searching for commands from previous sessions.

But the save command should only store the commands of the current session. In order to turn a interactive session into a Numbat script that can be fine tuned in a text editor.

… reading prelude

… `run_repl` Therefore changed `parse_and_evalute` to return a struct of `control_flow, errored` to let session history know whether the command in question errored Added tests to `session_history.rs` Code seems overall not terrible now

…>` and renamed some items accordingly

numbat/src/command.rs

rben01 · 2024-09-18T16:26:10Z

numbat/src/command.rs

+                    return Err(self.err_through_end_from(2, "`list` takes at most one argument"));
+                }
+
+                let items = match items {


Any value to making this case insensitive?

I don't know. I'd leave it case sensitive for now.

…ands

Improved save logic: allowed for 0 arguments, in which case session history will be saved to history.nbt Printed message on successful save

…cy with CLI Updated web docs

…orphic but more self-descriptive)

sharkdp · 2024-09-24T13:32:08Z

Thank you very much! Great work.

rben01 and others added 8 commits September 7, 2024 13:30

Added .DS_Store to gitignore

47fd928

Added Info command

2d45512

Added some helper functions for commands

Tests for Command

5a0ff28

Continued REPL if save dst errored instead of just exiting

3e89134

Reduced some copy-pasted code

1318619

Merge branch 'master' into commands

9ef2f8b

Merge branch 'commands' of https://github.com/rben01/numbat into comm…

314ad41

…ands

sharkdp reviewed Sep 8, 2024

View reviewed changes

numbat/src/command.rs Outdated Show resolved Hide resolved

rben01 added 4 commits September 8, 2024 11:44

Passed code_source_id to parse_command instead of a Resolver

4615178

Note that there is a bug where non-command inputs run `add_course_source` twice, which leads to incorrect line numbers

Made struct CommandParser and reimplemented command parsing in term…

32bf547

…s of it (much cleaner)

Enum-ified command kind, making it strongly typed (no need for `unrea…

2723c14

…chable!`) Made "list" report incorrect number of args before reporting invalid arg, in the event that both errors were present

sharkdp reviewed Sep 8, 2024

View reviewed changes

numbat/src/command.rs Outdated Show resolved Hide resolved

numbat/src/command.rs Outdated Show resolved Hide resolved

sharkdp reviewed Sep 8, 2024

View reviewed changes

numbat/src/command.rs Outdated Show resolved Hide resolved

rben01 added 4 commits September 8, 2024 15:24

Changed InvalidCommand to take a String instead of &'static str

f929215

Replaced `ensure_zero_args!` macro with function, now that this is possible (since we can use `format!` instead of `concat!)

Replaced trailing space characters in two tests with \x20 to preven…

336eeca

…t them from being removed when removing trailing whitespace on save

Removed alias "ls" for "list" command

4cd8c56

Removed unused CommandParser::set_code_source_id

6fed904

rben01 added 2 commits September 15, 2024 00:36

Initial, simple implementation of saving per-session history

1f34d27

Moved session_history storage from Cli::repl into Cli::repl_loop

50dd0fd

rben01 added 5 commits September 15, 2024 16:53

Simplified history implementation, at the cost of one allocation when…

6b1a863

… reading prelude

Renamed session history file to history.nbt

ac1bd61

Replaced boolean errored with isomorphic but clearer `Result<(), ()…

8f2a804

…>` and renamed some items accordingly

Moved writeln! arg into format string

e4e6c37

rben01 commented Sep 18, 2024

View reviewed changes

numbat/src/command.rs Outdated Show resolved Hide resolved

rben01 commented Sep 18, 2024

View reviewed changes

numbat/src/command.rs Outdated Show resolved Hide resolved

rben01 commented Sep 18, 2024

View reviewed changes

numbat/src/command.rs Outdated Show resolved Hide resolved

rben01 commented Sep 18, 2024

View reviewed changes

numbat/src/command.rs Outdated Show resolved Hide resolved

rben01 commented Sep 18, 2024

View reviewed changes

rben01 and others added 7 commits September 21, 2024 19:12

Minor cleanup

efb7250

Restored accidentatlly removed alias ? for help

ea859dd

Merge branch 'sharkdp:master' into commands

20d6815

Merge branch 'commands' of https://github.com/rben01/numbat into comm…

3f987a2

…ands

Merge branch 'session-history' into commands

87b0812

Improved save logic: allowed for 0 arguments, in which case session history will be saved to history.nbt Printed message on successful save

Updated cli docs

b0361a9

Removed ls as alias for command list in web version for consisten…

8a278bc

…cy with CLI Updated web docs

rben01 mentioned this pull request Sep 22, 2024

Functionality to easily copy output of last calculation to clipboard #394

Open

rben01 and others added 2 commits September 22, 2024 11:07

Replaced CommandKind::new with impl FromStr for CommandKind (isom…

5c55c45

…orphic but more self-descriptive)

Merge branch 'master' into commands

ab4205e

sharkdp merged commit 21b09da into sharkdp:master Sep 24, 2024
15 checks passed

BrewTestBot mentioned this pull request Oct 11, 2024

numbat 1.14.0 Homebrew/homebrew-core#193820

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Restructured how commands like `list`, `help`, `quit` are parsed to implement `save` #555

Restructured how commands like `list`, `help`, `quit` are parsed to implement `save` #555

rben01 commented Sep 7, 2024 •

edited

Loading

sharkdp commented Sep 8, 2024

sharkdp Sep 8, 2024

rben01 Sep 8, 2024

sharkdp Sep 8, 2024

rben01 Sep 8, 2024 •

edited

Loading

sharkdp Sep 8, 2024

rben01 Sep 8, 2024

rben01 commented Sep 8, 2024

rben01 commented Sep 11, 2024

rben01 commented Sep 11, 2024

sharkdp commented Sep 11, 2024

sharkdp commented Sep 13, 2024 •

edited

Loading

rben01 commented Sep 14, 2024

sharkdp commented Sep 15, 2024

rben01 Sep 18, 2024

sharkdp Sep 24, 2024

sharkdp commented Sep 24, 2024

Restructured how commands like list, help, quit are parsed to implement save #555

Restructured how commands like list, help, quit are parsed to implement save #555

Conversation

rben01 commented Sep 7, 2024 • edited Loading

sharkdp commented Sep 8, 2024

sharkdp Sep 8, 2024

Choose a reason for hiding this comment

rben01 Sep 8, 2024

Choose a reason for hiding this comment

sharkdp Sep 8, 2024

Choose a reason for hiding this comment

rben01 Sep 8, 2024 • edited Loading

Choose a reason for hiding this comment

sharkdp Sep 8, 2024

Choose a reason for hiding this comment

rben01 Sep 8, 2024

Choose a reason for hiding this comment

rben01 commented Sep 8, 2024

rben01 commented Sep 11, 2024

rben01 commented Sep 11, 2024

sharkdp commented Sep 11, 2024

sharkdp commented Sep 13, 2024 • edited Loading

rben01 commented Sep 14, 2024

sharkdp commented Sep 15, 2024

rben01 Sep 18, 2024

Choose a reason for hiding this comment

sharkdp Sep 24, 2024

Choose a reason for hiding this comment

sharkdp commented Sep 24, 2024

Restructured how commands like `list`, `help`, `quit` are parsed to implement `save` #555

Restructured how commands like `list`, `help`, `quit` are parsed to implement `save` #555

rben01 commented Sep 7, 2024 •

edited

Loading

rben01 Sep 8, 2024 •

edited

Loading

sharkdp commented Sep 13, 2024 •

edited

Loading