Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a new ParserExtra::State: Recorder trait #681

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jyn514
Copy link

@jyn514 jyn514 commented Oct 18, 2024

This allows using imperative concrete syntax tree parsers like rowan and cstree. In particular those libraries want to know every token that is parsed, and need to know when chumsky backtracks and reparses the same tokens again.

This adds the following new API surface:

pub trait Recorder<'a, I: Input<'a>>: Default {
    type SaveMarker: Copy + Clone;
    fn on_token(&mut self, token: I::Token);
    fn on_save<'parse>(&self, offset: I::Offset) -> Self::SaveMarker;
    fn on_rewind<'parse>(&mut self, marker: Marker<'a, 'parse, I, Self::SaveMarker>);
}
pub struct SimpleState<T>(pub T);
impl<'a, T, I: Input<'a>> Recorder<'a, I> for SimpleState<T>;
impl<T> DerefMut<Target = T> for SimpleState<T>;
impl<T> From<T> for SimpleState<T>;

and additionally now requires ParserExtra::State: Recorder.

i didn't write any of my own tests yet but my own parser that hooks chumsky up to cstree does work: spreadsheet-lang/spreadsheet@1b97313#diff-14962dfdd56397d99e07041d6337cb5b0327f4d7433d80fd3383955415014910R111

Fixes #96

This allows using imperative concrete syntax tree parsers like `rowan` and `cstree`.
In particular those libraries want to know every token that is parsed, and need to know when chumsky backtracks and reparses the same tokens again.

This adds the following new API surface:
```rust
pub trait Recorder<'a, I: Input<'a>>: Default {
    type SaveMarker: Copy + Clone;
    fn on_token(&mut self, token: I::Token);
    fn on_save<'parse>(&self, offset: I::Offset) -> Self::SaveMarker;
    fn on_rewind<'parse>(&mut self, marker: Marker<'a, 'parse, I, Self::SaveMarker>);
}
pub struct SimpleState<T>(pub T);
impl<'a, T, I: Input<'a>> Recorder<'a, I> for SimpleState<T>;
impl<T> DerefMut<Target = T> for SimpleState<T>;
impl<T> From<T> for SimpleState<T>;
```
and additionally now requires `ParserExtra::State: Recorder`.
@jyn514 jyn514 changed the title wip: Add [ParserExtra::State::Recorder] Add a new ParserExtra::State: Recorder trait Oct 19, 2024
@jyn514 jyn514 marked this pull request as ready for review October 19, 2024 03:42

/// A type that receives event hooks when certain parsing actions occur.
///
/// Unlike [`Parser::State`], this receives information about the incoming actions.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line seems irrelevant now.

#[allow(unused)] // for intra-doc links
use crate::Parser;

/// A type that receives event hooks when certain parsing actions occur.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd be tempted to just call this type State and remove the existing type alias(es) (State, Context, and Err). Recorder seems too specific a name for what this might eventually turn into.

type SaveMarker: Copy + Clone;

/// This function is called when a new token is read from the input stream.
fn on_token(&mut self, token: I::Token);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this ever getting invoked? I assume not, since this sadly doesn't generalise to by-ref and by-value inputs. The correct thing would be token: I::TokenMaybe.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, the place to call it is in these functions:

  • InputRef::skip_while
  • InputRef::next_inner
  • InputRef::next_maybe_inner
  • InputRef::next_ref_inner

Comment on lines +67 to +71
impl<T> From<T> for SimpleState<T> {
fn from(value: T) -> Self {
Self(value)
}
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably also add From<SimpleState<T>> for T (does this break the orphan rules nowadays, or will it need to be Into? I can't remember).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support integration with cstree
2 participants