feat(Parser) Add monadic parser combinator #79

Open · wants to merge 3 commits into main

gvergnaud (Owner) commented Mar 7, 2023

Since one type-level parser implementation is worth a thousand words, here is how to implement a JSON parser:

type JSON = P.OneOf<[NullLit, StrLit, NumLit, Obj, Arr]>;

type NumLit = P.Number;

type NullLit = P.Do<[P.Literal<"null">, P.Ap<Constant<null>, []>]>;

type StrLit = P.Do<
  [
    P.Literal<'"'>,
    P.Let<"key", P.Word>,
    P.Literal<'"'>,
    P.Ap<Identity, ["key"]>
  ]
>;

type Obj = P.Do<
  [
    P.Literal<"{">,
    P.Optional<P.WhiteSpaces>,
    P.Let<"key", StrLit>,
    P.Optional<P.WhiteSpaces>,
    P.Literal<":">,
    P.Optional<P.WhiteSpaces>,
    P.Let<"value", JSON>,
    P.Optional<P.WhiteSpaces>,
    P.Literal<"}">,
    P.Ap<
      Compose<[Objects.FromEntries, Objects.Create<[arg0, arg1]>]>,
      ["key", "value"]
    >
  ]
>;

type CSV = P.Do<
  [
    P.Let<"first", JSON>,
    P.Optional<P.WhiteSpaces>,
    P.Let<
      "rest",
      P.Optional<
        P.Do<[P.Literal<",">, P.Optional<P.WhiteSpaces>, P.Return<CSV>]>,
        []
      >
    >,
    P.Ap<Tuples.Prepend, ["first", "rest"]>
  ]
>;

type Arr = P.Do<
  [
    P.Literal<"[">,
    P.Optional<P.WhiteSpaces>,
    P.Let<"values", CSV>,
    P.Optional<P.WhiteSpaces>,
    P.Literal<"]">,
    P.Ap<Identity, ["values"]>
  ]
>;

And here is how to use it:

type res1 = Call<JSON, '[{ "key": "hello" }]'>;
//    ^? ["", P.Ok<[{ key: "hello" }]>] 

type res2 = Call<JSON, '["a", "b"]'>;
//    ^? ["", P.Ok<["a", "b"]>] 

type res3 = Call<JSON, '["a", null]'>;
//    ^? ["", P.Ok<["a", null]>] 

type res4 = Call<JSON, "null">;
//    ^? ["", P.Ok<null>] 

type res5 = Call<
  //  ^? ["", P.Ok<[null]>] 
  Arr,
  "[null]"
>;

type res6 = Call<
  //  ^? ["", P.Ok<[1, 2, 3]>] 
  CSV,
  "1 ,2  ,  3"
>;

type res7 = Call<
  //  ^? ["", P.Ok<[1, null, 3]>] 
  CSV,
  "1 ,null  ,  3"
>;

type res8 = Call<
  //  ^? ["", P.Ok<[1, null, "str"]>] 
  CSV,
  '1 ,null  ,  "str"'
>;

type res9 = Call<
  //  ^? ["", P.Ok<[1, null, "str"]>] 
  JSON,
  '[1 ,null  ,  "str"]'
>;

type res10 = Call<
  //  ^? ["", P.Ok<[1, { a: { b: "hello" } }, "str"]>] 
  JSON,
  '[1, { "a": { "b": "hello" } },  "str"]'
>;
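
For readers who find the type-level version dense, the same sequencing idea can be sketched at the value level. This is only a runtime analogue, not the PR's implementation: `literal`, `word`, `bind`, and `sequence` are hypothetical names standing in for the roles of P.Literal, P.Word, P.Let, and P.Do with a final P.Ap.

```typescript
// Runtime analogue of the type-level parsers above. Every name here is
// hypothetical and is not part of the PR's API.
type Result<T> =
  | { ok: true; value: T; rest: string }
  | { ok: false; rest: string };
type Parser<T> = (input: string) => Result<T>;

// Match an exact string at the start of the input.
const literal = (s: string): Parser<string> => (input) =>
  input.startsWith(s)
    ? { ok: true, value: s, rest: input.slice(s.length) }
    : { ok: false, rest: input };

// Match one or more ASCII letters.
const word: Parser<string> = (input) => {
  const m = /^[a-zA-Z]+/.exec(input);
  return m
    ? { ok: true, value: m[0], rest: input.slice(m[0].length) }
    : { ok: false, rest: input };
};

// A step either runs a parser and discards its value, or binds it to a name.
type Step = Parser<unknown> | { name: string; parser: Parser<unknown> };
const bind = (name: string, parser: Parser<unknown>): Step => ({ name, parser });

// Run the steps in order, threading the remaining input and stopping at the
// first failure; then build the final value from the named bindings.
const sequence =
  <T>(steps: Step[], build: (env: Record<string, unknown>) => T): Parser<T> =>
  (input) => {
    const env: Record<string, unknown> = {};
    let rest = input;
    for (const step of steps) {
      const parser = typeof step === "function" ? step : step.parser;
      const r = parser(rest);
      if (!r.ok) return { ok: false, rest: input };
      if (typeof step !== "function") env[step.name] = r.value;
      rest = r.rest;
    }
    return { ok: true, value: build(env), rest };
  };

// Same shape as StrLit: quote, bound word, quote, then return the binding.
const strLit = sequence(
  [literal('"'), bind("key", word), literal('"')],
  (env) => env.key as string
);
```

Calling `strLit('"hello" tail')` succeeds with the value `hello` and leaves ` tail` unconsumed, while an input without the opening quote fails without consuming anything.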

@gvergnaud gvergnaud self-assigned this Mar 7, 2023
@gvergnaud gvergnaud force-pushed the gvergnaud/parser-combinator-test branch from 2aa0ccb to 2177347 Compare March 7, 2023 09:26
ecyrbe (Collaborator) commented Mar 7, 2023

Awesome! 🎉

I love the Do, Let, and Ap utils; they make everything simple to grasp. I think nothing would prevent my implementation from having them too, since it only requires letting Sequence accept a Fn that returns a Parser.

Our implementations are almost the same. The only differences are:

  • your parser is a naked function that returns a tuple
  • my parser:
    • is an augmented function that returns an object
    • adds a name to basic parsers
    • remembers its parameters
    • the last two above add automatic introspection, while your implementation needs to wrap every combinator with a mapError to create meaningful messages.
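
To make the comparison concrete, here is a rough value-level sketch of the two styles. The shapes and names (`NakedParser`, `IntrospectableParser`, `withMeta`, `label`, `params`) are assumptions made up for illustration; neither is copied from either implementation.

```typescript
// Hypothetical shapes, for comparison only.

// Style A: a "naked" parser returning a [rest, result] tuple.
type TupleResult<T> = [
  rest: string,
  result: { ok: true; value: T } | { ok: false; error: string }
];
type NakedParser<T> = (input: string) => TupleResult<T>;

const nakedLiteral = (s: string): NakedParser<string> => (input) =>
  input.startsWith(s)
    ? [input.slice(s.length), { ok: true, value: s }]
    : [input, { ok: false, error: "no match" }];

// With naked parsers, a readable message has to be attached by hand.
const mapError = <T>(p: NakedParser<T>, message: string): NakedParser<T> =>
  (input) => {
    const [rest, result] = p(input);
    return result.ok ? [rest, result] : [rest, { ok: false, error: message }];
  };

// Style B: a function augmented with a label and its parameters,
// returning an object. The error message is derived from that metadata.
type ObjectResult<T> =
  | { success: true; value: T; rest: string }
  | { success: false; error: string; rest: string };

interface IntrospectableParser<T> {
  (input: string): ObjectResult<T>;
  label: string;
  params: readonly unknown[];
}

const withMeta = <T>(
  label: string,
  params: readonly unknown[],
  match: (input: string) => { value: T; rest: string } | null
): IntrospectableParser<T> =>
  Object.assign(
    (input: string): ObjectResult<T> => {
      const m = match(input);
      if (m) return { success: true, value: m.value, rest: m.rest };
      const args = params.map((p) => JSON.stringify(p)).join(", ");
      return { success: false, error: `expected ${label}(${args})`, rest: input };
    },
    { label, params }
  );

const describedLiteral = (s: string): IntrospectableParser<string> =>
  withMeta("literal", [s], (input) =>
    input.startsWith(s) ? { value: s, rest: input.slice(s.length) } : null
  );
```

In this sketch, `describedLiteral("null")("nul")` reports `expected literal("null")` with no extra wrapping, whereas the tuple style only says so once `mapError(nakedLiteral("null"), "expected null")` is applied.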

Whether introspection is nice to have or not is not a big concern for me; I could let this feature go. I personally think it allows easier error handling out of the box, but maybe adding mapError to customize errors is fine.

As for object vs. tuple as the parser's return type, I prefer readability for the user. I think an object is also faster, but maybe I'm wrong.

Here are some minor things that are easily fixable:

  • Number and word matching built from literals should be wrapped in a mapError to create a meaningful message. The current implementation yields a cryptic union of all possible literals, which is not user friendly.
  • In general, we should wrap all common combinators with a mapError to produce better error messages
  • Words should match the common definition: [a-zA-Z_][0-9a-zA-Z]*
  • implement more combinators (see my list)
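
As a small runtime illustration of the word rule and of a readable error, reading the character class as `[a-zA-Z_][0-9a-zA-Z]*`, something like the following could work. `parseWord` and `WordResult` are hypothetical names, and the real change would have to be made at the type level.

```typescript
// Hypothetical runtime sketch; `parseWord` is not part of either implementation.
// First character: a letter or underscore. Following characters: letters or digits.
const WORD_PATTERN = /^[a-zA-Z_][0-9a-zA-Z]*/;

type WordResult =
  | { ok: true; value: string; rest: string }
  | { ok: false; error: string; rest: string };

const parseWord = (input: string): WordResult => {
  const match = WORD_PATTERN.exec(input);
  if (!match) {
    // One readable message instead of a union of every accepted character.
    const found = input.length > 0 ? JSON.stringify(input[0]) : "end of input";
    return { ok: false, error: `expected a word, found ${found}`, rest: input };
  }
  return { ok: true, value: match[0], rest: input.slice(match[0].length) };
};
```

With this definition a leading digit is rejected, so `parseWord("1abc")` fails with a single message naming the offending character.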

But instead of diverging further, I think we should try to merge our implementations. I don't like splitting the effort here since both are almost identical.
