How would you design a list parser that supports Oxford-comma format? #119

@ogregoire

Suppose a connector is either "and" or "or", and a list written in Oxford-comma format needs to be parsed. The elements themselves are parsed by a custom parser.

It should be able to parse the following use cases:

"a"              -> just an element
"a and b"        -> two elements, connected with "and"
"a or b"         -> two elements, connected with "or"
"a, b, and c"    -> three elements, connected with "and"
"a, b, c, or d"  -> four elements, connected with "or"
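
To pin down the expected results independently of dot-parse, the cases above can be sketched as a plain-Java reference check. This is only an illustration: `OxfordListSpec`, `Result`, and the simplification that every element is a single whitespace-free word are hypothetical, and nothing here uses the dot-parse API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical plain-Java reference for the cases above; not dot-parse code. */
final class OxfordListSpec {
    enum Connector { AND, OR }

    /** Elements in order, plus the connector (null when there is a single element). */
    static final class Result {
        final List<String> elements;
        final Connector connector;

        Result(List<String> elements, Connector connector) {
            this.elements = List.copyOf(elements);
            this.connector = connector;
        }
    }

    static Result parse(String input) {
        String[] tokens = input.trim().split("\\s+");
        int n = tokens.length;
        if (n == 1) {
            // "a": a lone element, no connector.
            return new Result(List.of(plain(tokens[0])), null);
        }
        if (n < 3) {
            throw new IllegalArgumentException("Not a valid list: " + input);
        }
        // The connector always sits just before the last element.
        Connector connector = connector(tokens[n - 2]);
        List<String> elements = new ArrayList<>();
        if (n == 3) {
            // "a and b": a pair takes no comma.
            elements.add(plain(tokens[0]));
        } else {
            // "a, b, and c": every element before the connector ends with a comma.
            for (int i = 0; i < n - 2; i++) {
                elements.add(withComma(tokens[i]));
            }
        }
        elements.add(plain(tokens[n - 1]));
        return new Result(elements, connector);
    }

    private static String plain(String token) {
        if (token.isEmpty() || token.contains(",")) {
            throw new IllegalArgumentException("Unexpected element: '" + token + "'");
        }
        return token;
    }

    private static String withComma(String token) {
        if (!token.endsWith(",")) {
            throw new IllegalArgumentException("Expected a trailing comma: '" + token + "'");
        }
        return plain(token.substring(0, token.length() - 1));
    }

    private static Connector connector(String token) {
        switch (token) {
            case "and": return Connector.AND;
            case "or": return Connector.OR;
            default: throw new IllegalArgumentException("Expected 'and' or 'or': '" + token + "'");
        }
    }
}
```

The sketch rejects inputs such as `"a, b and c"` (missing Oxford comma) and `"a, and b"` (comma in a pair), which is the behavior the dot-parse version should match.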

I have a hard time designing the most appropriate parser. This is what I came up with:

    private static final Parser<Character> COMMA = one(',');
    private static final Parser<Connector> CONNECTOR = anyOf(
            word("and").thenReturn(Connector.AND),
            word("or").thenReturn(Connector.OR));

    static <T> Parser<JoinedList<T>> list(Parser<T> element) {
        Parser<JoinedList<T>> threeOrMoreTail = COMMA.then(sequence(
                element.followedBy(COMMA).atLeastOnce(),
                CONNECTOR,
                element,
                (middle, conn, last) -> new JoinedList<T>()
                        .addAll(middle)
                        .connector(conn)
                        .add(last)));
        Parser<JoinedList<T>> pairTail = sequence(
                CONNECTOR,
                element,
                (conn, last) -> new JoinedList<T>()
                        .connector(conn)
                        .add(last));
        return element.map(first -> new JoinedList<T>().add(first))
                .optionallyFollowedBy(anyOf(threeOrMoreTail, pairTail), JoinedList::merge);
    }

JoinedList is an Iterable helper that lets me build a list fluently; it also stores the connector, which is an enum {AND, OR}.
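
For context, a minimal sketch of what a class like JoinedList could look like, inferred only from how it is used in the snippet above. The field names, the `connector()` getter, and the `merge` semantics (the tail's connector wins when it is set) are assumptions, not the actual class.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

enum Connector { AND, OR }

/** Hypothetical fluent list builder that also remembers its connector. */
final class JoinedList<T> implements Iterable<T> {
    private final List<T> elements = new ArrayList<>();
    private Connector connector; // stays null for a single-element list

    JoinedList<T> add(T element) {
        elements.add(element);
        return this;
    }

    JoinedList<T> addAll(Iterable<? extends T> more) {
        for (T element : more) {
            elements.add(element);
        }
        return this;
    }

    JoinedList<T> connector(Connector connector) {
        this.connector = connector;
        return this;
    }

    Connector connector() {
        return connector;
    }

    /** Appends the tail's elements; assumed to adopt the tail's connector if it has one. */
    JoinedList<T> merge(JoinedList<T> tail) {
        addAll(tail.elements);
        if (tail.connector != null) {
            this.connector = tail.connector;
        }
        return this;
    }

    @Override
    public Iterator<T> iterator() {
        // Read-only view so callers cannot remove elements while iterating.
        return Collections.unmodifiableList(elements).iterator();
    }
}
```

With this shape, `JoinedList::merge` fits as a two-argument combiner: the head list built from the first element receives the tail built by either `threeOrMoreTail` or `pairTail`.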

I tried to use .atLeastOnceDelimitedBy(), but that felt awkward. How would you parse it most idiomatically with dot-parse?
