Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[performance] can we avoid loading grammar generator for every reified type? #1882

Closed
DavyLandman opened this issue Nov 2, 2023 · 5 comments

Comments

@DavyLandman
Copy link
Member

If we type in the following:

rascal>data X = xx();
ok
rascal>#X
Loading parser generator
Loading parser generator
Loading module |std:///lang/rascal/grammar/ParserGenerator.rsc|
Loading module |std:///lang/rascal/grammar/definition/Regular.rsc|
Loading module |std:///Grammar.rsc|
Loading module |std:///ParseTree.rsc|
Loading module |std:///Message.rsc|
Loading module |std:///Type.rsc|
Loading module |std:///List.rsc|
Loading module |std:///Map.rsc|
Loading module |std:///IO.rsc|
Loading module |std:///Exception.rsc|
Loading module |std:///lang/rascal/grammar/definition/Productions.rsc|
Loading module |std:///lang/rascal/syntax/Rascal.rsc|
Loading module |std:///lang/rascal/grammar/definition/Attributes.rsc|
Loading module |std:///lang/rascal/grammar/definition/Literals.rsc|
Loading module |std:///String.rsc|
Loading module |std:///ValueIO.rsc|
Loading module |std:///util/Maybe.rsc|
Loading module |std:///lang/rascal/grammar/definition/Symbols.rsc|
Loading module |std:///lang/rascal/grammar/definition/Characters.rsc|
Loading module |std:///lang/rascal/grammar/definition/Names.rsc|
Loading module |std:///lang/rascal/grammar/definition/Modules.rsc|
Loading module |std:///lang/rascal/grammar/definition/Layout.rsc|
Loading module |std:///Set.rsc|
Loading module |std:///util/Math.rsc|
Loading module |std:///lang/rascal/grammar/definition/Keywords.rsc|
Loading module |std:///Node.rsc|
Loading module |std:///lang/rascal/grammar/definition/Parameters.rsc|
Loading module |std:///lang/rascal/grammar/definition/Priorities.rsc|
Loading module |std:///lang/rascal/format/Grammar.rsc|
Loading module |std:///analysis/grammars/Dependency.rsc|
Loading module |std:///analysis/graphs/Graph.rsc|
Loading module |std:///Relation.rsc|
Loading module |std:///lang/rascal/format/Escape.rsc|
Loading module |std:///lang/rascal/grammar/definition/References.rsc|
Loading module |std:///lang/rascal/grammar/ConcreteSyntax.rsc|
Loading module |std:///util/Monitor.rsc|
Loading module |std:///lang/rascal/grammar/Lookahead.rsc|
Loading module |std:///lang/rascal/grammar/ConcreteSyntax.rsc|
Loading module |std:///lang/rascal/grammar/definition/Modules.rsc|
Loading module |std:///lang/rascal/grammar/definition/Priorities.rsc|
Loading module |std:///lang/rascal/grammar/definition/Regular.rsc|
Loading module |std:///lang/rascal/grammar/definition/Keywords.rsc|
Loading module |std:///lang/rascal/grammar/definition/Literals.rsc|
Loading module |std:///lang/rascal/grammar/definition/Parameters.rsc|
Loading module |std:///lang/rascal/grammar/definition/Symbols.rsc|
Loading module |std:///analysis/grammars/Ambiguity.rsc|
type[X]: type(
  adt(
    "X",
    []),
  (adt(
      "X",
      []):choice(
      adt(
        "X",
        []),
      {cons(
          label(
            "xx",
            adt(
              "X",
              [])),
          [],
          [],
          {})})))
rascal>

we see that the parser generator is loaded before the #X (aka reified type) is calculated. Is it possible to tune that? Looking at this section:

if (!t.isTop()) {
// if t == value then let's not call the parser generator.
// the reason is that #value occurs in the parser generator itself
// so this would trigger an infinite cascade of parser generators loading
// each other
gr = (IMap) eval.getEvaluator().getGrammar(eval.getCurrentEnvt()).get("rules");
}

It looks like only value is excluded, but lower down we already dispatch on if it's a grammar type:

if (t instanceof NonTerminalType) {
value = ((IRascalValueFactory) eval.getValueFactory()).reifiedType(((NonTerminalType) t).getSymbol(), gr);
}
else {
value = new TypeReifier(eval.__getVf()).typeToValue(t, eval.getCurrentEnvt().getStore(), gr);
}

I might be missing something, but if the type to be reified is not a grammar type, do we need to load the parser generator to calculate the grammars in the module? Or is this due to the nested grammars that might be in there? like #list[A] where A is a lexical?

@jurgenvinju
Copy link
Member

Do we need this? It will be a very impactful fix including the duplication of quite some code. I'd rather wait for the compiled version which does not even have this as a run-time feature because it is a compile-time constant.

@jurgenvinju
Copy link
Member

Or in other words: the compiler avoids this, so we could say "fixed" ;-)

@DavyLandman
Copy link
Member Author

Well, it's a thing I'm seeing at our customers, that all the effort of removing parsergenerators being loaded is hurt at a different point where we trigger readBinaryFile(#TModel...

So I was wondering: can we move that loading of the grammar inside the if for only the NonTerminalTypes. but if it's more complex than that, let's leave it. at least it's not happening at import time.

@jurgenvinju
Copy link
Member

Ah I see. Type reification is a complex operation which includes resolving all the imported and extended modules, finding out which types are dependent on each other through grammars and data-types and aliases, etc. etc.

If readBinaryFile actually uses the grammars in the reified type then we can't do without, but if it doesn't and gets the definitions from the binary, then if (TModel m := readBinaryFile(type(adt("TModel,[]), (), ...))) would avoid the use of reification.

@jurgenvinju
Copy link
Member

jurgenvinju commented Nov 2, 2023

The grammar collection code can only be avoided if we know that the reified type literal does not refer to any non-terminals. Such an analysis is expensive, because it depends on all the declarations in all the visible scopes, and it would be additive in the case where indeed there is an non-terminal present (like in the case of a TModel).

So I don't see a way of avoiding this code and having a positive impact on performance without re-implementing the grammar collection code in Java. This is a big piece of complex Rascal code so I don't think we'll be able to do this. And this entire cost will go away with the compiler.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants