StandardQueryParser is over 100 times slower in v5 compared to v3 [LUCENE-7260] #8315

Open
@asfimport

Description

The following test code times parsing a large query.

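// The active imports target Lucene 3.x. To run against Lucene 5.x, swap each
// active import for the commented-out alternative on the line below it.
import org.apache.lucene.analysis.KeywordAnalyzer;
//import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.queryParser.standard.StandardQueryParser;
//import org.apache.lucene.queryparser.flexible.standard.StandardQueryParser;
import org.apache.lucene.search.BooleanQuery;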

public class LargeQueryTest {
    public static void main(String[] args) throws Exception {
        BooleanQuery.setMaxClauseCount(50_000);
        StringBuilder builder = new StringBuilder(50_000*10);
        builder.append("id:( ");
        boolean first = true;
        for (int i = 0; i < 50_000; i++) {
            if (first) {
                first = false;
            } else {
                builder.append(" OR ");
            }
            builder.append(String.valueOf(i));
        }
        builder.append(" )");
        String queryString = builder.toString();

        StandardQueryParser parser2 = new StandardQueryParser(new KeywordAnalyzer());

        for (int i = 0; i < 10; i++) {
            long t0 = System.currentTimeMillis();
            parser2.parse(queryString, "nope");
            long t1 = System.currentTimeMillis();
            System.out.println(t1-t0);
        }
    }
}

For Lucene 3.6.2, the timings settle down to 200~300 ms per parse, with the fastest being 207 ms.
For Lucene 5.4.1, the timings settle down to 20000~30000 ms per parse, with the fastest being 22444 ms.

So somewhere between 3.6.2 and 5.4.1, some change made the query parser roughly 100 times slower (22444 / 207 is about 108). I suspect it has something to do with how the list of children is now handled. Every time someone gets the children, it copies the list. Every time someone sets the children, it walks through to detach parent references and then reattaches them all again.
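To make that suspicion concrete, here is a minimal, self-contained sketch. It is not Lucene's code: `CopyingNode`, `getChildren`, `copiedElements`, and `copiesForIndexedWalk` are hypothetical names invented for illustration. It only shows how a getter that copies its list turns one pass over N children into work proportional to N squared, by counting copied elements rather than measuring time.

```java
import java.util.ArrayList;
import java.util.List;

public class DefensiveCopyCost {
    /** Hypothetical node whose getter defensively copies the child list. */
    static final class CopyingNode {
        private final List<Integer> children = new ArrayList<>();
        long copiedElements = 0; // total elements copied by getChildren()

        void add(int child) {
            children.add(child);
        }

        List<Integer> getChildren() {
            copiedElements += children.size();
            return new ArrayList<>(children); // O(n) copy on every call
        }
    }

    /** Visits each child by index, calling the copying getter once per child. */
    static long copiesForIndexedWalk(int n) {
        CopyingNode node = new CopyingNode();
        for (int i = 0; i < n; i++) {
            node.add(i);
        }
        for (int i = 0; i < n; i++) {
            node.getChildren().get(i); // each call copies all n elements
        }
        return node.copiedElements;
    }

    public static void main(String[] args) {
        // Doubling the number of children quadruples the copying work.
        System.out.println(copiesForIndexedWalk(1_000));
        System.out.println(copiesForIndexedWalk(2_000));
    }
}
```

If the real parser calls such a getter inside loops over a 50,000-clause node, a pattern like this could account for a large slowdown, but that would need to be confirmed with a profiler.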

If it were me, I would probably make these collections immutable so that I didn't have to defensively copy them.
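A rough sketch of what that could look like, again using hypothetical names (`ImmutableNode`, `withChild`) rather than anything from the Lucene API. The child list is copied once at construction and wrapped as unmodifiable, so the getter can hand out the same list every time without copying.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ImmutableChildren {
    /** Hypothetical node whose children are fixed at construction time. */
    static final class ImmutableNode {
        private final List<String> children;

        ImmutableNode(List<String> children) {
            // Copy once here so later changes to the caller's list cannot leak in.
            this.children = Collections.unmodifiableList(new ArrayList<>(children));
        }

        List<String> getChildren() {
            return children; // no copy: callers cannot modify this view
        }

        /** Returns a new node with one extra child; this node is left unchanged. */
        ImmutableNode withChild(String child) {
            List<String> next = new ArrayList<>(children);
            next.add(child);
            return new ImmutableNode(next);
        }
    }

    public static void main(String[] args) {
        ImmutableNode node = new ImmutableNode(Arrays.asList("1", "2"));
        // The getter returns the same list instance on every call.
        System.out.println(node.getChildren() == node.getChildren());
        System.out.println(node.withChild("3").getChildren().size());
    }
}
```

The trade-off is that any code that currently rewrites a node's children in place would have to build a new node instead, so this is a design direction and not a drop-in fix.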


Migrated from LUCENE-7260 by Trejkaz, 1 vote, updated Feb 16 2017
Environment:

Java 8u51
