Speed-up Xapian searches by preloading indexes #617

Open

Description

#418 has shown that the typical steps for a search are:

  1. Read the ZIM file (to be able to locate the Xapian index in it): Cold: 7.44s | Warm: 0.12s
  2. Open the Xapian database (internal Xapian code): Cold: 0.09s | Warm: 0.003s
  3. Set up the enquire on the database: Cold: 0.02s | Warm: 0.0004s
  4. Run the enquire and get a set of (ranged) results from it (internal Xapian code): Cold: 3.74s | Warm: 1.5s

Here is when each step currently happens:

  1. Once at file opening
  2. At first search requested and then cached
  3. At each search
  4. At each search

In an attempt to speed up searches (in particular the first one), the idea would be to have the following workflow:

  1. Once at file opening
  2. Once at file opening (optional?) and then cached
  3. At file opening, then keep one enquire (or more?) ready to go at all times
  4. At each search
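
The proposed workflow could be sketched roughly as follows. This is only an illustration using the Python standard library: `FakeDatabase`, `FakeEnquire` and `PreloadingArchive` are hypothetical stand-ins, not libzim or Xapian APIs, and the sketch assumes that steps 2 and 3 can safely run in a worker thread.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor


class FakeDatabase:
    """Hypothetical stand-in for an opened Xapian database (step 2)."""

    def __init__(self, path: str) -> None:
        self.path = path


class FakeEnquire:
    """Hypothetical stand-in for an enquire set up on a database (step 3)."""

    def __init__(self, database: FakeDatabase) -> None:
        self.database = database

    def run(self, query: str) -> list[str]:
        # Step 4: the only work left on the search path.
        return [f"{self.database.path}:{query}"]


class PreloadingArchive:
    """Does step 1 at opening and prepares steps 2 and 3 in the background."""

    def __init__(self, path: str) -> None:
        self.path = path  # step 1: done synchronously at file opening
        self._executor = ThreadPoolExecutor(max_workers=1)
        # Steps 2 and 3 start now, without blocking the constructor.
        self._ready = self._executor.submit(self._prepare)

    def _prepare(self) -> FakeEnquire:
        database = FakeDatabase(self.path)  # step 2: open the database once
        return FakeEnquire(database)        # step 3: first enquire, ready to go

    def search(self, query: str) -> list[str]:
        # Blocks only if the background preparation has not finished yet.
        enquire = self._ready.result()
        results = enquire.run(query)
        # Prepare the next enquire in the background, reusing the cached database.
        self._ready = self._executor.submit(FakeEnquire, enquire.database)
        return results

    def close(self) -> None:
        self._executor.shutdown(wait=True)
```

Note that this sketch does not handle several searches arriving at the same time; it only shows where each step would move.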

Here are the related questions on my side:

  • Can we ensure that 2. does not slow down the opening of the file (so it should run in another thread)?
  • Can we ensure that 3. does not slow down the searches (so it should run in another thread)?
  • I guess the whole search system is protected to prevent two search requests from running at the same time. If so, this is safe in a multithreaded context, but it will cause massive slowdowns if many search requests happen at the same time. Would it be possible/reasonable to have a pool of "searchers"?
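
To make the last question more concrete, here is a minimal sketch of what a pool of searchers could look like. It again uses only the Python standard library; `Searcher`, `SearcherPool` and `run_searches` are hypothetical names, and the sketch assumes that each searcher can own its own database handle and enquire so that searchers never share state.

```python
import queue
import threading
from contextlib import contextmanager


class Searcher:
    """Hypothetical stand-in for one independent database handle plus enquire."""

    def __init__(self, ident):
        self.ident = ident

    def search(self, query):
        return f"searcher-{self.ident}:{query}"


class SearcherPool:
    """Fixed-size pool: a request borrows a searcher instead of taking a global lock."""

    def __init__(self, size):
        self._idle = queue.Queue()
        for ident in range(size):
            self._idle.put(Searcher(ident))

    @contextmanager
    def borrow(self):
        searcher = self._idle.get()  # blocks only when every searcher is busy
        try:
            yield searcher
        finally:
            self._idle.put(searcher)  # always hand the searcher back


def run_searches(pool, queries):
    """Run one thread per query, each borrowing a searcher from the pool."""
    results = {}
    results_lock = threading.Lock()

    def worker(query):
        with pool.borrow() as searcher:
            hit = searcher.search(query)
        with results_lock:
            results[query] = hit

    threads = [threading.Thread(target=worker, args=(q,)) for q in queries]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

With a pool of size N, up to N searches can run in parallel, and further requests wait only until one searcher is returned, rather than being serialized behind a single lock.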