Create the backend search index generation for documentation pages #459

Merged


caendesilva
Member

About

Adds an action that generates a search index of the documentation pages for use by the HydeSearch plugin.

How it works

It currently works by taking a plain-text representation of each Markdown documentation page and adding it to the index, paired with metadata such as the page title and URI path.

It is pretty slow to generate, so I have not yet added it to the hyde build command. Instead, it's generated using php hyde build:search.

The reason it's slow is that each Markdown page needs to be parsed into a Page model, and then the Markdown body is compiled to HTML, as I found that to be the easiest way to convert it to plain text (using regex).
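As a rough illustration of the HTML-to-plain-text step described above, here is a minimal sketch in plain PHP. It is not the code from this PR: the function names `htmlToPlainText` and `makeIndexEntry` and the entry keys `title`, `path`, and `content` are hypothetical, and it assumes the Markdown has already been compiled to HTML.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: reduce compiled HTML to plain text using regex.
 */
function htmlToPlainText(string $html): string
{
    // Replace every tag with a space so words in adjacent elements do not merge.
    $text = preg_replace('/<[^>]+>/', ' ', $html) ?? '';

    // Decode entities such as &amp; back into their characters.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Collapse runs of whitespace into single spaces.
    return trim(preg_replace('/\s+/', ' ', $text) ?? '');
}

/**
 * Hypothetical entry shape: pairs the plain text with page metadata.
 * The key names are assumptions, not the format used by HydeSearch.
 */
function makeIndexEntry(string $title, string $path, string $html): array
{
    return [
        'title'   => $title,
        'path'    => $path,
        'content' => htmlToPlainText($html),
    ];
}

$entry = makeIndexEntry(
    'Installation',
    'docs/installation.html',
    '<h1>Installation</h1><p>Run <code>composer install</code> &amp; build.</p>'
);

// Encode the whole index (here a single entry) as JSON.
echo json_encode([$entry], JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
```

A tag-stripping regex like this is only safe on HTML you generated yourself; it does not handle things like `<script>` bodies or a literal `>` inside an attribute value.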

Benchmarks

These are not at all proper benchmarks, but an approximation of how long it took on my machine.

Generating a search index for the entire Hyde documentation took around 930ms and the resulting JSON is about 80 kB.

Generating a search index for the entire Alice's Adventures in Wonderland book (where each chapter is a Markdown file) took around 1300ms. The JSON is around 148 kB. When testing in production, only 55.2 kB is sent over the wire, and the file is then cacheable, resulting in a browser load time of about 2ms.

Try it out

Try a live demo of the frontend implementation using the entire Wonderland book:
https://demos.desilva.se/gist/github/hydephp/experiments/hydesearch/

@caendesilva caendesilva linked an issue May 29, 2022 that may be closed by this pull request
5 tasks
@caendesilva
Member Author

From a PHPDoc I wrote while working on the search content generation:

 * There are a few ways we could go about this. The goal is to allow the user
 * to run a free-text search to find relevant documentation pages.
 *
 * The easiest way to do this is by adding the Markdown body to the search index.
 * But this is of course not ideal as it may take an incredible amount of space
 * for large documentation sites. The Hyde docs weigh around 80 kB of JSON.
 *
 * Another option is to assemble all the headings in a document and use that
 * for the search basis. A truncated version of the body could also be included.
 *
 * A third option, which might be the most space efficient (aside from just
 * adding titles, which doesn't offer much help to the user since it is just
 * a filterable sidebar at that point), would be to search for keywords
 * in the document. This would however add complexity as well as extra
 * computing time.
 *
 * Benchmarks: (for official Hyde docs)
 *
 * Returning $document->body as is: 500ms
 * Returning $document->body as Str::markdown(): 920ms + 10ms for regex
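For comparison, the second option from the PHPDoc (indexing only the headings) could look roughly like the sketch below. This is an assumption about how it might be done, not something implemented in this PR; `extractHeadings` is a hypothetical name, and it only recognises ATX-style headings (`#` to `######`).

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: collect the text of ATX-style Markdown headings.
 *
 * Limitation: lines starting with "#" inside fenced code blocks are also
 * matched, so a real implementation would need to skip those.
 *
 * @return string[]
 */
function extractHeadings(string $markdown): array
{
    // One to six leading hashes, the heading text, optional closing hashes.
    preg_match_all('/^#{1,6}[ \t]+(.+?)[ \t]*#*[ \t]*$/m', $markdown, $matches);

    // Trim to drop stray carriage returns from Windows line endings.
    return array_map('trim', $matches[1]);
}

$markdown = "# Getting Started\n\nSome text.\n\n## Installing Hyde ##\n\nMore text.\n";

print_r(extractHeadings($markdown));
```

The trade-off is the one described in the PHPDoc: the index gets much smaller, but a search can only match words that appear in a heading.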

@codecov-commenter

Codecov Report

Merging #459 (3b38d1e) into master (9166548) will not change coverage.
The diff coverage is n/a.


@@            Coverage Diff            @@
##            master      #459   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            1         1           
  Lines           11        11           
=========================================
  Hits            11        11           


@caendesilva caendesilva merged commit 4fd0714 into master May 29, 2022
@caendesilva caendesilva deleted the 447-add-simple-optional-search-feature-to-documentation-pages branch May 29, 2022 05:19
@caendesilva caendesilva restored the 447-add-simple-optional-search-feature-to-documentation-pages branch May 29, 2022 05:20
caendesilva pushed a commit that referenced this pull request Sep 1, 2022
Development

Successfully merging this pull request may close these issues.

Add simple (optional?) search feature to documentation pages
3 participants