Experiment/Draft: Search engine #1986

Draft
wants to merge 48 commits into
base: develop
from

Commits on Nov 4, 2021

  1. Introduce an event dispatcher

    Introduce package event with a dispatcher subsystem. Add this to the
    manager and to the plugin subsystem.
    
    Whenever the plugin subsystem executes a PostHook, we dispatch a
    Change event on the event dispatcher bus. This currently has no
    effect, but allows us to register subsystems on the event bus for
    further processing. In particular, search.
    
    By design, we opt to hook the plugin system and pass to the event bus
    for now. One, it makes it easier to remove again, and two, the context
    handling inside the plugin subsystem doesn't want to live on the
    other side of an event bus.
    
    While here, write a test for the dispatcher code.
    SmallCoccinelle committed Nov 4, 2021
    73b89e8
  2. 44cd993
  3. Introduce a rollup service and search engine

    The rollup service turns events into batches for processing. The search
    engine wraps the rollup engine, making it private to the search
    subsystem.
    SmallCoccinelle committed Nov 4, 2021
    2236345
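
The dispatcher and rollup commits above describe a bus that fans Change events out to registered subsystems, and a rollup step that turns events into batches. A minimal sketch of that shape might look like the following. The `Change` fields, the `Register`/`Dispatch` methods, and the `Rollup` function are all assumptions made for illustration, not the names used in this PR.

```go
package main

import (
	"fmt"
	"sync"
)

// Change is a hypothetical event payload: the kind of object and its ID.
type Change struct {
	Kind string
	ID   int
}

// Dispatcher fans each dispatched Change out to every registered listener.
type Dispatcher struct {
	mu        sync.RWMutex
	listeners []chan Change
}

// Register returns a buffered channel that receives future events.
func (d *Dispatcher) Register(buffer int) <-chan Change {
	ch := make(chan Change, buffer)
	d.mu.Lock()
	d.listeners = append(d.listeners, ch)
	d.mu.Unlock()
	return ch
}

// Dispatch delivers c to all listeners. A listener whose buffer is full
// misses the event instead of blocking the caller (a choice made for
// this sketch only).
func (d *Dispatcher) Dispatch(c Change) {
	d.mu.RLock()
	defer d.mu.RUnlock()
	for _, ch := range d.listeners {
		select {
		case ch <- c:
		default:
		}
	}
}

// Rollup drains whatever is queued on ch into one deduplicated batch.
func Rollup(ch <-chan Change) map[Change]struct{} {
	batch := make(map[Change]struct{})
	for {
		select {
		case c := <-ch:
			batch[c] = struct{}{}
		default:
			return batch
		}
	}
}

func main() {
	d := &Dispatcher{}
	ch := d.Register(16)
	d.Dispatch(Change{Kind: "scene", ID: 1})
	d.Dispatch(Change{Kind: "scene", ID: 1})
	d.Dispatch(Change{Kind: "performer", ID: 7})
	fmt.Println(len(Rollup(ch))) // 2
}
```

Deduplicating in the rollup step means an object that changes several times between two processing rounds is only indexed once.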

Commits on Nov 5, 2021

  1. 352fe5c
  2. 9ddd308
  3. Introduce Query.search(..) in GraphQL.

    Add a search experiment to the code:
    
    Schema in GraphQL is extended with an early search system.
    
    Engine is extended with search, and gets passed through the resolver.
    
    Some conversion is currently done to glue things together, but that
    structure is ugly and needs some improvementificationism.
    SmallCoccinelle committed Nov 5, 2021
    c922b2d

Commits on Nov 6, 2021

  1. Introduce Go 1.18's strings.Cut

    In Go 1.18, strings.Cut becomes a reality. However, since it is such
    a useful tool, add it to the utils package for now. Once we are on
    Go 1.18, we can replace utils.Cut with strings.Cut.
    SmallCoccinelle committed Nov 6, 2021
    550a061
  2. Protect indexes by a mutex.

    This is almost not needed, but to be safe, add the ability to protect
    changes to the engine, and lock most usage via an RLock().
    SmallCoccinelle committed Nov 6, 2021
    8d5b419
  3. Flesh out search results

    Search results are Connection objects. Wrap each result in a contextual
    object. This can be used for scoring/highlighting/facets later.
    
    Introduce interface SearchResultItem. Implement the interface for
    models.Scene. Add hydration code for scenes.
    SmallCoccinelle committed Nov 6, 2021
    bb863c7
  4. Add scores to search result, flesh out items

    Add scores into search results. Move Search-internal NodeIDs into the
    search system.
    
    Introduce search.Item which protects the rest of the system against
    search-specific structures. Simplify hydration since it can now use
    search.Item.
    SmallCoccinelle committed Nov 6, 2021
    4f53631
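
For reference, the `strings.Cut` backport mentioned in the first commit of this day is small. The sketch below follows the documented behavior of `strings.Cut` in Go 1.18; placing it in a `utils` package, as the commit message says, is left out here so the example runs on its own.

```go
package main

import (
	"fmt"
	"strings"
)

// Cut slices s around the first instance of sep, returning the text
// before and after it. found reports whether sep appears in s. If it
// does not, Cut returns s, "", false.
func Cut(s, sep string) (before, after string, found bool) {
	if i := strings.Index(s, sep); i >= 0 {
		return s[:i], s[i+len(sep):], true
	}
	return s, "", false
}

func main() {
	field, term, ok := Cut("tag:woodworking", ":")
	fmt.Println(field, term, ok) // tag woodworking true
}
```

Because the signature matches the standard library, switching to `strings.Cut` later should be a mechanical rename.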

Commits on Nov 7, 2021

  1. Add the facet experiment

    This experiment tells us facets want to be an input type rather than
    the current enum of predefined facets.
    SmallCoccinelle committed Nov 7, 2021
    a51ab18
  2. Introduce some early reindexing code

    Reindex only scenes for now, because that's what we have. The
    core idea is fairly simple: batch-process a table, 1000 entries
    at a time, and index them. Replace the data loader every 10 rounds
    (10k entries) so it doesn't grow too big.
    
    While reindexing is ongoing, the online changemap is still being built
    in the background. If reindexing takes longer than one ticker interval,
    the ticker fires immediately afterwards. If reindexing takes longer
    than two intervals, the ticker drops the extra ticks and still fires
    only once.
    SmallCoccinelle committed Nov 7, 2021
    77b91ea
  3. Rename changeMap -> changeSet

    It is really a set of changes. The map used to implement the set is an
    implementation detail that shouldn't be part of the name.
    SmallCoccinelle committed Nov 7, 2021
    e947cb1
  4. Documentation nit

    SmallCoccinelle committed Nov 7, 2021
    95ef1de
  5. Improve reporting ergonomics.

    Pull stat tracking outward. Set up a reporting ticker and use it for
    reporting progress. This rolls up the log lines into something a bit
    more comprehensible.
    SmallCoccinelle committed Nov 7, 2021
    becf427
  6. 81ed4e9
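
The batch-processing idea from the reindexing commit above (a fixed number of entries per round, with the data loader replaced every few rounds) can be sketched as below. The function name `reindex` and its callback parameters are hypothetical. The real code reads from a database table rather than an in-memory slice.

```go
package main

import "fmt"

// reindex walks ids in batches of batchSize (which must be positive),
// calling index for each batch. After every loaderRounds batches it calls
// resetLoader, so a caching data loader does not grow without bound.
func reindex(ids []int, batchSize, loaderRounds int, index func(batch []int), resetLoader func()) {
	round := 0
	for start := 0; start < len(ids); start += batchSize {
		end := start + batchSize
		if end > len(ids) {
			end = len(ids)
		}
		index(ids[start:end])
		round++
		if round%loaderRounds == 0 {
			resetLoader()
		}
	}
}

func main() {
	ids := make([]int, 25000)
	batches, resets := 0, 0
	reindex(ids, 1000, 10,
		func(_ []int) { batches++ },
		func() { resets++ })
	fmt.Println(batches, resets) // 25 2
}
```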

Commits on Nov 8, 2021

  1. 0e564bd

Commits on Nov 9, 2021

  1. 0b90a64
  2. Implement performer search.

    Change the schema to support performer searches. Performers are
    SearchResultItems. Make the search type optional, default to searching
    everything.
    
    Enable hydration of performers.
    
    Add performers to the data loader code.
    
    Introduce a performer document for the search index.
    
    Load performers before loading scenes, to utilize the dataloader
    cache maximally.
    
    When considering scenes, find the needed performers, and prime the
    cache with them.
    
    When processing scenes, denormalize the performer into the scene.
    SmallCoccinelle committed Nov 9, 2021
    7bc1140
  3. c79e086
  4. c6e1c8a
  5. Doc.

    SmallCoccinelle committed Nov 9, 2021
    b1d7600
  6. Push preprocessing into batch processing.

    If we update performers, all scenes those performers are in should
    also change. Push this in.
    
    Currently, we over-apply on a full reindex, which can be fixed later,
    perhaps by moving preprocessing upward, or by having a flag on the
    batch processing layer. It's plenty fast right now though.
    SmallCoccinelle committed Nov 9, 2021
    de5de12
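
The performer commit above describes priming a cache with the needed performers and then denormalizing them into each scene document. A minimal sketch of the denormalization step, assuming simplified stand-in types (`Performer`, `Scene`, `SceneDoc`) and a plain map in place of the data loader cache:

```go
package main

import "fmt"

// Performer and Scene are simplified stand-ins for the real models.
type Performer struct {
	ID   int
	Name string
}

type Scene struct {
	ID           int
	Title        string
	PerformerIDs []int
}

// SceneDoc is a hypothetical search document with performer names
// copied in, so a scene can be found by performer name directly.
type SceneDoc struct {
	ID         int
	Title      string
	Performers []string
}

// buildSceneDoc copies the names of the scene's performers into the
// document. Performers missing from the cache are skipped.
func buildSceneDoc(s Scene, cache map[int]Performer) SceneDoc {
	doc := SceneDoc{ID: s.ID, Title: s.Title}
	for _, id := range s.PerformerIDs {
		if p, ok := cache[id]; ok {
			doc.Performers = append(doc.Performers, p.Name)
		}
	}
	return doc
}

func main() {
	cache := map[int]Performer{7: {ID: 7, Name: "Alex"}}
	scene := Scene{ID: 1, Title: "Intro", PerformerIDs: []int{7, 8}}
	fmt.Println(buildSceneDoc(scene, cache).Performers) // [Alex]
}
```

The cost of denormalizing is that a performer rename invalidates every scene document that embeds it, which is what the preprocessing commits address.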

Commits on Nov 10, 2021

  1. Implement Tag, resolve nil scenes

    Plug a hole with scenes that can be nil.
    SmallCoccinelle committed Nov 10, 2021
    d55670c
  2. More nil robustness.

    SmallCoccinelle committed Nov 10, 2021
    4cc9c44
  3. Move pre-processing into the changeset.

    This change anticipates far better batch processing in the future.
    By explicitly preprocessing, we can do this in the online processing
    loop, but avoid it in the offline processing loop. This will avoid
    processing elements twice.
    SmallCoccinelle committed Nov 10, 2021
    0b93723
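
One way to picture explicit preprocessing on the changeset: the online loop calls a preprocess step that expands performer changes into the scenes they appear in, while a full (offline) reindex simply never calls it, because it visits every scene anyway. The sketch below is an assumption about the shape only. The `changeSet` fields and the `scenesOf` lookup are hypothetical.

```go
package main

import "fmt"

// changeSet records which objects changed since the last processing round.
// The maps are the implementation detail; conceptually these are sets.
type changeSet struct {
	scenes     map[int]struct{}
	performers map[int]struct{}
}

func newChangeSet() *changeSet {
	return &changeSet{
		scenes:     make(map[int]struct{}),
		performers: make(map[int]struct{}),
	}
}

func (c *changeSet) trackScene(id int)     { c.scenes[id] = struct{}{} }
func (c *changeSet) trackPerformer(id int) { c.performers[id] = struct{}{} }

// preprocess adds every scene featuring a changed performer to the set.
// scenesOf is a hypothetical lookup from performer ID to scene IDs.
func (c *changeSet) preprocess(scenesOf func(performerID int) []int) {
	for pid := range c.performers {
		for _, sid := range scenesOf(pid) {
			c.scenes[sid] = struct{}{}
		}
	}
}

func main() {
	appearsIn := map[int][]int{7: {1, 2}}
	cs := newChangeSet()
	cs.trackPerformer(7)
	cs.trackScene(2)
	cs.preprocess(func(id int) []int { return appearsIn[id] })
	fmt.Println(len(cs.scenes)) // 2
}
```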

Commits on Nov 11, 2021

  1. Support tags.

    Early tag support setup.
    SmallCoccinelle committed Nov 11, 2021
    6914c1c
  2. Introduce tags more flattened into scenes

    People will expect a tag to be fairly easy to grab. So prefer a direct
    encoding over a nested subdocument. This allows a search for
    `tag:woodworking` rather than `tag.name:woodworking`.
    
    While here, add the tag ids into the scene document as well. This will
    help with deletion.
    SmallCoccinelle committed Nov 11, 2021
    2839718
  3. Move changesets into their own file

    Changesets will keep growing.
    SmallCoccinelle committed Nov 11, 2021
    dc32d31
  4. Handle proper performer deletion

    Implement Stringer formatting for event.Change.
    
    Introduce engine_preprocess.go. Move preprocessing code into the engine
    itself. Use the engine to pull data which needs a change on a performer
    deletion. Rework changeset into changeset code only.
    SmallCoccinelle committed Nov 11, 2021
    9834a79
  5. 63a2933
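
The flattened tag encoding described above (`tag:woodworking` rather than `tag.name:woodworking`) amounts to storing tag names as a top-level list on the scene document, with the tag IDs kept alongside. A sketch of that document shape follows. The field names `tag` and `tag_id` and the use of JSON are assumptions made to show the structure, not the PR's actual index mapping.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Tag is a simplified stand-in for the real tag model.
type Tag struct {
	ID   int
	Name string
}

// SceneDoc keeps tag names in a flat top-level field, so a query can use
// tag:woodworking. Tag IDs are stored as well, which helps find affected
// scenes when a tag is deleted.
type SceneDoc struct {
	Title string   `json:"title"`
	Tag   []string `json:"tag"`
	TagID []int    `json:"tag_id"`
}

func newSceneDoc(title string, tags []Tag) SceneDoc {
	doc := SceneDoc{Title: title}
	for _, t := range tags {
		doc.Tag = append(doc.Tag, t.Name)
		doc.TagID = append(doc.TagID, t.ID)
	}
	return doc
}

func main() {
	doc := newSceneDoc("Dovetail joints", []Tag{{ID: 1, Name: "woodworking"}})
	out, err := json.Marshal(doc)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
	// {"title":"Dovetail joints","tag":["woodworking"],"tag_id":[1]}
}
```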

Commits on Nov 12, 2021

  1. Add tag preprocessing.

    Code is a bit spammy at the moment with logging, but that will be
    fixed at some point.
    SmallCoccinelle committed Nov 12, 2021
    a769d89

Commits on Nov 14, 2021

  1. 0e7d9ca

Commits on Nov 15, 2021

  1. Ready ourselves for handling studios

    Introduce studios
    
    * In data loading
    * In search documents
    * In changesets
    * In the search path
    * In the GraphQL schema
    
    No functional indexing yet.
    SmallCoccinelle committed Nov 15, 2021
    84ef060

Commits on Nov 16, 2021

  1. Simplify full reindexing, full reindex studios

    The strategy is to fold reindexing into a worklist which we
    work through systematically. This reduces the full reindexer
    to a single loop and a far simpler code path, where the only
    variance is a switch on the document type.
    
    Use this new strategy to handle studios as well for full reindexing.
    SmallCoccinelle committed Nov 16, 2021
    5f03ace
  2. edc03f5
  3. 0c8b48f
  4. 8201e36
  5. 016fd6e
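
The worklist strategy from the reindexing commit above can be sketched as a single loop that pops an item, splits off one batch, and switches on the document type. Everything named here (`workItem`, `stats`, `fullReindex`, the string kinds) is hypothetical. The counters stand in for the real load-and-index work.

```go
package main

import "fmt"

// workItem is one pending unit of reindexing: a document kind and its IDs.
type workItem struct {
	kind string
	ids  []int
}

// stats counts how many documents of each kind were processed.
type stats struct{ performers, tags, studios, scenes int }

// fullReindex works through the worklist in one loop. Items larger than
// batchSize are split: one batch is processed now and the remainder is
// pushed back onto the front of the list.
func fullReindex(worklist []workItem, batchSize int, st *stats) error {
	if batchSize < 1 {
		return fmt.Errorf("batch size must be positive, got %d", batchSize)
	}
	for len(worklist) > 0 {
		item := worklist[0]
		worklist = worklist[1:]
		if len(item.ids) > batchSize {
			rest := workItem{kind: item.kind, ids: item.ids[batchSize:]}
			worklist = append([]workItem{rest}, worklist...)
			item.ids = item.ids[:batchSize]
		}
		// The only per-type variance is this switch. Real code would
		// load and index the batch here instead of counting it.
		switch item.kind {
		case "performer":
			st.performers += len(item.ids)
		case "tag":
			st.tags += len(item.ids)
		case "studio":
			st.studios += len(item.ids)
		case "scene":
			st.scenes += len(item.ids)
		default:
			return fmt.Errorf("unknown document type %q", item.kind)
		}
	}
	return nil
}

func main() {
	var st stats
	work := []workItem{
		{kind: "performer", ids: []int{1, 2, 3}},
		{kind: "scene", ids: []int{1, 2, 3, 4, 5}},
	}
	if err := fullReindex(work, 2, &st); err != nil {
		panic(err)
	}
	fmt.Println(st.performers, st.scenes) // 3 5
}
```

Listing performers before scenes in the worklist matches the earlier commit's note about loading performers first to make the most of the data loader cache.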

Commits on Nov 17, 2021

  1. Split and simplify batch processing

    Rather than having a single large function, split the work
    into smaller functions and let the function names describe what
    is being done. This should make the code more local and easier to
    read.
    SmallCoccinelle committed Nov 17, 2021
    64f8927
  2. Documentation.

    SmallCoccinelle committed Nov 17, 2021
    a429f17
  3. Index studios in scenes. More types.

    Introduce indexing of studios in scenes.
    
    Introduce documents.DocType to properly type the documents as an enum.
    SmallCoccinelle committed Nov 17, 2021
    249771b
  4. bd6b48d
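
Typing the document kinds as an enum, as the `documents.DocType` commit above describes, usually looks something like the sketch below in Go. The constant names and the integer representation are assumptions. The PR may well use a string-backed type instead.

```go
package main

import "fmt"

// DocType identifies the kind of a search document.
type DocType int

const (
	TypeScene DocType = iota
	TypePerformer
	TypeTag
	TypeStudio
)

// String implements fmt.Stringer so DocType values print readably.
func (t DocType) String() string {
	switch t {
	case TypeScene:
		return "scene"
	case TypePerformer:
		return "performer"
	case TypeTag:
		return "tag"
	case TypeStudio:
		return "studio"
	default:
		return fmt.Sprintf("DocType(%d)", int(t))
	}
}

func main() {
	fmt.Println(TypeStudio) // studio
}
```

A dedicated type lets the compiler reject a bare string where a document type is expected, which plain string constants would not.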

Commits on Nov 18, 2021

  1. More doc.

    SmallCoccinelle committed Nov 18, 2021
    2256de0

Commits on Nov 25, 2021

  1. 285e899
  2. Remove facets for now

    Facets are going to be a thing we add later on. An MVP doesn't need
    facets, and we can remove lots of complexity if we don't have to worry
    about them right now.
    SmallCoccinelle committed Nov 25, 2021
    68c09c4

Commits on Nov 30, 2021

  1. e398c60
  2. Fold merge postHooks into their events

    If a merge is called, we should process all sources and the destination.
    Create an event for each of these.
    SmallCoccinelle committed Nov 30, 2021
    f269f04
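
The merge handling in the last commit can be pictured as expanding one merge call into one change event per source plus one for the destination. A small sketch, where the `Change` type and the `mergeChanges` helper are hypothetical names:

```go
package main

import "fmt"

// Change is a hypothetical event payload: the kind of object and its ID.
type Change struct {
	Kind string
	ID   int
}

// mergeChanges returns one Change per merged source followed by one for
// the destination, so the index revisits every object the merge touched.
func mergeChanges(kind string, sources []int, destination int) []Change {
	out := make([]Change, 0, len(sources)+1)
	for _, id := range sources {
		out = append(out, Change{Kind: kind, ID: id})
	}
	return append(out, Change{Kind: kind, ID: destination})
}

func main() {
	events := mergeChanges("tag", []int{4, 5}, 9)
	fmt.Println(len(events)) // 3
}
```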