Experiment/Draft: Search engine #1986

Draft: wants to merge 48 commits into base: develop
Commits
73b89e8 Introduce an event dispatcher (SmallCoccinelle, Nov 4, 2021)
44cd993 Also start the dispatcher. (SmallCoccinelle, Nov 4, 2021)
2236345 Introduce a rollup service and search engine (SmallCoccinelle, Nov 4, 2021)
352fe5c Introduce a scene data loader. (SmallCoccinelle, Nov 5, 2021)
9ddd308 Introduce a Scene Index. (SmallCoccinelle, Nov 5, 2021)
c922b2d Introduce Query.search(..) in GraphQL. (SmallCoccinelle, Nov 5, 2021)
550a061 Introduce go 1.18s strings.Cut (SmallCoccinelle, Nov 6, 2021)
8d5b419 Protect indexes by a mutex. (SmallCoccinelle, Nov 6, 2021)
bb863c7 Flesh out search results (SmallCoccinelle, Nov 6, 2021)
4f53631 Add scores to search result, flesh out items (SmallCoccinelle, Nov 6, 2021)
a51ab18 Add the facet experiment (SmallCoccinelle, Nov 7, 2021)
77b91ea Introduce some early reindexing code (SmallCoccinelle, Nov 7, 2021)
e947cb1 Rename changeMap -> changeSet (SmallCoccinelle, Nov 7, 2021)
95ef1de Documentation nit (SmallCoccinelle, Nov 7, 2021)
becf427 Improve reporting ergonomics. (SmallCoccinelle, Nov 7, 2021)
81ed4e9 Add search stats into the graphql result. (SmallCoccinelle, Nov 7, 2021)
0e564bd Derive "year" from "date". Support quick year queries. (SmallCoccinelle, Nov 8, 2021)
0b90a64 Add generated dataloaders (SmallCoccinelle, Nov 9, 2021)
7bc1140 Implement performer search. (SmallCoccinelle, Nov 9, 2021)
c79e086 Dead Code Elimination (SmallCoccinelle, Nov 9, 2021)
c6e1c8a ChangeMap -> ChangeSet (SmallCoccinelle, Nov 9, 2021)
b1d7600 Doc. (SmallCoccinelle, Nov 9, 2021)
de5de12 Push preprocessing into batch processing. (SmallCoccinelle, Nov 9, 2021)
d55670c Implement Tag, resolve nil scenes (SmallCoccinelle, Nov 10, 2021)
4cc9c44 More nil robustness. (SmallCoccinelle, Nov 10, 2021)
0b93723 Move pre-processing into the changeset. (SmallCoccinelle, Nov 10, 2021)
6914c1c Support tags. (SmallCoccinelle, Nov 11, 2021)
2839718 Introduce tags more flattened into scenes (SmallCoccinelle, Nov 11, 2021)
dc32d31 Move changesets into their own file (SmallCoccinelle, Nov 11, 2021)
9834a79 Handle proper performer deletion (SmallCoccinelle, Nov 11, 2021)
63a2933 Move engine reindexing to its own file (SmallCoccinelle, Nov 11, 2021)
a769d89 Add tag preprocessing. (SmallCoccinelle, Nov 12, 2021)
0e7d9ca Add an event tracker while developing this (SmallCoccinelle, Nov 14, 2021)
84ef060 Ready ourselves for handling studios (SmallCoccinelle, Nov 15, 2021)
5f03ace Simplify full reindexing, full reindex studios (SmallCoccinelle, Nov 16, 2021)
edc03f5 Remove a TODO which isn't relevant anymore (SmallCoccinelle, Nov 16, 2021)
0c8b48f Resolve a couple of linting warnings (SmallCoccinelle, Nov 16, 2021)
8201e36 Index studios in batch processing (SmallCoccinelle, Nov 16, 2021)
016fd6e Implement hydration of studios as well. (SmallCoccinelle, Nov 16, 2021)
64f8927 Split and simplify batch processing (SmallCoccinelle, Nov 17, 2021)
a429f17 Documentation. (SmallCoccinelle, Nov 17, 2021)
249771b Index studios in scenes. More types. (SmallCoccinelle, Nov 17, 2021)
bd6b48d Add studio preprocessing (SmallCoccinelle, Nov 17, 2021)
2256de0 More doc. (SmallCoccinelle, Nov 18, 2021)
285e899 Merge branch 'develop' into search-engine (SmallCoccinelle, Nov 25, 2021)
68c09c4 Remove facets for now (SmallCoccinelle, Nov 25, 2021)
e398c60 Merge branch 'develop' into search-engine (SmallCoccinelle, Nov 30, 2021)
f269f04 Fold merge postHooks into their events (SmallCoccinelle, Nov 30, 2021)
24 changes: 23 additions & 1 deletion go.mod
@@ -47,13 +47,30 @@ require (
)

require (
github.com/blevesearch/bleve/v2 v2.2.2
github.com/vearutop/statigz v1.1.6
github.com/vektah/dataloaden v0.2.1-0.20190515034641-a19b9a6e7c9e
github.com/vektah/gqlparser/v2 v2.0.1
)

require (
github.com/RoaringBitmap/roaring v0.9.4 // indirect
github.com/agnivade/levenshtein v1.1.0 // indirect
github.com/antchfx/xpath v1.1.6 // indirect
github.com/bits-and-blooms/bitset v1.2.0 // indirect
github.com/blevesearch/bleve_index_api v1.0.1 // indirect
github.com/blevesearch/go-porterstemmer v1.0.3 // indirect
github.com/blevesearch/mmap-go v1.0.3 // indirect
github.com/blevesearch/scorch_segment_api/v2 v2.1.0 // indirect
github.com/blevesearch/segment v0.9.0 // indirect
github.com/blevesearch/snowballstem v0.9.0 // indirect
github.com/blevesearch/upsidedown_store_api v1.0.1 // indirect
github.com/blevesearch/vellum v1.0.7 // indirect
github.com/blevesearch/zapx/v11 v11.3.1 // indirect
github.com/blevesearch/zapx/v12 v12.3.1 // indirect
github.com/blevesearch/zapx/v13 v13.3.1 // indirect
github.com/blevesearch/zapx/v14 v14.3.1 // indirect
github.com/blevesearch/zapx/v15 v15.3.1 // indirect
github.com/chromedp/sysutil v1.0.0 // indirect
github.com/cpuguy83/go-md2man/v2 v2.0.0 // indirect
github.com/davecgh/go-spew v1.1.1 // indirect
@@ -62,6 +79,8 @@ require (
github.com/gobwas/pool v0.2.1 // indirect
github.com/gobwas/ws v1.1.0-rc.5 // indirect
github.com/golang/groupcache v0.0.0-20200121045136-8c9f03a8e57e // indirect
github.com/golang/protobuf v1.5.2 // indirect
github.com/golang/snappy v0.0.3 // indirect
github.com/hashicorp/errwrap v1.0.0 // indirect
github.com/hashicorp/go-multierror v1.1.0 // indirect
github.com/hashicorp/golang-lru v0.5.1 // indirect
@@ -75,6 +94,7 @@ require (
github.com/mitchellh/mapstructure v1.4.2 // indirect
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
github.com/modern-go/reflect2 v1.0.1 // indirect
github.com/mschoch/smat v0.2.0 // indirect
github.com/nfnt/resize v0.0.0-20180221191011-83c6a9932646 // indirect
github.com/pelletier/go-toml v1.9.4 // indirect
github.com/pkg/errors v0.9.1 // indirect
@@ -85,14 +105,16 @@ require (
github.com/spf13/cast v1.4.1 // indirect
github.com/spf13/cobra v1.0.0 // indirect
github.com/spf13/jwalterweatherman v1.1.0 // indirect
github.com/steveyen/gtreap v0.1.0 // indirect
github.com/stretchr/objx v0.2.0 // indirect
github.com/subosito/gotenv v1.2.0 // indirect
github.com/tidwall/match v1.1.1 // indirect
github.com/urfave/cli/v2 v2.1.1 // indirect
github.com/vektah/dataloaden v0.2.1-0.20190515034641-a19b9a6e7c9e // indirect
go.etcd.io/bbolt v1.3.5 // indirect
go.uber.org/atomic v1.7.0 // indirect
golang.org/x/mod v0.4.2 // indirect
golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1 // indirect
google.golang.org/protobuf v1.27.1 // indirect
gopkg.in/ini.v1 v1.63.2 // indirect
gopkg.in/yaml.v3 v3.0.0-20210107192922-496545a6307b // indirect
)
60 changes: 60 additions & 0 deletions go.sum

Large diffs are not rendered by default.

3 changes: 3 additions & 0 deletions graphql/schema/schema.graphql
@@ -64,6 +64,9 @@ type Query {

logs: [LogEntry!]!

# Search
search(query: String!, type: SearchType): SearchResultItemConnection!

# Scrapers

"""List available scrapers"""
2 changes: 1 addition & 1 deletion graphql/schema/types/performer.graphql
@@ -7,7 +7,7 @@ enum GenderEnum {
NON_BINARY
}

type Performer {
type Performer implements SearchResultItem {
id: ID!
checksum: String!
name: String
2 changes: 1 addition & 1 deletion graphql/schema/types/scene.graphql
@@ -25,7 +25,7 @@ type SceneMovie {
scene_index: Int
}

type Scene {
type Scene implements SearchResultItem {
id: ID!
checksum: String
oshash: String
31 changes: 31 additions & 0 deletions graphql/schema/types/search.graphql
@@ -0,0 +1,31 @@
enum SearchType {
SEARCH_SCENE
SEARCH_PERFORMER
SEARCH_TAG
SEARCH_STUDIO
}

type SearchResultItemConnection {
edges: [SearchItemEdge]

took: Float!
maxScore: Float!
total: Int!

status: SearchResultStatus
}

type SearchResultStatus {
successful: Int!
failed: Int!
total: Int!
}

type SearchItemEdge {
node: SearchResultItem
score: Float!
}

interface SearchResultItem {
id: ID!
}
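
A client-side sketch of how the new `search` field might be called. It selects only fields declared in `search.graphql` above, plus the standard `__typename` meta-field. The helper names (`searchVariables`, `buildSearchRequest`), the `graphQLRequest` envelope type, and the example query text are assumptions for illustration and are not part of this PR. Transport details (endpoint path, authentication) are left out on purpose.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// searchQuery selects fields declared in search.graphql. __typename is the
// standard GraphQL meta-field and tells the client which concrete type
// (Scene, Performer, Tag, Studio) a node is.
const searchQuery = `query Search($q: String!, $type: SearchType) {
  search(query: $q, type: $type) {
    total
    took
    maxScore
    status { successful failed total }
    edges {
      score
      node { id __typename }
    }
  }
}`

// graphQLRequest is the conventional JSON envelope for a GraphQL POST body.
// The type name is hypothetical.
type graphQLRequest struct {
	Query     string                 `json:"query"`
	Variables map[string]interface{} `json:"variables"`
}

// searchVariables builds the variables map. An empty searchType omits the
// optional $type variable entirely. Hypothetical helper.
func searchVariables(q, searchType string) map[string]interface{} {
	vars := map[string]interface{}{"q": q}
	if searchType != "" {
		vars["type"] = searchType
	}
	return vars
}

// buildSearchRequest encodes the query and its variables as JSON.
// Hypothetical helper.
func buildSearchRequest(q, searchType string) ([]byte, error) {
	return json.Marshal(graphQLRequest{
		Query:     searchQuery,
		Variables: searchVariables(q, searchType),
	})
}

func main() {
	body, err := buildSearchRequest("example", "SEARCH_SCENE")
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(string(body))
}
```

Both `edges` entries and `node` are nullable in the schema, and the resolver in `resolver_query_search.go` appends a nil edge when hydration fails, so a client should be prepared for null entries in `edges`.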
2 changes: 1 addition & 1 deletion graphql/schema/types/studio.graphql
@@ -1,4 +1,4 @@
type Studio {
type Studio implements SearchResultItem {
id: ID!
checksum: String!
name: String!
2 changes: 1 addition & 1 deletion graphql/schema/types/tag.graphql
@@ -1,4 +1,4 @@
type Tag {
type Tag implements SearchResultItem {
id: ID!
name: String!
aliases: [String!]!
2 changes: 2 additions & 0 deletions pkg/api/resolver.go
@@ -11,6 +11,7 @@ import (
"github.com/stashapp/stash/pkg/models"
"github.com/stashapp/stash/pkg/plugin"
"github.com/stashapp/stash/pkg/scraper"
"github.com/stashapp/stash/pkg/search"
)

var (
@@ -32,6 +33,7 @@ type hookExecutor interface {
type Resolver struct {
txnManager models.TransactionManager
hookExecutor hookExecutor
searchEngine *search.Engine
}

func (r *Resolver) scraperCache() *scraper.Cache {
60 changes: 60 additions & 0 deletions pkg/api/resolver_query_search.go
@@ -0,0 +1,60 @@
package api

import (
"context"
"errors"
"fmt"

"github.com/stashapp/stash/pkg/models"
"github.com/stashapp/stash/pkg/search"
"github.com/stashapp/stash/pkg/search/documents"
)

var ErrUnknownType = errors.New("unknown item type")

func (r *queryResolver) Search(ctx context.Context, query string, ty *models.SearchType) (*models.SearchResultItemConnection, error) {
s, err := r.searchEngine.Search(ctx, query, ty)
if err != nil {
return nil, err
}

var edges []*models.SearchItemEdge
for _, item := range s.Items {
h, err := r.hydrate(ctx, item)
if err != nil {
edges = append(edges, nil)
continue
}

edges = append(edges, &models.SearchItemEdge{
Score: item.Score,
Node: h,
})
}

res := models.SearchResultItemConnection{
Edges: edges,
Took: s.Took.Seconds(),
MaxScore: s.MaxScore,
Total: int(s.Total),

Status: s.Status,
}

return &res, nil
}

func (r *queryResolver) hydrate(ctx context.Context, item search.Item) (models.SearchResultItem, error) {
switch item.Type {
case documents.TypeScene:
return r.FindScene(ctx, &item.ID, nil)
case documents.TypePerformer:
return r.FindPerformer(ctx, item.ID)
case documents.TypeTag:
return r.FindTag(ctx, item.ID)
case documents.TypeStudio:
return r.FindStudio(ctx, item.ID)
default:
return nil, fmt.Errorf("%w: %v", ErrUnknownType, item.Type)
}
}
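
The `default` branch above wraps the `ErrUnknownType` sentinel with `%w`, which lets callers test for it with `errors.Is` even though the message carries the offending type. A minimal standalone sketch of that pattern follows. The `itemType` type, its constants, and the string results are simplified stand-ins for `documents.Type*` and the hydrated models, and `isUnknownType` is a hypothetical helper.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrUnknownType mirrors the sentinel declared in resolver_query_search.go.
var ErrUnknownType = errors.New("unknown item type")

// itemType is a simplified stand-in for the document type used by the PR.
type itemType string

const (
	typeScene itemType = "scene"
	typeTag   itemType = "tag"
)

// hydrate follows the shape of the resolver's switch: known types resolve,
// anything else returns the sentinel wrapped together with the type.
func hydrate(t itemType, id string) (string, error) {
	switch t {
	case typeScene:
		return "scene:" + id, nil
	case typeTag:
		return "tag:" + id, nil
	default:
		return "", fmt.Errorf("%w: %v", ErrUnknownType, t)
	}
}

// isUnknownType reports whether err wraps ErrUnknownType.
func isUnknownType(err error) bool {
	return errors.Is(err, ErrUnknownType)
}

func main() {
	_, err := hydrate("gallery", "42")
	fmt.Println(err)                // unknown item type: gallery
	fmt.Println(isUnknownType(err)) // true
}
```

Note that `Search` currently discards the hydration error and appends a nil edge, so this distinction only becomes useful if the error is logged or surfaced.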
2 changes: 2 additions & 0 deletions pkg/api/server.go
@@ -69,9 +69,11 @@ func Start(uiBox embed.FS, loginUIBox embed.FS) {

txnManager := manager.GetInstance().TxnManager
pluginCache := manager.GetInstance().PluginCache
searchEngine := manager.GetInstance().Search
resolver := &Resolver{
txnManager: txnManager,
hookExecutor: pluginCache,
searchEngine: searchEngine,
}

gqlSrv := gqlHandler.New(models.NewExecutableSchema(models.Config{Resolvers: resolver}))
133 changes: 133 additions & 0 deletions pkg/event/dispatcher.go
@@ -0,0 +1,133 @@
// Package event dispatches change events in stash.
package event

import (
"context"
"fmt"
"sync"
)

// ChangeType defines what has changed
type ChangeType int

const (
SceneMarker ChangeType = iota
Scene
Image
Gallery
Movie
Performer
Studio
Tag
)

func (ct ChangeType) String() string {
switch ct {
case SceneMarker:
return "SceneMarker"
case Scene:
return "Scene"
case Image:
return "Image"
case Gallery:
return "Gallery"
case Movie:
return "Movie"
case Performer:
return "Performer"
case Studio:
return "Studio"
case Tag:
return "Tag"
default:
panic("Unhandled ChangeType case")
}
}

// Change represents a pair of a Type and the ID which has changed
type Change struct {
Type ChangeType
ID int
}

func (c Change) String() string {
return fmt.Sprintf("%v(%v)", c.Type, c.ID)
}

// Dispatcher represents a single event dispatcher
type Dispatcher struct {
incoming chan Change
mu sync.Mutex
chans []chan Change
}

// NewDispatcher creates a new event dispatcher
func NewDispatcher() *Dispatcher {
incoming := make(chan Change, 1)
return &Dispatcher{
incoming: incoming,
}
}

// Start starts the dispatcher goroutine under the given context
func (d *Dispatcher) Start(ctx context.Context) {
go func() {
for {
select {
case <-ctx.Done():
return
case change := <-d.incoming:
d.broadcast(change)
}
}
}()
}

// Register registers chan for receiving events. It is up to the caller to ensure
// the channel doesn't block. I.e., events must be lifted from chan quickly.
func (d *Dispatcher) Register(c chan Change) {
d.mu.Lock()
defer d.mu.Unlock()

d.chans = append(d.chans, c)
}

// Unregister removes a channel for dispatches
func (d *Dispatcher) Unregister(c chan Change) {
d.mu.Lock()
defer d.mu.Unlock()

// Find the index at which the chan occurs
idx := -1
for i := range d.chans {
if d.chans[i] == c {
idx = i
break
}
}

// If already unregistered, ignore
if idx == -1 {
return
}

// Move the last element into the freed slot, then truncate the slice
last := len(d.chans) - 1
d.chans[idx] = d.chans[last]
d.chans = d.chans[:last]
}

// Publish broadcasts a change
func (d *Dispatcher) Publish(c Change) {
d.incoming <- c
}

// broadcast fans out a change to all registered channels
func (d *Dispatcher) broadcast(change Change) {
d.mu.Lock()
defer d.mu.Unlock()

for _, ch := range d.chans {
ch <- change
}
}
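
A usage sketch for the dispatcher, under simplifying assumptions: the types below are a condensed copy of the ones in this file (`Change.Type` is a plain string here instead of the `ChangeType` enum, and `Unregister` is omitted), and `collect` is a hypothetical helper that exists only for this example. It shows the register, start, publish, receive sequence with a subscriber channel buffered large enough that `broadcast` never blocks.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// Change is simplified: the PR uses a ChangeType enum for Type.
type Change struct {
	Type string
	ID   int
}

// Dispatcher is a condensed copy of the type in pkg/event/dispatcher.go.
type Dispatcher struct {
	incoming chan Change
	mu       sync.Mutex
	chans    []chan Change
}

func NewDispatcher() *Dispatcher {
	return &Dispatcher{incoming: make(chan Change, 1)}
}

// Start runs the fan-out loop until ctx is cancelled.
func (d *Dispatcher) Start(ctx context.Context) {
	go func() {
		for {
			select {
			case <-ctx.Done():
				return
			case c := <-d.incoming:
				d.broadcast(c)
			}
		}
	}()
}

func (d *Dispatcher) Register(c chan Change) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.chans = append(d.chans, c)
}

func (d *Dispatcher) Publish(c Change) { d.incoming <- c }

// broadcast sends while holding the mutex, so a subscriber that stops
// draining its channel stalls Publish and Register as well.
func (d *Dispatcher) broadcast(c Change) {
	d.mu.Lock()
	defer d.mu.Unlock()
	for _, ch := range d.chans {
		ch <- c
	}
}

// collect publishes changes and returns what a single subscriber received.
// Hypothetical helper for this sketch.
func collect(changes []Change) []Change {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel() // stops the dispatcher goroutine

	d := NewDispatcher()
	sub := make(chan Change, len(changes)) // buffered so broadcast never blocks
	d.Register(sub)
	d.Start(ctx)

	for _, c := range changes {
		d.Publish(c)
	}

	got := make([]Change, 0, len(changes))
	for range changes {
		got = append(got, <-sub)
	}
	return got
}

func main() {
	got := collect([]Change{{Type: "Scene", ID: 1}, {Type: "Tag", ID: 7}})
	fmt.Println(got) // [{Scene 1} {Tag 7}]
}
```

Order is preserved because a single goroutine drains a single FIFO channel. The blocking send in `broadcast` trades liveness for delivery: a non-blocking send (a `select` with a `default` case) would keep the dispatcher responsive but drop events for slow subscribers, which may or may not be acceptable for index updates.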