### NetBox version

v3.3.4

### Feature type

Change to existing functionality

### Proposed functionality

This issue is an evolution of the proposal initially outlined in #7016 to improve NetBox's global search functionality, in conjunction with dynamic registration per #8927. This proposal suggests the introduction of a new global search cache, allowing all matching objects to be returned by a single database query per search. As this is a fairly complex topic, I've outlined some core areas of focus below. While not a complete implementation plan, it should be sufficient to get started and generate additional discussion.
#### Caching

Each registered model will declare which of its fields should be cached for search. (This could be done similarly to what we currently do with `clone_fields` and the `clone()` method.) Fields would be prescribed by name and numeric weighting. For example:
```python
class Book(NetBoxModel):
    title = models.CharField()
    isbn = models.PositiveBigIntegerField()
    author = models.CharField()
    description = models.CharField()

    search_fields = (
        ('title', 200),
        ('isbn', 200),
        ('author', 150),
        ('description', 100),
    )

    def cache(self):
        data = []
        for field_name, weight in getattr(self, 'search_fields', []):
            field = self._meta.get_field(field_name)
            value = field.value_from_object(self)
            if value not in (None, ''):
                data.append(SearchResult(object=self, field=field_name, value=value, weight=weight))
        return data
```
On `save()`, the model's `cache()` method (name TBD) would be called via a `post_save` signal handler to generate cache data. A new low-priority background task would then be created to feed this data into the search results table. (Any existing results for the referenced object would first be deleted.) Similarly, search results will be automatically deleted in response to a `post_delete` signal.
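The delete-then-insert flow described above can be sketched in plain Python, using a list of dicts to stand in for the search results table. (In NetBox this would run in a background task triggered by the `post_save` handler; the helper name `recache_object()` is hypothetical.)

```python
# Sketch of the re-cache-on-save flow, with the search results table
# simulated as a list of row dicts rather than a real database table.

def recache_object(table, object_type, object_id, new_rows):
    """Replace all cached rows for one object with freshly generated rows."""
    # Any existing results for the referenced object are deleted first
    table[:] = [
        row for row in table
        if not (row['object_type'] == object_type and row['object_id'] == object_id)
    ]
    # The new cache data is then inserted
    table.extend(new_rows)

table = [
    {'object_type': 'dcim.Device', 'object_id': 441, 'field': 'name', 'value': 'akron-rtr1', 'weight': 200},
]
recache_object(table, 'dcim.Device', 441, [
    {'object_type': 'dcim.Device', 'object_id': 441, 'field': 'name', 'value': 'akron-rtr2', 'weight': 200},
    {'object_type': 'dcim.Device', 'object_id': 441, 'field': 'serial', 'value': 'A4890274', 'weight': 180},
])
print(len(table))  # 2 — the stale 'akron-rtr1' row was removed
```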
#### Database Schema

Cached search data from all models would be written to a single table:

| Field | Type | Description |
|---|---|---|
| timestamp | Datetime | Timestamp of most recent update |
| object_type | FK(ContentType) | GenericForeignKey component |
| object_id | Integer | GenericForeignKey component |
| field | Char | Name of the field/attribute being cached |
| value | Char | Cached value |
| weight | Integer | Numeric weight assigned to the field |
The `object_type` and `object_id` fields would serve a GenericForeignKey named `object`, which references the cached object.
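For illustration, one row of this table could be modeled as a plain dataclass. (This is only a sketch mirroring the schema above; the real implementation would be a Django model whose `object_type`/`object_id` pair backs the `object` GenericForeignKey.)

```python
from dataclasses import dataclass
from datetime import datetime

# Plain-Python sketch of one row in the proposed search results table
@dataclass
class CachedSearchResult:
    timestamp: datetime   # timestamp of most recent update
    object_type: str      # stand-in for FK(ContentType)
    object_id: int        # together with object_type, identifies the object
    field: str            # name of the cached field/attribute
    value: str            # cached value
    weight: int           # numeric weight assigned to the field

row = CachedSearchResult(datetime(2022, 9, 15, 1, 23), 'dcim.Device', 441, 'name', 'akron-rtr1', 200)
```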
A populated table might look like this:
| timestamp | object_type | object_id | field | value | weight |
|---|---|---|---|---|---|
| 2022-09-15 1:23 | dcim.Device | 441 | name | akron-rtr1 | 200 |
| 2022-09-15 1:23 | dcim.Device | 441 | serial | A4890274 | 180 |
| 2022-09-15 1:23 | dcim.Device | 441 | asset_tag | H302R8E | 180 |
| 2022-09-15 1:23 | dcim.Device | 441 | comments | Some text goes here | 50 |
| 2022-09-15 3:08 | dcim.Site | 17 | name | Akron | 200 |
| 2022-09-15 3:08 | dcim.Site | 17 | facility | us-oh-akron01 | 150 |
| 2022-09-15 3:08 | dcim.Site | 17 | description | Primary DC for US-East | 50 |
| 2022-09-15 3:08 | dcim.Site | 17 | physical_address | 123 Fake St Akron OH | 80 |
Searching for "akron" would return four rows. We can append `.distinct('object_type', 'object_id')` to ensure only a single row is returned per object, and we can use `.order_by('-weight')` to favor the most important result for each object. (We might further order by object type for consistency among objects with identical weights.)
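The intended distinct-per-object, highest-weight-first behavior can be simulated in plain Python against a few rows from the example table. (The `search()` helper below is only illustrative; the real implementation would express this via the ORM.)

```python
def search(table, query):
    """Return the single best (highest-weight) matching row per object."""
    query = query.lower()
    matches = [row for row in table if query in row['value'].lower()]
    # Sort highest weight first, so the first row seen per object is its best match
    matches.sort(key=lambda row: -row['weight'])
    best = {}
    for row in matches:
        key = (row['object_type'], row['object_id'])
        best.setdefault(key, row)  # keep only the first (best) row per object
    return list(best.values())

table = [
    {'object_type': 'dcim.Device', 'object_id': 441, 'field': 'name', 'value': 'akron-rtr1', 'weight': 200},
    {'object_type': 'dcim.Device', 'object_id': 441, 'field': 'comments', 'value': 'Some text goes here', 'weight': 50},
    {'object_type': 'dcim.Site', 'object_id': 17, 'field': 'name', 'value': 'Akron', 'weight': 200},
    {'object_type': 'dcim.Site', 'object_id': 17, 'field': 'facility', 'value': 'us-oh-akron01', 'weight': 150},
]
results = search(table, 'akron')
print([(r['object_type'], r['field']) for r in results])  # one best row per matching object
```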
#### Matching Logic

We could potentially add an `exact` boolean column to the table, indicating whether each result requires an exact (vs. partial) match. This could be useful for e.g. integer values, where partial matching is typically of little value. For example, we might only want to find exact matches for a device's `serial` or `asset_tag` values. Such a query would look like this:

```python
SearchResult.objects.filter(Q(value__iexact='foo') | Q(exact=False, value__icontains='foo'))
```
It remains to be seen what the performance penalty of this approach looks like. We could also expose exactness as a toggle, enabling the user to search only for exact matches.
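The matching rule expressed by that query can be sketched as a plain-Python predicate (illustrative only; the real query would run in the database):

```python
def matches(row, query):
    """Mirror Q(value__iexact=q) | Q(exact=False, value__icontains=q)."""
    value, query = row['value'].lower(), query.lower()
    if row['exact']:
        return value == query   # exact-only fields require a full match
    return query in value       # other fields allow partial matching

# An exact-only serial number matches fully, but not partially
print(matches({'value': 'A4890274', 'exact': True}, 'a4890274'))  # True
print(matches({'value': 'A4890274', 'exact': True}, '4890'))      # False
```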
#### Displaying Results

Each matching result will include several attributes:

- The object referenced (with a link)
- The field name on which the match occurred
- The field value, or the matched portion of the value

These can be displayed to the user to convey a succinct understanding of why each object was included in the results. Although resolving each object requires a GenericForeignKey lookup, this should be automatically reduced via `prefetch_related()` to a single additional query per type of object returned.
#### Handling Model Migrations

Some housekeeping will be necessary to delete cached search results which reference fields that have been removed from their models. We should be able to hook into the `post_migrate` signal to detect when migrations have been applied, and bulk delete any entries which reference a field that is no longer listed under its model's `search_fields` attribute. A similar approach may be used to detect removed models (e.g. because a plugin was uninstalled).
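That cleanup step could be sketched as follows. (Plain Python; `registered_fields` maps each model label to its currently registered `search_fields` names, and the function name is hypothetical.)

```python
def prune_stale_entries(table, registered_fields):
    """Drop cached rows whose model or field is no longer registered for search."""
    table[:] = [
        row for row in table
        if row['field'] in registered_fields.get(row['object_type'], ())
    ]

table = [
    {'object_type': 'dcim.Device', 'object_id': 441, 'field': 'name', 'value': 'akron-rtr1'},
    {'object_type': 'dcim.Device', 'object_id': 441, 'field': 'old_field', 'value': 'x'},  # field removed from model
    {'object_type': 'plugin.Widget', 'object_id': 1, 'field': 'name', 'value': 'w'},       # plugin uninstalled
]
prune_stale_entries(table, {'dcim.Device': ('name', 'serial')})
print(len(table))  # 1 — only the still-registered dcim.Device 'name' row survives
```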
#### Considerations

##### Advantages
- No new PostgreSQL extensions or other dependencies are introduced.
- Both native and plugin models can be registered automatically.
- All results can be returned via a single SQL query, and rendered as a single table.
- An unlimited number of results can be paginated.
- The search logic to be applied can be specified by the user (e.g. exact match vs. partial match vs. "starts with," etc.).
- Employing a background worker ensures that the caching of new results does not impact real-time operations.
- The entire search cache can be rebuilt offline if needed.
- The `timestamp` column can be compared against an object's `last_updated` time to identify stale results.
##### Disadvantages
- We're essentially inventing our own search engine (humble as it may be).
- This approach does not solve for matching by related objects; however, it should be noted that this is generally not a feature of the current search function, so this is likely a reasonable compromise.
- Using a single large table for all results may degrade search performance over time as object counts increase.
### Use case

The overall goal here is to provide more robust general-purpose search for both core NetBox models and those delivered via plugins. Performance, while important, is probably less important than implementing a consistent, flexible search engine.
### Database changes

Defined above for clarity

### External dependencies

None