opensearch-project
diff --git a/_search-plugins/search-pipelines/rerank-processor.md b/_search-plugins/search-pipelines/rerank-processor.md
Lines changed: 97 additions & 21 deletions
diff --git a/_search-plugins/search-relevance/rerank-by-field.md b/_search-plugins/search-relevance/rerank-by-field.md
Lines changed: 208 additions & 0 deletions
@@ -11,33 +11,49 @@ grand_parent: Search pipelines
 Introduced 2.12
 {: .label .label-purple }
 
-The `rerank` search request processor intercepts search results and passes them to a cross-encoder model to be reranked. The model reranks the results, taking into account the scoring context. Then the processor orders documents in the search results based on their new scores.
+The `rerank` search response processor intercepts and reranks search results. The processor orders documents in the search results based on their new scores. 
+
+OpenSearch supports the following rerank types.
+
+Type | Description | Earliest available version
+:--- | :--- | :---
+[`ml_opensearch`](#the-ml_opensearch-rerank-type) | Applies an OpenSearch-provided cross-encoder model. | 2.12
+[`by_field`](#the-by_field-rerank-type) | Applies reranking based on a user-provided field. | 2.18
 
 ## Request body fields
 
 The following table lists all available request fields.
 
-Field | Data type | Description
-:--- | :--- | :---
-`<reranker_type>` | Object | The reranker type provides the rerank processor with static information needed across all reranking calls. Required.
-`context` | Object | Provides the rerank processor with information necessary for generating reranking context at query time.
-`tag` | String | The processor's identifier. Optional.
-`description` | String | A description of the processor. Optional.
-`ignore_failure` | Boolean | If `true`, OpenSearch [ignores any failure]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/creating-search-pipeline/#ignoring-processor-failures) of this processor and continues to run the remaining processors in the search pipeline. Optional. Default is `false`.
+Field | Data type | Required/Optional | Description
+:--- | :--- | :--- | :---
+`<rerank_type>` | Object | Required | The rerank type for document reranking. Valid values are `ml_opensearch` and `by_field`.
+`context` | Object |  Required for the `ml_opensearch` rerank type. Optional and does not affect the results for the `by_field` rerank type. | Provides the `rerank` processor with information necessary for reranking at query time. 
+`tag` | String | Optional | The processor's identifier.
+`description` | String | Optional | A description of the processor.
+`ignore_failure` | Boolean | Optional | If `true`, OpenSearch [ignores any failure]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/creating-search-pipeline/#ignoring-processor-failures) of this processor and continues to run the remaining processors in the search pipeline. Default is `false`.
+
+<!-- vale off -->
+## The ml_opensearch rerank type
+<!-- vale on -->
+Introduced 2.12
+{: .label .label-purple }
 
-### The `ml_opensearch` reranker type
+To rerank results using a cross-encoder model, specify the `ml_opensearch` rerank type.
 
-The `ml_opensearch` reranker type is designed to work with the cross-encoder model provided by OpenSearch. For this reranker type, specify the following fields.
+### Prerequisite
+
+Before using the `ml_opensearch` rerank type, you must configure a cross-encoder model. For information about using an OpenSearch-provided model, see [Cross-encoder models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#cross-encoder-models). For information about using a custom model, see [Custom local models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/custom-local-models/).
+
+The `ml_opensearch` rerank type supports the following fields. All fields are required.
 
 Field  | Data type | Description
 :--- | :---  | :--- 
-`ml_opensearch` | Object | Provides the rerank processor with model information. Required.
-`ml_opensearch.model_id` | String | The model ID for the cross-encoder model. Required. For more information, see [Using ML models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/).
-`context.document_fields` | Array | An array of document fields that specifies the fields from which to retrieve context for the cross-encoder model. Required.
+`ml_opensearch.model_id` | String | The model ID of the cross-encoder model for reranking. For more information, see [Using ML models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/).
+`context.document_fields` | Array | An array of document fields that specifies the fields from which to retrieve context for the cross-encoder model. 
 
-## Example 
+### Example 
 
-The following example demonstrates using a search pipeline with a `rerank` processor.
+The following example demonstrates using a search pipeline with a `rerank` processor implemented using the `ml_opensearch` rerank type. For a complete example, see [Reranking using a cross-encoder model]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/rerank-cross-encoder/).
 
 ### Creating a search pipeline
 
@@ -108,11 +124,71 @@ POST /_search?search_pipeline=rerank_pipeline
 ```
 {% include copy-curl.html %}
 
-The `query_context` object contains the following fields. 
+The `query_context` object contains the following fields. You must provide either `query_text` or `query_text_path` but cannot provide both simultaneously.
+
+Field name | Required/Optional | Description
+:--- | :--- | :---  
+`query_text` | Exactly one of `query_text` or `query_text_path` is required. | The natural language text of the question that you want to use to rerank the search results. 
+`query_text_path` | Exactly one of `query_text` or `query_text_path` is required. | The full JSON path to the text of the question that you want to use to rerank the search results. The maximum number of characters allowed in the path is `1000`.
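+
+As a sketch of the `query_text_path` option, the following request reuses the text of the `match` query instead of repeating it in `query_text`. The `my-index` index name, the `passage_text` field, and the query text are assumptions made for illustration; the path must match the structure of your own query:
+
+```json
+POST /my-index/_search?search_pipeline=rerank_pipeline
+{
+  "query": {
+    "match": {
+      "passage_text": {
+        "query": "What is the capital of Australia?"
+      }
+    }
+  },
+  "ext": {
+    "rerank": {
+      "query_context": {
+        "query_text_path": "query.match.passage_text.query"
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}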
+
+
+<!-- vale off -->
+## The by_field rerank type
+<!-- vale on -->
+Introduced 2.18
+{: .label .label-purple }
+
+To rerank results by a document field, specify the `by_field` rerank type.
+
+The `by_field` object supports the following fields.
+
+Field  | Data type | Required/Optional | Description
+:--- | :---  | :--- | :--- 
+`target_field` | String | Required |  Specifies the field name or a dot path to the field containing the score to use for reranking. 
+`remove_target_field` | Boolean | Optional | If `true`, the response does not include the `target_field` used to perform reranking. Default is `false`.
+`keep_previous_score` | Boolean | Optional | If `true`, the response includes a `previous_score` field, which contains the score calculated before reranking and can be useful when debugging. Default is `false`.
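+
+For instance, a pipeline that reranks by `reviews.stars` and then omits that field from the returned documents might look like the following sketch. The pipeline name `rerank_byfield_hidden_pipeline` is hypothetical, and the `reviews.stars` field follows the example used later in this section:
+
+```json
+PUT /_search/pipeline/rerank_byfield_hidden_pipeline
+{
+  "response_processors": [
+    {
+      "rerank": {
+        "by_field": {
+          "target_field": "reviews.stars",
+          "remove_target_field": true
+        }
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}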
+
+### Example 
+
+The following example demonstrates using a search pipeline with a `rerank` processor implemented using the `by_field` rerank type. For a complete example, see [Reranking by a document field]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/rerank-by-field/).
+
+### Creating a search pipeline
+
+The following request creates a search pipeline with a `rerank` response processor that uses the `by_field` rerank type. The processor ranks the documents by the `reviews.stars` field and, because `keep_previous_score` is `true`, returns each document's original score:
+
+```json
+PUT /_search/pipeline/rerank_byfield_pipeline
+{
+  "response_processors": [
+    {
+      "rerank": {
+        "by_field": {
+          "target_field": "reviews.stars",
+          "keep_previous_score" : true
+        }
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+### Using the search pipeline
+
+To apply the search pipeline to a query, provide the pipeline name in the `search_pipeline` query parameter:
+
+```json
+POST /book-index/_search?search_pipeline=rerank_byfield_pipeline
+{
+  "query": {
+     "match_all": {}
+  }
+}
+```
+{% include copy-curl.html %}
 
-Field name  | Description
-:--- | :---  
-`query_text` | The natural language text of the question that you want to use to rerank the search results. Either `query_text` or `query_text_path` (not both) is required.
-`query_text_path` | The full JSON path to the text of the question that you want to use to rerank the search results. Either `query_text` or `query_text_path` (not both) is required. The maximum number of characters in the path is `1000`.
+## Next steps
 
-For more information about setting up reranking, see [Reranking search results]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/reranking-search-results/).
+- Learn more about [reranking search results]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/reranking-search-results/).
+- See a complete example of [reranking using a cross-encoder model]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/rerank-cross-encoder/).
+- See a complete example of [reranking by a document field]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/rerank-by-field/).
@@ -0,0 +1,208 @@
+---
+layout: default
+title: Reranking by a field
+parent: Reranking search results
+grand_parent: Search relevance
+has_children: false
+nav_order: 20
+---
+
+# Reranking search results by a field
+Introduced 2.18
+{: .label .label-purple }
+
+You can use a `by_field` rerank type to rerank search results by a document field. Reranking search results by a field is useful if a model has already run and produced a numerical score for your documents or if a previous search response processor was applied and you want to rerank documents differently based on an aggregated field.
+
+To implement reranking, you need to configure a [search pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/index/) that runs at search time. The search pipeline intercepts search results and applies the [`rerank` processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/) to them. The `rerank` processor evaluates the search results and sorts them based on the new scores obtained from a document field. 
+
+## Running a search with reranking
+
+To run a search with reranking, follow these steps:
+
+1. [Configure a search pipeline](#step-1-configure-a-search-pipeline).
+1. [Create an index for ingestion](#step-2-create-an-index-for-ingestion).
+1. [Ingest documents into the index](#step-3-ingest-documents-into-the-index).
+1. [Search using reranking](#step-4-search-using-reranking).
+
+## Step 1: Configure a search pipeline
+
+Configure a search pipeline with a [`rerank` processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/) and specify the `by_field` rerank type. The pipeline sorts by the `reviews.stars` field (specified by a complete dot path to the field) and returns the original query scores for all documents along with their new scores:
+
+```json
+PUT /_search/pipeline/rerank_byfield_pipeline
+{
+  "response_processors": [
+    {
+      "rerank": {
+        "by_field": {
+          "target_field": "reviews.stars",
+          "keep_previous_score" : true
+        }
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+For more information about the request fields, see [Request body fields]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/#request-body-fields).
+
+## Step 2: Create an index for ingestion
+
+To use the `rerank` processor defined in your pipeline, create an OpenSearch index and set the pipeline created in the previous step as the index's default search pipeline:
+
+```json
+PUT /book-index
+{
+  "settings": {
+    "index.search.default_pipeline" : "rerank_byfield_pipeline"
+  },
+  "mappings": {
+    "properties": {
+      "title": {
+        "type": "text"
+      },
+      "author": {
+        "type": "text"
+      },
+      "genre": {
+        "type": "keyword"
+      },
+      "reviews": {
+        "properties": {
+          "stars": {
+            "type": "float"
+          }
+        }
+      },
+      "description": {
+        "type": "text"
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
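+
+If the index already exists, the default search pipeline can instead be set by updating the index settings. The following sketch assumes that `book-index` was created earlier without the `index.search.default_pipeline` setting:
+
+```json
+PUT /book-index/_settings
+{
+  "index.search.default_pipeline": "rerank_byfield_pipeline"
+}
+```
+{% include copy-curl.html %}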
+
+## Step 3: Ingest documents into the index
+
+To ingest documents into the index created in the previous step, send the following bulk request:
+
+```json
+POST /_bulk
+{ "index": { "_index": "book-index", "_id": "1" } }
+{ "title": "The Lost City", "author": "Jane Doe", "genre": "Adventure Fiction", "reviews": { "stars": 4.2 }, "description": "An exhilarating journey through a hidden civilization in the Amazon rainforest." }
+{ "index": { "_index": "book-index", "_id": "2" } }
+{ "title": "Whispers of the Past", "author": "John Smith", "genre": "Historical Mystery", "reviews": { "stars": 4.7 }, "description": "A gripping tale set in Victorian England, unraveling a century-old mystery." }
+{ "index": { "_index": "book-index", "_id": "3" } }
+{ "title": "Starlit Dreams", "author": "Emily Clark", "genre": "Science Fiction", "reviews": { "stars": 4.5 }, "description": "In a future where dreams can be shared, one girl discovers her imaginations power." }
+{ "index": { "_index": "book-index", "_id": "4" } }
+{ "title": "The Enchanted Garden", "author": "Alice Green", "genre": "Fantasy", "reviews": { "stars": 4.8 }, "description": "A magical garden holds the key to a young girls destiny and friendship." }
+
+```
+{% include copy-curl.html %}
+
+## Step 4: Search using reranking
+
+As an example, run a `match_all` query on your index:
+
+```json
+POST /book-index/_search
+{
+  "query": {
+     "match_all": {}
+  }
+}
+```
+{% include copy-curl.html %}
+
+The response contains documents sorted in descending order based on the `reviews.stars` field. Each document contains the original query score in the `previous_score` field:
+
+```json
+{
+  "took": 33,
+  "timed_out": false,
+  "_shards": {
+    "total": 1,
+    "successful": 1,
+    "skipped": 0,
+    "failed": 0
+  },
+  "hits": {
+    "total": {
+      "value": 4,
+      "relation": "eq"
+    },
+    "max_score": 4.8,
+    "hits": [
+      {
+        "_index": "book-index",
+        "_id": "4",
+        "_score": 4.8,
+        "_source": {
+          "reviews": {
+            "stars": 4.8
+          },
+          "author": "Alice Green",
+          "genre": "Fantasy",
+          "description": "A magical garden holds the key to a young girls destiny and friendship.",
+          "previous_score": 1,
+          "title": "The Enchanted Garden"
+        }
+      },
+      {
+        "_index": "book-index",
+        "_id": "2",
+        "_score": 4.7,
+        "_source": {
+          "reviews": {
+            "stars": 4.7
+          },
+          "author": "John Smith",
+          "genre": "Historical Mystery",
+          "description": "A gripping tale set in Victorian England, unraveling a century-old mystery.",
+          "previous_score": 1,
+          "title": "Whispers of the Past"
+        }
+      },
+      {
+        "_index": "book-index",
+        "_id": "3",
+        "_score": 4.5,
+        "_source": {
+          "reviews": {
+            "stars": 4.5
+          },
+          "author": "Emily Clark",
+          "genre": "Science Fiction",
+          "description": "In a future where dreams can be shared, one girl discovers her imaginations power.",
+          "previous_score": 1,
+          "title": "Starlit Dreams"
+        }
+      },
+      {
+        "_index": "book-index",
+        "_id": "1",
+        "_score": 4.2,
+        "_source": {
+          "reviews": {
+            "stars": 4.2
+          },
+          "author": "Jane Doe",
+          "genre": "Adventure Fiction",
+          "description": "An exhilarating journey through a hidden civilization in the Amazon rainforest.",
+          "previous_score": 1,
+          "title": "The Lost City"
+        }
+      }
+    ]
+  },
+  "profile": {
+    "shards": []
+  }
+}
+```
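+
+To compare this order with the results before reranking, you can bypass the index's default pipeline for a single request by setting the `search_pipeline` query parameter to `_none`. This is a sketch that assumes the `book-index` setup from the previous steps:
+
+```json
+POST /book-index/_search?search_pipeline=_none
+{
+  "query": {
+     "match_all": {}
+  }
+}
+```
+{% include copy-curl.html %}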
+
+## Next steps
+
+- Learn more about the [`rerank` processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/).