
Commit c976e68

[DOCS] Add concept of trained models
1 parent be228b5 commit c976e68

File tree

5 files changed, +52 −30 lines changed

(binary image file, 144 KB)

docs/en/stack/ml/df-analytics/index.asciidoc

Lines changed: 1 addition & 0 deletions

@@ -13,6 +13,7 @@ include::ml-dfanalytics-evaluate.asciidoc[leveloffset=+2]
 include::ml-feature-encoding.asciidoc[leveloffset=+2]
 include::ml-feature-importance.asciidoc[leveloffset=+2]
 include::hyperparameters.asciidoc[leveloffset=+2]
+include::ml-trained-models.asciidoc[leveloffset=+2]
 
 include::ml-dfanalytics-apis.asciidoc[leveloffset=+1]
 

docs/en/stack/ml/df-analytics/ml-inference.asciidoc

Lines changed: 14 additions & 30 deletions

@@ -7,7 +7,7 @@ experimental::[]
 {infer-cap} is a {ml} feature that enables you to use supervised {ml} processes
 – like <<dfa-regression>> or <<dfa-classification>> – not only as a batch
 analysis but in a continuous fashion. This means that {infer} makes it possible
-to use trained {ml} models against incoming data.
+to use <<ml-trained-models,trained {ml} models>> against incoming data.
 
 For instance, suppose you have an online service and you would like to predict
 whether a customer is likely to churn. You have an index with historical data –
@@ -19,32 +19,16 @@ trained the model on, and get a prediction.
 
 Let's take a closer look at the machinery behind {infer}.
 
-
-[[ml-inference-models]]
-== Trained {ml} models as functions
-
-When you create a {dfanalytics-job} that executes a supervised process, you need
-to train a {ml} model on a training dataset to be able to make predictions on
-data points that the model has never seen. The models that are created by
-{dfanalytics} are stored as {es} documents in internal indices. In other words,
-the characteristics of your trained models are saved and ready to be used as
-functions.
-
-Alternatively, you can use a pre-trained language identification model to
-determine the language of text. {lang-ident-cap} supports 109 languages. For
-more information and configuration details, check the <<ml-lang-ident>> page.
-
-
 [[ml-inference-processor]]
 == {infer-cap} processor
 
 {infer-cap} can be used as a processor specified in an
-{ref}/pipeline.html[ingest pipeline]. It uses a stored {dfanalytics} model to
-infer against the data that is being ingested in the pipeline. The model is used
-on the {ref}/ingest.html[ingest node]. {infer-cap} pre-processes the data by
-using the model and provides a prediction. After the process, the pipeline
-continues executing (if there is any other processor in the pipeline), finally
-the new data together with the results are indexed into the destination index.
+{ref}/pipeline.html[ingest pipeline]. It uses a trained model to infer against
+the data that is being ingested in the pipeline. The model is used on the
+{ref}/ingest.html[ingest node]. {infer-cap} pre-processes the data by using the
+model and provides a prediction. After the process, the pipeline continues
+executing (if there is any other processor in the pipeline), finally the new
+data together with the results are indexed into the destination index.
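The processor workflow described in the changed paragraph can be illustrated with a minimal pipeline definition. This sketch is not part of the commit: the pipeline name, model ID (`churn-model`), and target field are hypothetical, and the authoritative list of options is the {infer} processor reference linked in the docs.

```console
PUT _ingest/pipeline/churn-predictions
{
  "description": "Adds a churn prediction to each ingested document",
  "processors": [
    {
      "inference": {
        "model_id": "churn-model",
        "target_field": "ml.inference.churn",
        "inference_config": {
          "classification": {}
        }
      }
    }
  ]
}
```

Documents indexed through this pipeline (for example with `?pipeline=churn-predictions`) would then carry the model's prediction in the configured target field alongside the original data.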
 
 Check the {ref}/inference-processor.html[{infer} processor] and
 {ref}/ml-df-analytics-apis.html[the {ml} {dfanalytics} API documentation] to
@@ -55,14 +39,14 @@ learn more about the feature.
 == {infer-cap} aggregation
 
 {infer-cap} can also be used as a pipeline aggregation. You can reference a
-pre-trained {dfanalytics} model in the aggregation to infer on the result field
-of the parent bucket aggregation. The {infer} aggregation uses the model on the
-results to provide a prediction. This aggregation enables you to run
-{classification} or {reganalysis} at search time. If you want to perform the
-analysis on a small set of data, this aggregation enables you to generate
-predictions without the need to set up a processor in the ingest pipeline.
+trained model in the aggregation to infer on the result field of the parent
+bucket aggregation. The {infer} aggregation uses the model on the results to
+provide a prediction. This aggregation enables you to run {classification} or
+{reganalysis} at search time. If you want to perform the analysis on a small set
+of data, this aggregation enables you to generate predictions without the need
+to set up a processor in the ingest pipeline.
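The search-time usage described above can be sketched as an {infer} bucket aggregation nested under a parent bucket aggregation. This example is not part of the commit: the index, field, and model names are hypothetical, and the `buckets_path` keys must match the feature names the referenced model was trained on.

```console
GET customers/_search
{
  "size": 0,
  "aggs": {
    "per_customer": {
      "terms": { "field": "customer_id" },
      "aggs": {
        "total_spend": { "sum": { "field": "price" } },
        "churn_prediction": {
          "inference": {
            "model_id": "churn-model",
            "buckets_path": {
              "total_spend": "total_spend"
            }
          }
        }
      }
    }
  }
}
```

Here the model predicts per bucket, using each customer's aggregated spend as a feature, without any ingest pipeline being involved.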
 
 Check the
 {ref}/search-aggregations-pipeline-inference-bucket-aggregation.html[{infer} bucket aggregation]
 and {ref}/ml-df-analytics-apis.html[the {ml} {dfanalytics} API documentation] to
-learn more about the feature.
\ No newline at end of file
+learn more about the feature.
docs/en/stack/ml/df-analytics/ml-trained-models.asciidoc (new file)

Lines changed: 32 additions & 0 deletions

@@ -0,0 +1,32 @@
+[role="xpack"]
+[[ml-trained-models]]
+= Trained models
+
+experimental::[]
+
+When you use a {dfanalytics-job} to perform {classification} or {reganalysis},
+it creates a {ml} model that is trained and tested against a labelled data set.
+In particular, the data set contains the correct values for a field (known as a
+_dependent variable_) that you want to ultimately predict based on its
+relationships to other fields (known as _feature variables_) in your data. When
+you are satisfied with your trained model, you can use it to make predictions
+against new data. For example, you can use it in the preprocessor of an ingest
+pipeline or in a pipeline aggregation within a search query. For more
+information about this process, see <<ml-supervised-workflow>> and
+<<ml-inference>>.
+
+You can also supply trained models that are not created by {dfanalytics-job} but
+adhere to the appropriate https://github.com/elastic/ml-json-schemas[JSON schema].
+If you want to use these trained models in the {stack}, you must store them in
+{es} documents in internal indices by using the
+{ref}/put-inference.html[create trained model API].
+
+In {kib}, you can view and manage your trained models within
+*{ml-app}* > *Data Frame Analytics*:
+
+[role="screenshot"]
+image::images/trained-model-management.png["List of trained models in the {ml-app} app in {kib}"]
+
+Alternatively, you can use the appropriate
+{ref}/ml-df-analytics-apis.html[{ml} APIs].
+

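The "create trained model API" mentioned in the new page can be sketched as follows. This example is not part of the commit: the model ID and the one-split decision tree are hypothetical, and the authoritative body schema is the linked put-inference reference documentation.

```console
PUT _ml/inference/my-minimal-model
{
  "input": {
    "field_names": ["total_spend"]
  },
  "definition": {
    "trained_model": {
      "tree": {
        "feature_names": ["total_spend"],
        "target_type": "regression",
        "tree_structure": [
          { "node_index": 0, "split_feature": 0, "threshold": 100.0,
            "left_child": 1, "right_child": 2 },
          { "node_index": 1, "leaf_value": 0.0 },
          { "node_index": 2, "leaf_value": 1.0 }
        ]
      }
    }
  }
}
```

A model stored this way lands in the internal {es} indices described above and can then be referenced by ID from an {infer} processor or aggregation.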
docs/en/stack/ml/redirects.asciidoc

Lines changed: 5 additions & 0 deletions

@@ -12,3 +12,8 @@ This page has moved. See <<ml-datafeeds>>.
 === Performing population analysis
 
 This page has moved. See <<ml-configuring-populations>>.
+
+[role="exclude",id="ml-inference-models"]
+=== Trained {ml} models as functions
+
+This content has moved. See <<ml-trained-models>>.

0 commit comments
