
Commit c976e68

[DOCS] Add concept of trained models
1 parent be228b5 commit c976e68

File tree

5 files changed, +52 −30 lines changed

(binary image file, 144 KB)

docs/en/stack/ml/df-analytics/index.asciidoc

Lines changed: 1 addition & 0 deletions

@@ -13,6 +13,7 @@ include::ml-dfanalytics-evaluate.asciidoc[leveloffset=+2]
 include::ml-feature-encoding.asciidoc[leveloffset=+2]
 include::ml-feature-importance.asciidoc[leveloffset=+2]
 include::hyperparameters.asciidoc[leveloffset=+2]
+include::ml-trained-models.asciidoc[leveloffset=+2]
 
 include::ml-dfanalytics-apis.asciidoc[leveloffset=+1]
 

docs/en/stack/ml/df-analytics/ml-inference.asciidoc

Lines changed: 14 additions & 30 deletions

@@ -7,7 +7,7 @@ experimental::[]
 {infer-cap} is a {ml} feature that enables you to use supervised {ml} processes
 – like <<dfa-regression>> or <<dfa-classification>> – not only as a batch
 analysis but in a continuous fashion. This means that {infer} makes it possible
-to use trained {ml} models against incoming data.
+to use <<ml-trained-models,trained {ml} models>> against incoming data.
 
 For instance, suppose you have an online service and you would like to predict
 whether a customer is likely to churn. You have an index with historical data –
@@ -19,32 +19,16 @@ trained the model on, and get a prediction.
 
 Let's take a closer look at the machinery behind {infer}.
 
-
-[[ml-inference-models]]
-== Trained {ml} models as functions
-
-When you create a {dfanalytics-job} that executes a supervised process, you need
-to train a {ml} model on a training dataset to be able to make predictions on
-data points that the model has never seen. The models that are created by
-{dfanalytics} are stored as {es} documents in internal indices. In other words,
-the characteristics of your trained models are saved and ready to be used as
-functions.
-
-Alternatively, you can use a pre-trained language identification model to
-determine the language of text. {lang-ident-cap} supports 109 languages. For
-more information and configuration details, check the <<ml-lang-ident>> page.
-
-
 [[ml-inference-processor]]
 == {infer-cap} processor
 
 {infer-cap} can be used as a processor specified in an
-{ref}/pipeline.html[ingest pipeline]. It uses a stored {dfanalytics} model to
-infer against the data that is being ingested in the pipeline. The model is used
-on the {ref}/ingest.html[ingest node]. {infer-cap} pre-processes the data by
-using the model and provides a prediction. After the process, the pipeline
-continues executing (if there is any other processor in the pipeline), finally
-the new data together with the results are indexed into the destination index.
+{ref}/pipeline.html[ingest pipeline]. It uses a trained model to infer against
+the data that is being ingested in the pipeline. The model is used on the
+{ref}/ingest.html[ingest node]. {infer-cap} pre-processes the data by using the
+model and provides a prediction. After the process, the pipeline continues
+executing (if there is any other processor in the pipeline), finally the new
+data together with the results are indexed into the destination index.
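The processor workflow described in the changed paragraph can be illustrated with a minimal pipeline definition. This sketch is not part of the commit: the pipeline name, model ID (`churn-model`), and target field are hypothetical, and the authoritative list of options is the {infer} processor reference linked in the docs.

```console
PUT _ingest/pipeline/churn-predictions
{
  "description": "Adds a churn prediction to each ingested document",
  "processors": [
    {
      "inference": {
        "model_id": "churn-model",
        "target_field": "ml.inference.churn",
        "inference_config": {
          "classification": {}
        }
      }
    }
  ]
}
```

Documents indexed through this pipeline (for example with `?pipeline=churn-predictions`) would then carry the model's prediction in the configured target field alongside the original data.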
 
 Check the {ref}/inference-processor.html[{infer} processor] and
 {ref}/ml-df-analytics-apis.html[the {ml} {dfanalytics} API documentation] to
@@ -55,14 +39,14 @@ learn more about the feature.
 == {infer-cap} aggregation
 
 {infer-cap} can also be used as a pipeline aggregation. You can reference a
-pre-trained {dfanalytics} model in the aggregation to infer on the result field
-of the parent bucket aggregation. The {infer} aggregation uses the model on the
-results to provide a prediction. This aggregation enables you to run
-{classification} or {reganalysis} at search time. If you want to perform the
-analysis on a small set of data, this aggregation enables you to generate
-predictions without the need to set up a processor in the ingest pipeline.
+trained model in the aggregation to infer on the result field of the parent
+bucket aggregation. The {infer} aggregation uses the model on the results to
+provide a prediction. This aggregation enables you to run {classification} or
+{reganalysis} at search time. If you want to perform the analysis on a small set
+of data, this aggregation enables you to generate predictions without the need
+to set up a processor in the ingest pipeline.
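The search-time usage described above can be sketched as an {infer} bucket aggregation nested under a parent bucket aggregation. This example is not part of the commit: the index, field, and model names are hypothetical, and the `buckets_path` keys must match the feature names the referenced model was trained on.

```console
GET customers/_search
{
  "size": 0,
  "aggs": {
    "per_customer": {
      "terms": { "field": "customer_id" },
      "aggs": {
        "total_spend": { "sum": { "field": "price" } },
        "churn_prediction": {
          "inference": {
            "model_id": "churn-model",
            "buckets_path": {
              "total_spend": "total_spend"
            }
          }
        }
      }
    }
  }
}
```

Here the model predicts per bucket, using each customer's aggregated spend as a feature, without any ingest pipeline being involved.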
 
 Check the
 {ref}/search-aggregations-pipeline-inference-bucket-aggregation.html[{infer} bucket aggregation]
 and {ref}/ml-df-analytics-apis.html[the {ml} {dfanalytics} API documentation] to
-learn more about the feature.
\ No newline at end of file
+learn more about the feature.
docs/en/stack/ml/df-analytics/ml-trained-models.asciidoc (new file)

Lines changed: 32 additions & 0 deletions

@@ -0,0 +1,32 @@
+[role="xpack"]
+[[ml-trained-models]]
+= Trained models
+
+experimental::[]
+
+When you use a {dfanalytics-job} to perform {classification} or {reganalysis},
+it creates a {ml} model that is trained and tested against a labelled data set.
+In particular, the data set contains the correct values for a field (known as a
+_dependent variable_) that you want to ultimately predict based on its
+relationships to other fields (known as _feature variables_) in your data. When
+you are satisfied with your trained model, you can use it to make predictions
+against new data. For example, you can use it in the preprocessor of an ingest
+pipeline or in a pipeline aggregation within a search query. For more
+information about this process, see <<ml-supervised-workflow>> and
+<<ml-inference>>.
+
+You can also supply trained models that are not created by {dfanalytics-job} but
+adhere to the appropriate https://github.com/elastic/ml-json-schemas[JSON schema].
+If you want to use these trained models in the {stack}, you must store them in
+{es} documents in internal indices by using the
+{ref}/put-inference.html[create trained model API].
+
+In {kib}, you can view and manage your trained models within
+*{ml-app}* > *Data Frame Analytics*:
+
+[role="screenshot"]
+image::images/trained-model-management.png["List of trained models in the {ml-app} app in {kib}"]
+
+Alternatively, you can use the appropriate
+{ref}/ml-df-analytics-apis.html[{ml} APIs].
+

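The "create trained model API" mentioned in the new page can be sketched as follows. This example is not part of the commit: the model ID and the one-split decision tree are hypothetical, and the authoritative body schema is the linked put-inference reference documentation.

```console
PUT _ml/inference/my-minimal-model
{
  "input": {
    "field_names": ["total_spend"]
  },
  "definition": {
    "trained_model": {
      "tree": {
        "feature_names": ["total_spend"],
        "target_type": "regression",
        "tree_structure": [
          { "node_index": 0, "split_feature": 0, "threshold": 100.0,
            "left_child": 1, "right_child": 2 },
          { "node_index": 1, "leaf_value": 0.0 },
          { "node_index": 2, "leaf_value": 1.0 }
        ]
      }
    }
  }
}
```

A model stored this way lands in the internal {es} indices described above and can then be referenced by ID from an {infer} processor or aggregation.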
docs/en/stack/ml/redirects.asciidoc

Lines changed: 5 additions & 0 deletions

@@ -12,3 +12,8 @@ This page has moved. See <<ml-datafeeds>>.
 === Performing population analysis
 
 This page has moved. See <<ml-configuring-populations>>.
+
+[role="exclude",id="ml-inference-models"]
+=== Trained {ml} models as functions
+
+This content has moved. See <<ml-trained-models>>.

0 commit comments
