18 changes: 18 additions & 0 deletions docs/en/stack/ml/df-analytics/dfa-classification.asciidoc
@@ -27,6 +27,7 @@ can optionally include or exclude fields from the analysis. For more information
about field selection, see the
{ref}/explain-dfanalytics.html[explain data frame analytics API].


[[dfa-classification-supervised]]
== Training the {classification} model

@@ -63,6 +64,7 @@ that is approximately balanced. That is to say, ideally your data set should
have a similar number of data points for each class.
////


[[dfa-classification-algorithm]]
=== {classification-cap} algorithms

@@ -76,6 +78,17 @@ is an iteration of the last one, hence it improves the decision made by the
previous tree.
//end::classification-algorithms[]


[[dfa-classification-deploy]]
=== Deploying the model

The model that you created is stored as {es} documents in internal indices. In
other words, the characteristics of your trained model are saved and ready to be
deployed and used as functions. The <<ml-inference,{infer}>> feature enables you
to use your model in a preprocessor of an ingest pipeline to make predictions
about your data.
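
As a rough illustration of the paragraph above, the following Python sketch
assembles the body of an ingest pipeline that contains a single `inference`
processor. The model ID `my-classification-model`, the target field
`ml.prediction`, and the helper `build_inference_pipeline` are hypothetical
placeholders, and the processor options that are available may differ between
{es} versions, so treat this as a sketch under those assumptions rather than a
reference configuration.

```python
import json


def build_inference_pipeline(model_id, target_field="ml.prediction"):
    """Return an ingest pipeline body with one inference processor.

    Illustrative helper only; model_id and target_field are placeholders.
    """
    return {
        "description": "Classify incoming documents with a trained model",
        "processors": [
            {
                "inference": {
                    # ID of the trained model (hypothetical value from the caller).
                    "model_id": model_id,
                    # Field that receives the prediction in each ingested document.
                    "target_field": target_field,
                    # Ask for the two most probable classes per document.
                    "inference_config": {
                        "classification": {"num_top_classes": 2}
                    },
                }
            }
        ],
    }


pipeline = build_inference_pipeline("my-classification-model")
print(json.dumps(pipeline, indent=2))
```

The printed JSON is what you would send as the request body when you create the
pipeline, for example with `PUT _ingest/pipeline/<pipeline_id>`.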


[[dfa-classification-performance]]
== {classification-cap} performance

@@ -97,12 +110,14 @@ prepare your input data such that it has fewer classes. You can also remove the
fields that are not relevant from the analysis by specifying `excludes` patterns
in the `analyzed_fields` object when configuring the {dfanalytics-job}.
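
To make the `excludes` option more concrete, the following Python sketch builds
a {dfanalytics-job} configuration that leaves some fields out of the analysis.
The index names, the dependent variable `churned`, the excluded patterns, and
the helper `build_classification_job` are all hypothetical; only the overall
shape of the configuration (`source`, `dest`, `analysis`, `analyzed_fields`) is
meant to be illustrative.

```python
import json


def build_classification_job(source_index, dest_index, dependent_variable, excludes):
    """Return a classification job body that excludes the given field patterns.

    Illustrative helper only; every argument value below is a placeholder.
    """
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "analysis": {
            "classification": {"dependent_variable": dependent_variable}
        },
        # Fields that match these patterns are not used by the analysis.
        "analyzed_fields": {"excludes": list(excludes)},
    }


job = build_classification_job(
    source_index="my-source-index",
    dest_index="my-dest-index",
    dependent_variable="churned",
    excludes=["customer_id", "debug_*"],
)
print(json.dumps(job, indent=2))
```

Excluding identifiers and other fields that carry no signal reduces the amount
of data the job has to process, which is the performance benefit described
above.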


[[dfa-classification-interpret]]
== Interpreting {classification} results

The following sections help you understand and interpret the results of a
{classanalysis}.


[[dfa-classification-class-probability]]
=== `class_probability`

@@ -114,6 +129,7 @@ in your destination index. See the
{ml-docs}/flightdata-classification.html#flightdata-classification-results[Viewing {classification} results]
section in the {classification} example.


[[dfa-classification-class-score]]
=== `class_score`

@@ -141,13 +157,15 @@ recall for `class 1`. Instead of this behavior, the default scheme of the
actual `class 0` predicted `class 1` errors, or in other words, a slight
degradation of the overall accuracy.


[[dfa-classification-feature-importance]]
=== {feat-imp-cap}

{feat-imp-cap} provides further information about the results of an analysis and
helps you interpret them in more detail. To learn more about {feat-imp}, see
<<ml-feature-importance,{feat-imp-cap}>>.


[[dfa-classification-evaluation]]
== Measuring model performance

14 changes: 14 additions & 0 deletions docs/en/stack/ml/df-analytics/dfa-regression.asciidoc
@@ -31,6 +31,7 @@ which floor it is, and whether the apartment has a riverside view or not, and so
on. All of these factors can be considered _features_; they are measurable
properties or characteristics of the phenomenon we're studying.


[[dfa-regression-features]]
== {feature-vars-cap}

@@ -51,6 +52,7 @@ algorithm:
apartment either has a riverside view or doesn't have one.
Arrays are not supported.


[[dfa-regression-supervised]]
== Training the {regression} model

@@ -72,6 +74,7 @@ predictions are combined.
{regression-cap} works as a batch analysis. If new data comes into your index,
you must restart the {dfanalytics-job}.


[[dfa-regression-algorithm]]
=== {regression-cap} algorithms

@@ -81,6 +84,17 @@ called extreme gradient boosting (XGBoost), which combines decision trees with
gradient boosting methodologies.
//end::regression-algorithms[]
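
The general idea of combining decision trees with gradient boosting can be
sketched in a few lines of Python. This is a toy illustration under strong
assumptions (one numeric feature, squared-error loss, single-split trees), not
the implementation that {ml} uses; the function names `fit_stump` and `boost`
and the sample data are invented for the example.

```python
def fit_stump(xs, residuals):
    """Fit a one-split tree that minimizes squared error on the residuals."""
    best = None
    # Skip the largest value so that the right-hand side is never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, n_trees=20, learning_rate=0.5):
    """Build an additive model in which each tree corrects the previous ones."""
    base = sum(ys) / len(ys)  # start from the mean of the target
    trees = []
    predictions = [base] * len(xs)
    for _ in range(n_trees):
        # For squared-error loss, the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        predictions = [p + learning_rate * tree(x) for p, x in zip(predictions, xs)]
    return lambda x: base + learning_rate * sum(t(x) for t in trees)


# Hypothetical training data: one feature and a numeric target.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
model = boost(xs, ys)

baseline = sum(ys) / len(ys)
mse_baseline = sum((y - baseline) ** 2 for y in ys) / len(ys)
mse_boosted = sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(ys)
```

Each new tree is fitted to the errors left by the trees before it, so on the
training data `mse_boosted` ends up lower than `mse_baseline`, the error of
always predicting the mean.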


[[dfa-regression-deploy]]
=== Deploying the model

The model that you created is stored as {es} documents in internal indices. In
other words, the characteristics of your trained model are saved and ready to be
deployed and used as functions. The <<ml-inference,{infer}>> feature enables you
to use your model in a preprocessor of an ingest pipeline to make predictions
about your data.


[[dfa-regression-lossfunction]]
=== Loss functions for {regression} analyses
