docs/en/stack/ml/df-analytics/flightdata-regression.asciidoc
90 additions & 45 deletions
@@ -106,9 +106,13 @@ image::images/flights-regression-job-1.png["Creating a {dfanalytics-job} in {kib
 [role="screenshot"]
 image::images/flights-regression-job-2.png["Creating a {dfanalytics-job} in {kib}" – continued]
 
+[role="screenshot"]
+image::images/flights-regression-job-3.png["Creating a {dfanalytics-job} in {kib}" – advanced options]
+
 
 .. Choose `kibana_sample_data_flights` as the source index.
 .. Choose `regression` as the job type.
+.. Optionally improve the quality of the analysis by adding a query that removes erroneous data. In this case, we omit flights with a distance of 0 kilometers or less.
 .. Choose `FlightDelayMin` as the dependent variable, which is the field that we
 want to predict with the {reganalysis}.
 .. Add `Cancelled`, `FlightDelay`, and `FlightDelayType` to the list of excluded
@@ -117,16 +121,18 @@ exclude fields that either contain erroneous data or describe the
 `dependent_variable`.
 .. Choose a training percent of `90` which means it randomly selects 90% of the
 source data for training.
-.. Use the default feature importance values.
+.. If you want to experiment with <<ml-feature-importance,feature importance>>,
+specify a value in the advanced configuration options. In this example, we
+choose to return a maximum of 5 feature importance values per document. This
+option affects the speed of the analysis, so by default it is disabled.
 .. Use the default memory limit for the job. If the job requires more than this
 amount of memory, it fails to start. If the available memory on the node is
 limited, this setting makes it possible to prevent job execution.
 .. Add a job ID and optionally a job description.
 .. Add the name of the destination index that will contain the results of the
-analysis. It will contain a copy of the source index data where each document is
-annotated with the results. If the index does not exist, it will be created
-automatically.
-
+analysis. In {kib}, the index name matches the job ID by default. It will
+contain a copy of the source index data where each document is annotated with
+the results. If the index does not exist, it will be created automatically.
 
 .API example
 [%collapsible]
@@ -139,7 +145,7 @@ PUT _ml/data_frame/analytics/model-flight-delays
     "index": [
       "kibana_sample_data_flights"
     ],
-    "query": { <1>
+    "query": {
       "range": {
         "DistanceKilometers": {
           "gt": 0
@@ -148,7 +154,7 @@ PUT _ml/data_frame/analytics/model-flight-delays
     }
   },
   "dest": {
-    "index": "df-flight-delays"
+    "index": "model-flight-delays"
   },
   "analysis": {
     "regression": {
@@ -167,9 +173,6 @@ PUT _ml/data_frame/analytics/model-flight-delays
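For review purposes, the job configuration that the changed steps describe can be sketched as a request body. This is a minimal illustration, not the full example from the changed file. The source index, the `DistanceKilometers` range query, the `model-flight-delays` destination index, the dependent variable, the excluded fields, the training percent of 90, and the limit of 5 feature importance values all come from the diff above. The field names `training_percent`, `num_top_feature_importance_values`, and `analyzed_fields` are assumed from the {es} create {dfanalytics-job} API and do not appear in the visible hunks. The helper `build_regression_job_body` is hypothetical.

```python
import json


def build_regression_job_body(
    source_index="kibana_sample_data_flights",
    dest_index="model-flight-delays",
    training_percent=90,
    top_feature_importance=5,
):
    """Hypothetical helper: assemble the request body described in the steps."""
    return {
        "source": {
            "index": [source_index],
            # Omit flights with a distance of 0 kilometers or less.
            "query": {"range": {"DistanceKilometers": {"gt": 0}}},
        },
        # In Kibana the destination index name matches the job ID by default.
        "dest": {"index": dest_index},
        "analysis": {
            "regression": {
                "dependent_variable": "FlightDelayMin",
                # Assumed API field names (not shown in the visible hunks).
                "training_percent": training_percent,
                "num_top_feature_importance_values": top_feature_importance,
            }
        },
        # Exclude fields that contain erroneous data or describe the
        # dependent variable (assumed field name: analyzed_fields).
        "analyzed_fields": {
            "includes": [],
            "excludes": ["Cancelled", "FlightDelay", "FlightDelayType"],
        },
    }


body = build_regression_job_body()
print("PUT _ml/data_frame/analytics/model-flight-delays")
print(json.dumps(body, indent=2))
```

The printed JSON can be compared against the `.API example` block in the changed file to check that the destination index and the job ID agree after the rename from `df-flight-delays`.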