[DOCS] Add documentation for new Analysis tab in logs app #49165
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,104 @@ | ||
[role="xpack"] | ||
[[xpack-logs-analysis-page]] | ||
== Detecting and inspecting log anomalies | ||
|
||
beta::[] | ||
|
||
If the {ml} {anomaly-detect} features are enabled, you can use the *Analysis* page in the Logs app to automatically detect some kinds of log anomalies. | ||
The analysis automatically highlights periods where the log rate is outside the expected limits and therefore may be anomalous. | ||
This helps you to spot suspicious behavior without significant human intervention. | ||
I really like this bit. I would move it to the top.
Ideally, this would stop users having to manually sample their log data, calculate the rates, and decide whether those rates are "normal". |
||
You can use this information as a basis for further investigations. | ||
This could be various things:
These are just examples; mileage will vary between datasets and anomalies. I also want to clarify that, whilst the trained backing model will have a lower and upper bound for what it considers "normal" and non-anomalous, that doesn't mean anomalous values will always fall outside these bounds. The model could have an upper bound of 50 and a lower bound of 10, and 30 could still, in the right circumstances, be flagged as anomalous if something else about the rate is considered anomalous. |
||
|
||
On the *Analysis* page, you can inspect the anomalies and the log partitions in which they occurred. | ||
This seems like the overall idea of this UI. I would move this to the top. |
||
You can also view the anomalies directly in the Machine Learning app to get a greater understanding of the issues. | ||
How do they get to the Machine Learning app? Is there a link you could include to this page in the Machine Learning docs? |
||
|
||
[role="screenshot"] | ||
image::logs/images/analysis-tab.png[Analysis tab in Logs app in Kibana] | ||
|
||
|
||
[float] | ||
|
||
[[logs-analysis-page-create-ml-job]] | ||
=== Create a machine learning job for logs analysis | ||
Logs anomaly detection is carried out within a {kibana-ref}/xpack-spaces.html[space]. | ||
Within a space, the first time you select *Analysis* from the Logs app, you are prompted to create a machine learning job to carry out the logs analysis. | ||
At this point, they already know what a space is. How about: To enable log analysis and anomaly detection, you must create your own {kibana-ref}/xpack-spaces.html[space]. |
||
|
||
First, you need to choose the time range for the analysis. | ||
These sound like tasks. How about creating a task for this content? For example:
From the screenshot, I don't see Create ML job. Is this on a different UI? |
||
By default, the analysis uses logs from between four weeks ago and the current date, then continues to add new logs to the analysis as they are ingested. You cannot change the time range for the analysis after the machine learning job has been created. | ||
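As a rough illustration of the default described above, the following Python sketch computes a range that starts four weeks before the current time. The function name `default_analysis_range` is hypothetical and is not part of Kibana; it only mirrors the "four weeks ago to now" wording in the text.

```python
from datetime import datetime, timedelta, timezone


def default_analysis_range(now=None):
    """Return (start, end) for a range covering the last four weeks.

    Hypothetical helper: it mirrors the documented default only,
    not how Kibana computes the range internally.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # The start is fixed once the job exists; new logs extend the analysis.
    return now - timedelta(weeks=4), now


start, end = default_analysis_range(datetime(2019, 11, 1, tzinfo=timezone.utc))
print(start.isoformat(), end.isoformat())
```

Because the range cannot be changed after the job is created, it may be worth checking the computed start date before clicking *Create ML job*.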
|
||
Once you have selected the time range, click *Create ML job* to create the machine learning job. | ||
Now you can start detecting anomalies in your logs. | ||
|
||
[float] | ||
[[logs-analysis-page-view-log-entries]] | ||
=== View log entries | ||
|
||
|
||
Once the machine learning job has been created, the *Analysis* page shows: | ||
* the log entries chart | ||
* an overall anomalies chart | ||
* the anomalies in each partition. | ||
|
||
The time range over which the logs are analyzed is fixed at the time range you selected when you created the machine learning job. | ||
Since they have just created the machine learning job, is this piece necessary? Shouldn't they already know this? |
||
But you can use the time filter at the top of the *Analysis* page to restrict the time range for which the results are shown. | ||
|
||
[float] | ||
[[logs-analysis-page-change-time]] | ||
=== Changing the time range | ||
|
||
Use the time filter to select the time range for the results shown in the anomaly charts. | ||
|
||
To quickly select some popular time range options, click the clock dropdown image:logs/images/time-filter-clock.png[]. In this popup you can choose from: | ||
|
||
* *Quick select* to choose a recent time range, and use the back and forward arrows to move through the time ranges | ||
* *Commonly used* to choose a time range from some commonly used options such as *Last 15 minutes*, *Today*, or *Week to date* | ||
* *Refresh every* to specify an auto-refresh rate | ||
* *Stop* to stop auto-refresh (enabled by default for logs anomaly charts) | ||
|
||
NOTE: When you stop auto-refresh from within this dialog, the clock dropdown changes to a calendar image:logs/images/time-filter-calendar.png[]. | ||
|
||
For complete control over the start and end times, click the start time or end time shown in the bar beside the calendar or clock dropdown. In this popup, you can choose from the *Absolute*, *Relative* or *Now* tabs, then specify the required options. | ||
|
||
|
||
[float] | ||
|
||
[[logs-analysis-page-log-entries-chart]] | ||
=== Log entries chart | ||
Since |
||
|
||
[role="screenshot"] | ||
image::logs/images/analysis-tab-log-entries.png[Analysis tab log entries] | ||
|
||
|
||
The log entries chart shows an overall visualization of the log entry rate, partitioned and color-coded according to the value of the {ecs-ref}/ecs-event.html[ECS `event.dataset`] field. | ||
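To make the idea of a partitioned log entry rate concrete, here is a minimal Python sketch that counts entries per time bucket and per `event.dataset` value. The 15-minute bucket size, the flat dictionary shape of each entry, and the name `log_rate_by_dataset` are assumptions for illustration; they do not describe how the Logs app or the machine learning job computes the rate.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone


def log_rate_by_dataset(entries, bucket_seconds=900):
    """Count log entries per time bucket, split by event.dataset.

    Returns {bucket_start_epoch_seconds: Counter({dataset: count})}.
    The bucket size is an assumed value, chosen only for the example.
    """
    rates = defaultdict(Counter)
    for entry in entries:
        ts = entry["@timestamp"].timestamp()
        # Round the timestamp down to the start of its bucket.
        bucket_start = int(ts // bucket_seconds) * bucket_seconds
        dataset = entry.get("event.dataset", "unknown")
        rates[bucket_start][dataset] += 1
    return rates


sample = [
    {"@timestamp": datetime(2019, 11, 1, 0, 0, 5, tzinfo=timezone.utc),
     "event.dataset": "elasticsearch.server"},
    {"@timestamp": datetime(2019, 11, 1, 0, 10, 0, tzinfo=timezone.utc),
     "event.dataset": "elasticsearch.server"},
    {"@timestamp": datetime(2019, 11, 1, 0, 20, 0, tzinfo=timezone.utc),
     "event.dataset": "nginx.access"},
]
for bucket, counts in sorted(log_rate_by_dataset(sample).items()):
    print(bucket, dict(counts))
```

Each inner counter corresponds to one time period in the chart, with one value per partition.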
|
||
You can hover over a time period to see the log rate for each of the partitions for that period. | ||
|
||
|
||
You can click a partition name on the right-hand side to show or hide the values for that partition, or hover over a partition name to highlight just the values for that partition in the chart. | ||
The content in |
||
|
||
[float] | ||
|
||
[[logs-analysis-page-anomalies-chart]] | ||
=== Anomalies chart | ||
|
||
[role="screenshot"] | ||
image::logs/images/analysis-tab-anomalies.png[Analysis tab anomalies] | ||
|
||
|
||
The Anomalies chart shows the areas where anomalies were detected in the overall log entry rate across all log partitions. The underlying rate values are shown in grey, and the anomalous regions are color-coded and superimposed on top. | ||
|
||
Where a time period is flagged as anomalous, it means that the machine learning algorithms detected something unusual about the log rate in that time period. This may be because the log rate was significantly higher than usual, or significantly lower than usual, or some other anomalous behavior was detected. | ||
|
||
The level of anomaly detected in a time period is color-coded from red through orange to yellow and blue, where red indicates a critical anomaly level, and blue is a warning level. | ||
|
||
You can hover over an underlying log rate value to see the average log rate for that time period, or hover over an anomalous region to see the partitions that had anomalies in that time period, and their anomaly scores. Anomaly scores range from 0 (no anomalies) to 100 (critical). | ||
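The relationship between anomaly scores and colors can be sketched as a simple lookup. The text only states that scores run from 0 to 100, that red is critical, and that blue is a warning; the cut-off values (25, 50, 75) and the intermediate labels "major" and "minor" in this Python sketch are assumptions, and the Logs app may use different boundaries.

```python
def severity(score):
    """Map an anomaly score (0-100) to an assumed (label, color) pair.

    The thresholds and the "major"/"minor" labels are illustrative
    assumptions; only the 0-100 range, red = critical, and
    blue = warning come from the documentation text.
    """
    if not 0 <= score <= 100:
        raise ValueError("anomaly score must be between 0 and 100")
    if score == 0:
        return ("none", None)  # 0 means no anomalies
    if score >= 75:
        return ("critical", "red")
    if score >= 50:
        return ("major", "orange")
    if score >= 25:
        return ("minor", "yellow")
    return ("warning", "blue")


for s in (0, 10, 30, 60, 90):
    print(s, severity(s))
```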
|
||
You can also click *Analyze in ML* to open the Anomaly Explorer in Machine Learning and {kibana-ref}/xpack-ml.html[analyze the anomalies in more detail]. | ||
|
||
[float] | ||
[[logs-analysis-tab-partition-anomaly-chart]] | ||
=== Partition anomaly charts | ||
|
||
|
||
[role="screenshot"] | ||
image::logs/images/analysis-tab-partition-anomalies.png[Analysis tab partition anomalies] | ||
|
||
You can also view the anomaly chart for an individual partition. | ||
Below the main anomalies chart, click the dropdown beside a partition name to see the anomaly distribution for only that partition. | ||
In this example, we are viewing the anomaly chart for the `elasticsearch.server` partition. | ||
|
||
You can hover over an underlying log rate value to see the average log rate for that partition in that time period, or hover over an anomalous region to see the anomaly score for that partition in that time period. | ||
|
||
You can also click *Analyze in ML* to open the Anomaly Explorer in Machine Learning and {kibana-ref}/xpack-ml.html[analyze the anomalies in this partition in more detail]. | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -78,6 +78,16 @@ This opens the *Log event document details* fly-out that shows the fields associ | |
To quickly filter the logs stream by one of the field values, in the log event details, click the *View event with filter* icon image:logs/images/logs-view-event-with-filter.png[View event icon] beside the field. | ||
This automatically adds a search filter to the logs stream to filter the entries by this field and value. | ||
|
||
[float] | ||
[[view-log-anomalies]] | ||
=== View log anomalies | ||
|
||
|
||
If the {ml} {anomaly-detect} features are enabled, you can click *Analysis* to <<xpack-logs-analysis-page, use machine learning to detect and inspect anomalies>> in your log data. | ||
|
||
[float] | ||
[[using-logs-other-actions]] | ||
=== Other actions | ||
Is there something more descriptive we can use here besides |
||
|
||
To see other actions related to the event, in the log event details, click *Actions*. | ||
Depending on the event and the features you have installed and configured, you may also be able to: | ||
|
||
|
periods of what? Time?
What makes them expected limits? Are they specified somewhere?
Yep, this is periods of time.
They are expected limits based on the model defined by the machine learning module, and the "learning" it has done on the datasets to date. Therefore these values will always differ based on the individual dataset. A rate of 10 might be anomalous in one dataset, but not anomalous in another. The ML model will adapt itself over time as it learns from more data.
It may be better to use the word "bounds" here over "limit" as that's the ML terminology.