Skip to content

[ML] add auditor to data frame plugin #40012

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Mar 23, 2019

Conversation

benwtrent
Copy link
Member

This adds a DataFrameAuditor, a index for audit messages, and an audit messages class.

I took the basic functionality found in the ML Auditor, and created two new abstract classes in
org.elasticsearch.xpack.core.common.notifications for consolidating what would be commonalities between any plugin using an Auditor-like system.

I opted to simply copy the Level class found in the ml notifications package. I thought it best do any refactoring for ML in a different PR.

As for what to audit and when, that question is still up in the air, but at least after this PR that functionality will be available.

@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core

Copy link

@hendrikmuhs hendrikmuhs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added some comments

I think we should discuss this offline.

@@ -27,11 +30,16 @@
public static final String INDEX_TEMPLATE_PATTERN = ".data-frame-internal-";
public static final String INDEX_TEMPLATE_NAME = INDEX_TEMPLATE_PATTERN + INDEX_TEMPLATE_VERSION;
public static final String INDEX_NAME = INDEX_TEMPLATE_NAME;
public static final String AUDIT_INDEX = ".data-frame-notifications";

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

although the chance is low, I would version it

Copy link
Contributor

@dimitris-athanasiou dimitris-athanasiou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments. Are we going to implement this for anomaly detection?

import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;
import static org.elasticsearch.xpack.core.ClientHelper.executeAsyncWithOrigin;

public abstract class AbstractAuditor<T extends AbstractAuditMessage> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this needs to be abstact. We could pass executionOrigin and auditIndex in the constructor. And we could simply assume all implementation would simply log-trace success/failure, or at least those 2 could have a default implementation which most implementors would not need override.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dimitris-athanasiou I waffled back and forth while writing this on it being abstract or not. Comes down to inheritance vs injection. I am fine with simply passing the appropriate strings if that is preferred. Especially since nothing really interacts with the AbstractAuditor type by-itself.


protected abstract String getResourceField();

public static final class AuditMessageBuilder<T extends AbstractAuditMessage> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to Builder?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure, since it is already nested in that class.

import java.util.Locale;

public enum Level implements Writeable {
INFO, ACTIVITY, WARNING, ERROR;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where would we use activity instead of info?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's true that we don't use this at all in the current product. In the old product where we did use it it is interesting to see who added it:

f52b8ead288 (Dimitrios Athanasiou 2016-05-05 13:36:31 +0100 101)                 auditor.activity(buildUsageMessage(activeJobs));

In other words, the usage information similar to what the X-Pack usage endpoint returns and is now stored by monitoring used to be stored in the notifications index with type activity.

But assuming I'm correct that we don't use it any more (please double check @benwtrent) now would be a good time to remove it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thing about removing activity - usually this would be a BWC breaking change. But in this case I don't think we ever serialise Level or AuditMessage on the wire, so we can do it. This leads to the question, should we even support serialising Level or AuditMessage on the wire? If you remove wire serialisation from Level and AuditMessage and their tests does the code still compile?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@droberts195 @dimitris-athanasiou I tested on ML by removing serialization and ACTIVITY and there are no calls to serialization for AuditMessage and therefore no calls to Level serialization.

}

AbstractAuditMessage(String resourceId, String message, Level level, String nodeName) {
this.resourceId = resourceId;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add Objects.requireNotNull to those that need it. Probably all?

import java.util.Locale;

public enum Level implements Writeable {
INFO, ACTIVITY, WARNING, ERROR;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thing about removing activity - usually this would be a BWC breaking change. But in this case I don't think we ever serialise Level or AuditMessage on the wire, so we can do it. This leads to the question, should we even support serialising Level or AuditMessage on the wire? If you remove wire serialisation from Level and AuditMessage and their tests does the code still compile?

@dimitris-athanasiou
Copy link
Contributor

@droberts195 I think we do need to serialize it. If say, a master node action issues the request to index a notification, it will have to travel to a data node to be written in the index.

@droberts195
Copy link
Contributor

I think we do need to serialize it. If say, a master node action issues the request to index a notification, it will have to travel to a data node to be written in the index

Yes, but in that scenario it's the source of the index request created here that gets serialised. And that just contains the JSON doc to be indexed. Is there a place (apart from unit tests) where we serialise the higher level AuditMessage or Level objects?

@benwtrent benwtrent added :ml/Transform Transform and removed :ml Machine learning labels Mar 14, 2019
@benwtrent benwtrent changed the title [Data Frame] add auditor to data frame plugin [ML] add auditor to data frame plugin Mar 14, 2019
@benwtrent
Copy link
Member Author

@jasontedor ping This PR adds some commonly useful code to xpack.common

@benwtrent
Copy link
Member Author

run elasticsearch-ci/bwc

new RoleDescriptor.IndicesPrivileges[]{
RoleDescriptor.IndicesPrivileges.builder()
.indices(".data-frame-notifications*")
.privileges("view_index_metadata", "read").build()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please open a separate PR on the stack docs similar to elastic/stack-docs#246 that updates the docs for the definition of these two roles. It causes problems for people who create their own custom roles if we don't accurately state which privileges our built in roles provide.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@droberts195, I definitely will if this PR gets approved and we are going to use the Auditor.

@benwtrent
Copy link
Member Author

run elasticsearch-ci/2

Copy link
Contributor

@dimitris-athanasiou dimitris-athanasiou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link

@hendrikmuhs hendrikmuhs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@benwtrent
Copy link
Member Author

run elasticsearch-ci/2

@benwtrent
Copy link
Member Author

run elasticsearch-ci/2

@benwtrent benwtrent merged commit 3b6d42d into elastic:master Mar 23, 2019
@benwtrent benwtrent deleted the feature/data-frame-add-auditor branch March 23, 2019 19:08
benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Mar 23, 2019
* [Data Frame] add auditor

* Adjusting Level, Auditor, and message to address pr comments

* Addressing PR comments
benwtrent added a commit that referenced this pull request Mar 23, 2019
* [Data Frame] add auditor

* Adjusting Level, Auditor, and message to address pr comments

* Addressing PR comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants