Add request parameter to block index auto creation #34737
Pinging @elastic/es-core-infra |
Note: the name of parameter ( |
Relates to #34649: the new custom exception message includes the index name.
@vladimirdolzhenko Would you explain the motivation for proposing this change? I think we have seen enhancement requests for this functionality before and turned them down (I will search for them later).
@jasontedor It has been asked by @bleskes |
@vladimirdolzhenko Okay, can you articulate what the need is? |
Current shippers like Beats bootstrap their connections by default by adding index templates and relying on automatic index creation to create indices based on those templates. This was introduced to address problems where people forgot to run the setup command and started indexing, only to find out later that their data was no good. This default behavior works well and has solved the problem for which it was created, but there are still two issues on that front:
The ability to make sure that a request never creates an index (i.e., I only want this data to be indexed, if possible) will help Beats address these issues when people use ILM (where an index needs to be explicitly created with a write alias). In addition, such a flag may simplify the use of index templates in cases where a single index is used for storage (like Kibana and security). We currently add a template to protect against the case where someone deletes an existing index and an in-flight indexing request then creates a new one without a mapping. This flag would allow the request to fail and fall back to explicit index creation. Last, and this is still very much in the air, there is a chance that configuring the templates/data environment will move away from the edge shippers to some other central point, which would aggravate the problems mentioned above. This is too far out to affect current decisions, but I am putting it on the horizon.
/cc @tsg |
We discussed it briefly in #23685. I have since come to the conclusion that this is the only clean way to avoid the race condition around putting the index template and indexing the first document. It also has the beneficial side effect of reducing the need for templates (in favour of direct index creation). |
Where do we stand on this one? @vladimirdolzhenko @bleskes @jasontedor
This change would also benefit a cleanup in security. Today we try to be smart and make sure the index exists with the correct mappings before issuing a write, but this is not foolproof, so we also have to handle some error cases such as an IndexNotFoundException (INFE). Ideally we would like to assume everything is fine, issue the write, and then handle failures. If the failure is an INFE, we could then create the index with the correct mappings.
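The "issue the write, then handle failures" flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `FakeCluster`, `IndexNotFoundError`, and `write_with_fallback` are hypothetical stand-ins, not Elasticsearch or client APIs, and the `auto_create_index` keyword models the request flag proposed in this PR.

```python
class IndexNotFoundError(Exception):
    """Hypothetical stand-in for Elasticsearch's IndexNotFoundException (INFE)."""


class FakeCluster:
    """Hypothetical in-memory model of a cluster; not a real client API."""

    def __init__(self):
        # index name -> {"mappings": ..., "docs": [...]}
        self.indices = {}

    def create_index(self, name, mappings):
        # Explicit creation with known mappings; keeps an existing index as-is.
        self.indices.setdefault(name, {"mappings": mappings, "docs": []})

    def index_doc(self, name, doc, auto_create_index=True):
        if name not in self.indices:
            if not auto_create_index:
                # The request asked never to create the index implicitly.
                raise IndexNotFoundError(name)
            # Implicit creation: the index ends up without explicit mappings.
            self.indices[name] = {"mappings": None, "docs": []}
        self.indices[name]["docs"].append(doc)


def write_with_fallback(cluster, name, doc, mappings):
    """Assume the index exists; on INFE create it explicitly and retry once."""
    try:
        cluster.index_doc(name, doc, auto_create_index=False)
    except IndexNotFoundError:
        cluster.create_index(name, mappings)
        cluster.index_doc(name, doc, auto_create_index=False)
```

With the flag set, a write against a missing index fails instead of silently creating an unmapped index, so the caller gets a chance to create it with the intended mappings before retrying.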
@bleskes could you please have a look at the code? We can discuss and choose the name of the parameter later.
Thank you for articulating the use-case @bleskes. I am leaning towards this but I have a clarifying question.
I think one thing missing for me in this case is how does the request flag to disable auto-index creation get set? It seems to me that it would require a new parameter on the Elasticsearch output plugin in Logstash that requires the user to manually set this?
This case is clear. If Beats auto-setup is disabled ( |
Logstash has an internal mechanism (called |
We've discussed in another channel and agreed to the name of parameter: |
One can disable all bootstrapping in beats and users do so in favour of running The flag is valid for all events in a bulk request, but we send Actually when sending Beats events directly or indirectly via Logstash, it would make sense to always enable this flag. But again, this might surprise users mixing Beats + other input types. |
@urso thanks for this insight and I fully agree. This can be solved either on the ES side (by making this a per-item parameter of each index request, which is something we really want to avoid in order to reduce the scope of the issue) or on the Logstash side (by making sure requests never mix values of the relevant meta fields). Are you the right person to see what it would mean on the Logstash output side?
In Logstash, mixing data from multiple sources in the same bulk request is a typical optimization we suggest to users: if possible, use a single ES output and rely as much as possible on metadata to place events into different indices and to determine the document_id and document type (in the pre-ES 6.x era).
What we advise users to do when events can't be mixed in the same bulk request is to use multiple ES outputs, but here we want the multiplexing to be done seamlessly.
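To make the "never mix values in one bulk request" option concrete, here is a minimal sketch of partitioning events by the value of a metadata flag before building bulk requests. The event shape and the `metadata` / `auto_create_index` field names are assumptions for illustration; this is not how the Logstash ES output plugin is implemented.

```python
from collections import defaultdict


def partition_by_flag(events, flag_field="auto_create_index"):
    """Group events so each batch carries a single value of the flag.

    Events without the flag are grouped under the key None.
    """
    batches = defaultdict(list)
    for event in events:
        key = event.get("metadata", {}).get(flag_field)
        batches[key].append(event)
    return dict(batches)
```

Each resulting batch could then be sent as its own bulk request. The cost is the one raised above: more, smaller bulk requests than a single shared output would produce.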
@jsvd Thanks for explaining. Understood. I think that's quite a change on the Logstash side, so we explored options on the ES side. The suggestion is to add an |
Thanks @bleskes that's great. Compression for requests is available on the ES output plugin but disabled by default. To support this we can add a new configuration option called
So the |
@jsvd |
We've discussed with @jsvd in another channel the case where a bulk request contains requests for the same index but with different ES checks for and creates missing indices before executing any operation from the bulk request, and bulk requests execute concurrently, so it is reasonable to eliminate that kind of inaccuracy. Updated: it falls back to the default behaviour if there is at least one request with no value for |
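One way to read the rule above is sketched below: indices are resolved before any bulk item runs, and auto-creation of an index is blocked only if every item targeting it sets the flag to false. The function name and the dictionary shape of the action metadata are assumptions; the actual PR code may resolve this differently.

```python
def blocked_indices(actions, flag="auto_create_index"):
    """Return the indices whose auto-creation the bulk request blocks.

    `actions` is a list of action metadata dicts, each with an "_index" key.
    An index falls back to the default (node-level) behaviour as soon as
    one of its items carries no value for the flag.
    """
    values_by_index = {}
    for meta in actions:
        values_by_index.setdefault(meta["_index"], []).append(meta.get(flag))
    return {
        index
        for index, values in values_by_index.items()
        if all(value is False for value in values)
    }
```

Because the decision is made per index before any item executes, the outcome does not depend on the order of items within the request.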
…dex` per in action meta data of bulk request
Pinging @elastic/es-core-features (:Core/Features/Indices APIs) |
Closing this PR, because we think there is a better approach to achieve the same behaviour and because the PR is stale (we should have dealt with this earlier, but it fell through many cracks). We discussed in a different context about overriding the |
The node setting `action.auto_create_index` allows indices to be created implicitly by bulk/reindex/update requests. Some use cases require the index to be created in advance, while it is still handy to keep auto-creation turned on for all other cases. For that purpose it is worth having an option to disable automatic index creation at the request level.

This PR adds the request parameter `auto_create_index` to index/update/delete requests. To disable automatic index creation, it has to be set to `false` when auto-creation is enabled by `action.auto_create_index`.

The field `auto_create_index` is added to the action and meta data of bulk requests.

Note: the parameter `auto_create_index` can never be set to `true`.
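The interaction between the node setting and the request parameter can be sketched as a small decision function. This is a simplified model, not the PR's code: it treats `action.auto_create_index` as a plain boolean, and it assumes that an explicit `true` is rejected, which is only one possible reading of the note above.

```python
def should_auto_create(node_setting_allows, request_param=None):
    """Decide whether a request may implicitly create a missing index.

    node_setting_allows: simplified boolean view of action.auto_create_index.
    request_param: None when the request does not set auto_create_index.
    """
    if request_param is True:
        # Assumption: the parameter can only opt out, never opt in.
        raise ValueError("auto_create_index can only be set to false")
    if request_param is False:
        return False
    return node_setting_allows
```

The request parameter can only narrow what the node setting permits: a request cannot create an index on a node where auto-creation is disabled.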