Description
Elasticsearch version (`bin/elasticsearch --version`): 7.16.2 (running inside Elastic Cloud)
Plugins installed: []
JVM version (`java -version`): n/a (Elastic Cloud)
OS version (`uname -a` if on a Unix-like system): n/a (Elastic Cloud)
Description of the problem including expected versus actual behavior:
I seem to be running into an issue where Field Level Security throws a null exception when operating on frozen indices.
I have a simple ILM policy for my index that moves data from Hot to Frozen after 12 hours. Within that data set, I would like to grant access to all fields except for a few specific ones that I would like to remain internal only.
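For reference, a policy of this shape can be sketched as a request body for the ILM put-policy API. This is only a minimal illustration and not the exact policy from the affected cluster: the policy name `pulse-policy`, the rollover condition, and the `found-snapshots` repository name are all assumptions.

```python
import json


def build_ilm_policy(frozen_after: str = "12h",
                     snapshot_repository: str = "found-snapshots",
                     rollover_max_age: str = "1h") -> dict:
    """Build a hot -> frozen ILM policy body.

    All default values are illustrative assumptions, apart from the
    12 hour delay before the frozen phase, which comes from the report.
    """
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Hypothetical rollover condition; the real one is not stated.
                        "rollover": {"max_age": rollover_max_age}
                    }
                },
                "frozen": {
                    # min_age is measured from rollover for rolled-over indices.
                    "min_age": frozen_after,
                    "actions": {
                        # Mounts the index as a partially cached searchable snapshot,
                        # which is what produces the "partial-" index name prefix.
                        "searchable_snapshot": {
                            "snapshot_repository": snapshot_repository
                        }
                    },
                },
            }
        }
    }


POLICY_NAME = "pulse-policy"  # hypothetical name
ilm_policy = build_ilm_policy()

print(f"PUT _ilm/policy/{POLICY_NAME}")
print(json.dumps(ilm_policy, indent=2))
```

The printed body can be pasted into Kibana Dev Tools or sent with curl, after replacing the assumed values with the real ones.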
If I create a new user and grant them a custom role with field level security (allowing and denying specific fields), that user cannot search anything beyond my hot data tier without getting the following exception back:
"reason": "unsupported_operation_exception: null"
Within the data access role, if I disable `Grant access to specific fields`, the user can see and return results from the frozen tier.
I will note that in my current environment, this role also uses a `Grant read privileges to specific documents` templated query; however, that does not seem to have an impact on this issue. I have tried to produce a working example below that does not involve that privilege.
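For readers who prefer the API over the Kibana UI, the kind of role described above can be sketched as a body for the create-or-update-role API, using its `field_security` `grant`/`except` settings. The role name `pulse-reader`, the index patterns, and the denied field `internal_notes` are hypothetical placeholders, not values from the affected cluster.

```python
import json


def build_fls_role(index_patterns, denied_fields):
    """Build a role body that grants read access to all fields except some.

    The patterns and field names passed in are up to the caller; the
    ones used below are assumptions made for illustration only.
    """
    return {
        "indices": [
            {
                "names": list(index_patterns),
                "privileges": ["read"],
                # Field level security: allow everything, then carve out exceptions.
                "field_security": {
                    "grant": ["*"],
                    "except": list(denied_fields),
                },
            }
        ]
    }


ROLE_NAME = "pulse-reader"  # hypothetical name
role_body = build_fls_role(
    index_patterns=["pulse*", "partial-pulse*"],  # assumed patterns for hot and frozen indices
    denied_fields=["internal_notes"],             # assumed internal-only field
)

print(f"PUT _security/role/{ROLE_NAME}")
print(json.dumps(role_body, indent=2))
```

Removing the `field_security` key from the body should be the API equivalent of disabling `Grant access to specific fields`, which is the configuration under which frozen tier searches succeed.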
Steps to reproduce:
- Create a simple ILM policy that rolls data out of a hot index and into a frozen index
- Index data into your ILM-managed index so that you have both hot data AND frozen data within your cluster. If my ILM index alias is called `pulse`, my underlying indices are `pulse-0001`, `pulse-0002`, etc., and the frozen indices look like `partial-pulse-0001`, `partial-pulse-0002`, etc.
- Create a new role that grants read access to your desired indices, like below (I am using Kibana):
- Create a new user, assign them typical access to a Kibana space, and grant them the data role from step 3
- In a new private browser window, log in as your new user and validate that they have access to both your frozen tier data and hot tier data by viewing the Discover panel with a time range that spans the hot and frozen tiers (24 hours in my case; see below as an example)
- Go back to the role you created as an admin, and check the box `Grant access to specific fields`. Deny a field in your data (see below as an example)
- Back as your new user, refresh the page to see shard exceptions being thrown for all your frozen indices (even though my time range is still set to 24 hours, I get exceptions for my entire frozen tier)
Note in the screenshot above that my data is cut off arbitrarily, right near my frozen tier rollover line from my ILM policy
- Investigate the exception further and you get the following
- Clicking the "Request" tab shows a very normal request, and the "Response" tab looks like below:
- From the command line, I can search the cluster easily if I use a simple count search on a hot tier index
curl https://user:pass@my-cluster.es.us-east-1.aws.found.io:9243/pulse-000252/_count
# returns
{"count":<real number here>,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0}}
But if I try to do an operation on the whole alias that includes frozen shards, I get shard exceptions.
curl https://user:pass@my-cluster.es.us-east-1.aws.found.io:9243/pulse/_count
# returns
{"count":<partial number here>,"_shards":{"total":248,"successful":14,"skipped":0,"failed":234,"failures":[{"shard":0,"index":"partial-pulse-000015","node":"XCRMYhdLR3KHuHxm74vlCg","reason":{"type":"unsupported_operation_exception","reason":"unsupported_operation_exception: null"}},{"shard":0,"index":"partial-pulse-000016","node":"9SNaA5L9TCqZ8l0BA39c1Q","reason":{"type":"unsupported_operation_exception","reason":"unsupported_operation_exception: null"}},{"shard":0,"index":"partial-pulse-000017","node":"XCRMYhdLR3KHuHxm74vlCg","reason":{"type":"unsupported_operation_exception","reason":"unsupported_operation_exception: null"}},.....
- As a sanity check, you can go back to your role configuration, uncheck "Grant access to specific fields", and run that _count command again:
curl https://user:pass@my-cluster.es.us-east-1.aws.found.io:9243/pulse/_count
{"count":<real number here>,"_shards":{"total":248,"successful":248,"skipped":0,"failed":0}}
and it works.
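To confirm that the failures in a `_count` response like the ones above are confined to the frozen tier, the `_shards` section can be summarized with a small script. This is a rough sketch: the helper name `summarize_shard_failures` is made up, the sample only contains the three failures visible in the truncated output above, and it assumes frozen indices can be recognized by the `partial-` prefix.

```python
import json
from collections import Counter


def summarize_shard_failures(response: dict) -> dict:
    """Group the shard failures of a search/_count response by type and tier."""
    shards = response.get("_shards", {})
    failures = shards.get("failures", [])

    # Count failures per exception type.
    by_type = Counter(f["reason"]["type"] for f in failures)

    # Assumption: frozen (partially mounted) indices carry the "partial-" prefix.
    frozen = sorted({f["index"] for f in failures if f["index"].startswith("partial-")})
    other = sorted({f["index"] for f in failures if not f["index"].startswith("partial-")})

    return {
        "total": shards.get("total", 0),
        "failed": shards.get("failed", 0),
        "listed_failures": len(failures),
        "by_type": dict(by_type),
        "frozen_indices": frozen,
        "other_indices": other,
    }


# Sample built from the truncated response in this report (only 3 of 234 failures shown).
reason = {"type": "unsupported_operation_exception",
          "reason": "unsupported_operation_exception: null"}
sample_response = {
    "_shards": {
        "total": 248, "successful": 14, "skipped": 0, "failed": 234,
        "failures": [
            {"shard": 0, "index": "partial-pulse-000015", "node": "XCRMYhdLR3KHuHxm74vlCg", "reason": reason},
            {"shard": 0, "index": "partial-pulse-000016", "node": "9SNaA5L9TCqZ8l0BA39c1Q", "reason": reason},
            {"shard": 0, "index": "partial-pulse-000017", "node": "XCRMYhdLR3KHuHxm74vlCg", "reason": reason},
        ],
    }
}

summary = summarize_shard_failures(sample_response)
print(json.dumps(summary, indent=2))
```

In practice the full JSON returned by curl would be loaded with `json.load` and passed to the same function; an empty `other_indices` list would indicate that only frozen indices are failing.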
I have also tried combing through the built-in roles for Elastic, as well as the built-in index privileges, to see if there was anything related to the frozen tier specifically that causes this behavior, without much luck.
Provide logs (if relevant):
I have tried to comb the logs inside of Elastic Cloud but the UI does not seem to be surfacing this exception where I can find it.