Optimized Privilege Evaluation

# Background

Performance tests indicate that the OpenSearch security layer adds a noticeable overhead to the indexing throughput of an OpenSearch cluster (cf https://github.com/opensearch-project/OpenSearch/issues/2461, https://github.com/opensearch-project/security/issues/3104). The overhead may vary depending on the number of indices, the use of aliases, the number of roles and the size of the user object.

As the latency is dependent on the number of indices, fresh clusters with only few indices perform well; however, users may experience with ongoing time and thus an increasing number of indices a continuous performance degradation. Especially such creeping changes are very undesirable, as they are hard to analyze and debug.

# Analysis

## High computational complexity of privilege evaluation

The privileges evaluation code uses a number of nested loops for determining if a user has privileges for an index (see for example https://github.com/opensearch-project/security/blob/main/src/main/java/org/opensearch/security/securityconf/ConfigModelV7.java#L1103 and https://github.com/opensearch-project/security/blob/main/src/main/java/org/opensearch/security/securityconf/ConfigModelV7.java#L488 ).

In human language, the privilege resolution algorithm is similar to this:

- Loop through all roles
  - Loop through all index patterns of role
    - Loop through all indices on cluster to find aliases and data streams
    - Loop through all aliases and data streams matching index pattern
      - Loop through all indices to find indices which are member of the alias
    - Loop through all indices to find indices which match the index pattern
    - Loop through indices requested by the operation
      - Loop through the indices produced by resolving the index patterns
        
Especially the last loop works on a cross product on two subsets of all existing indices. This is a O(n²) complexity class on the number of indices. On a cluster with many indices and in case very broad index patterns are used, this can require hundreds of thousands iterations.

A number of additional things can be observed here:

- The algorithm follows exactly the structure of the configuration files. This is not necessarily the most efficient approach.
- The algorithm newly resolves for each request the index patterns inside the role definition. This is not really necessary, as the result of these operations will remain constant until the configuration is changed or indices are created or deleted.

## Blow up of thread context data for DLS/FLS privilege evaluation

The privilege evaluation code for DLS/FLS works independently from the normal privilege evaluation code.

The resolution algorithm looks similar to this (compare https://github.com/opensearch-project/security/blob/main/src/main/java/org/opensearch/security/securityconf/ConfigModelV7.java#L355 and https://github.com/opensearch-project/security/blob/main/src/main/java/org/opensearch/security/configuration/DlsFlsValveImpl.java#L151):

- Loop through all roles
  - Loop through all index patterns of role
    - Loop through all indices on cluster and match index name against index pattern

This yields a resolved/expanded view of the DLS/FLS config which maps concrete index names to rules from the roles

It is important to note that this is view is not restricted to the indices requested by the operation. It covers all indices on the cluster. 

This data structure is then put into the thread context and sent in serialized form with each further sub-request of the operation: https://github.com/opensearch-project/security/blob/main/src/main/java/org/opensearch/security/configuration/DlsFlsValveImpl.java#L445 On clusters with many indices, this will be a big data structure which will impact performance both due to increased serialization overhead and increased network data usage.

The actually desired indices are then selected from these data structures on the various shards targeted by the particular operation: https://github.com/opensearch-project/security/blob/main/src/main/java/org/opensearch/security/configuration/DlsFlsValveImpl.java#L353

# Proposals

## Optimized data structures 

In order to achieve a more efficient privilege evaluation, we should have a closer look at the questions we actually have to answer during privilege evaluation. We can then model data structures which efficiently allow answering these questions.

We have the following effective parameters during privilege evaluation:

- Roles of a user: `Set<String>`; can be a very varying relatively low number. There are edge cases where users have hundreds of roles, though.
- Required action privileges: `Set<String>`; in most cases only containing 1 element - the action actually being executed. In rare cases additional required privs are added by special code.
- Requested indices: `Set<String>`; can be a very varying number between 1 and hundreds


An optimized data structure could be a chain of maps looking like this (pseudo-code):

```
Map<Action, Map<ConcreteIndex, Set<RoleName>>>
```

This maps action x index to the names of the roles which provide privileges for the particular combination of action and index.

This would be pre-computed based on the role configuration and the cluster state. The index information from the cluster state is used to pre-resolve the index patterns from the roles to produce concrete index names. Thus, this data-structure must be re-computed whenever a index is deleted or created.

The privilege evaluation algorithm would the look like this:

- Loop through requested actions (usually 1, seldomly 2)
  - Loop through requested indices
    - Loop through roles of user
      - Check whether role name is contained in pre-computed set of roles providing the permission for action and index. For hash table based `Map` implementations, this should be O(1).
      

While loops are still necessary, these are clearly bounded by the requested objects and the roles a user has. 

### Special case for `*`

Additionally, it makes sense to have further a data structure which help to cover the special but often reoccurring case of operations working on the index pattern `*`: 

Given an action, which roles give privileges for the wildcard `*` index pattern: `Map<Action, Set<RoleName>>`

### Privilege evaluation on non-resolved index patterns

There are two cases, in which the advance resolving will not work:

- Dynamic index patterns (containing `${...}` variables)
- Operations on-non existing indices; most prominently: create index operations

For these, role index patterns must be still resolved on the fly during the privilege evaluation phase.

### Aliases

OpenSearch allows to define privileges on aliases. In that case, the member indices of the alias automatically inherit the privileges of the alias. 

The data structures defined above can already support this by resolving aliases to concrete index names. However, this can lead to a potentially high number of indices which need to be checked for each request. 

It would be more efficient to alleviate the need of resolving the alias to single index names by also keeping data structures which operate on an alias level:

```
Map<Action, Map<Alias, Set<RoleName>>>
```

### Data Streams

Data streams work similar to aliases, so similar considerations apply here. 


### Knowledge of available action names

In order to be able to also expand patterns on action names to concrete action names, a knowledge of the names of all available actions is necessary. 

## Distributed DLS/FLS privilege evaluation

As described above, the DLS/FLS implementation currently includes the whole resolved DLS/FLS configuration in the thread context. This has a significant performance impact.

While this approach reduces the amount of privilege evaluation work that has to be done on the final shards, we can achieve the same by also keeping pre-computed data structures for DLS/FLS. This way, it would be completely unnecessary to put the DLS/FLS configuration into the thread context. The code working on the particular shards would be able to directly query the pre-computed data structures without additional overhead.

In the case of DLS/FLS, the action name is not taken into account. Thus, the data structure do not have to consider actions.

A suitable data structure for DLS could look like this (pseudo-code):

```
Map<ConcreteIndex, Map<RoleName, DlsQuery>> 
```

As the code is executed on the shards, there is always only one index name to be taken into account when privilege evaluation takes place. Thus, an evaluation algorithm looks like this:

- Retrieve Map<RoleName, DlsQuery> by requested index name (O(1) for hash table implementations of Map) 
- Loop through roles of user
  - Aggregate DLS queries
  
Additionally, it is necessary to keep track of roles which do not define DLS queries and thus allow unrestricted document access:

```
Map<ConcreteIndex, Set<RoleName>>
```

### Special case for `*`

Like for the normal privilege evaluation, it makes sense to have dedicated data structures for the special case of the `*` index pattern.

### Privilege evaluation on non-resolved index patterns

The advance resolving of index patterns will not work for dynamic index patterns (containing `${...}` variables).

For these, role index patterns must be still resolved on the fly during the privilege evaluation phase.

In contrast to the normal privilege evaluation, operations on non-existing indices do not need to be considered as DLS/FLS only operates on already existing indices.

### Migration strategy

As the change of the implementation changes the thread context, which acts as an interface between cluster nodes, we have to take care for the case of a mixed cluster (i.e. a cluster with nodes on different OpenSearch versions). A strategy/logic is necessary to make this that nodes on different versions can safely interoperate.

# Performance tests

Of course, the result needs to be validated. At the moment, there exists a basic performance test suite for OpenSearch. This however focuses on core operations rather than on the different aspects of the security layer. Thus, further effort is needed to build tests that verify the security performance from the needed perspective. 

We want to build a dedicated performance test suite focused on security aspects on the performance. A RFC covering this topic exists at #3903 .


# Low-level configurability of privilege evaluation

Currently, there are a number of static configuration options that influence the behaviour of the privilege evaluation code:

- `plugins.security.dfm_empty_overrides_all`
- `dynamic.multi_rolespan_enabled`

Both can be re-implemented using the new data structures. However, supporting them of course will make the code more complicated. It might make sense to check whether it makes sense to discontinue some of these, as to a certain extent these seem to only guard legacy behaviour.

# Request for comments

Please feel very welcome to share your thoughts on this concept. Are there missing considerations? Do you have suggestions for further improvements? Do you have questions?

Side-note: A related software project already uses a similar approach: https://git.floragunn.com/search-guard/search-guard-suite-enterprise/-/blob/master/security/src/main/java/com/floragunn/searchguard/authz/RoleBasedActionAuthorization.java (Apache 2 licensed code)



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Optimized Privilege Evaluation #3870

Background

Analysis

High computational complexity of privilege evaluation

Blow up of thread context data for DLS/FLS privilege evaluation

Proposals

Optimized data structures

Special case for `*`

Privilege evaluation on non-resolved index patterns

Aliases

Data Streams

Knowledge of available action names

Distributed DLS/FLS privilege evaluation

Special case for `*`

Privilege evaluation on non-resolved index patterns

Migration strategy

Performance tests

Low-level configurability of privilege evaluation

Request for comments

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Optimized Privilege Evaluation #3870

Description

Background

Analysis

High computational complexity of privilege evaluation

Blow up of thread context data for DLS/FLS privilege evaluation

Proposals

Optimized data structures

Special case for *

Privilege evaluation on non-resolved index patterns

Aliases

Data Streams

Knowledge of available action names

Distributed DLS/FLS privilege evaluation

Special case for *

Privilege evaluation on non-resolved index patterns

Migration strategy

Performance tests

Low-level configurability of privilege evaluation

Request for comments

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Special case for `*`

Special case for `*`