
[FEATURE] ML Model Access Controls (ml-commons plugin) #815

@dylan-tong-aws

What are you proposing? In a few sentences, describe the feature and its core capabilities.

We are building model access controls for the ml-commons plugin, so that our users can govern who can perform actions on each ML model that is managed by our plugin.

Which users have asked for this feature? (research, proposals, requests or anecdotes. Include links to GitHub Issues, Forums, Stack Overflow, Twitter, Etc.)

We’ve proactively decided to add this feature because we believe security is a job zero priority.

What is the developer experience going to be? Does this have a REST API? If so, please describe the API and any impact it may have on existing APIs. In a brief summary (not a spec), highlight what new REST APIs or changes to REST APIs are planned, as well as any other API, CLI or configuration changes that are planned as part of this feature.

The planned solution will follow the design that the alerting and anomaly detection plugins use to support resource-level access controls. This design revolves around attaching backend roles to users and to model groups, which are collections of versions of a particular model. The design is described in further detail in the user experience section. In short, users will be able to access the models with which they share a common backend role.

Here is the documentation for the anomaly detection plugin, which provides examples of how the security plugin APIs can be used to configure backend roles for detectors. Our plan is to support the same design by allowing users to create backend roles in the same way, using the same APIs. We will add APIs within the ml-commons plugin to create, modify and delete model groups, which will include the ability to associate backend roles with these model groups.

Are there any security considerations? What is the security model of the new APIs? Features should be integrated into the OpenSearch security suite and so if they are not, we should highlight the reasons here.

This feature enhances security.

Are there any breaking changes to the API? If Yes, what is the path to minimizing impact? (example, add new API and deprecate the old one)

No

What is the user experience going to be?

The user experience will be similar to what we provide in other plugins like anomaly detection and alerting. For example, the anomaly detection plugin’s process to configure detector-level access controls is described here.

As an admin, the high-level flow to limit a user’s ability to access specific models and define what actions they can perform on those models is as follows:

  1. [Already Supported] Configure a role that limits the permissions the user has to perform specific actions on models. This involves configuring permissions to invoke the specific model management APIs within the ml-commons plugin below. For instance, one might create a model builder role that allows a user to get model information, search for a model and invoke it, or an MLOps role that provides permissions to invoke all these APIs.

  2. [Already Supported] The admin can then create a backend role and associate roles with it. A natural way to configure a backend role is to have it represent an organizational group. For instance, you might create a backend role to represent a data science team within your organization and associate the model builder role with that backend role.

  3. [New Functionality] If a model group doesn’t already exist to manage a collection of model versions, a user who has the required permissions can create a model group. One or more backend roles can be associated with a model group. By default, the user that creates the model group is designated as the owner, and this user’s backend roles will be attached to the model group. The user has the option to specify a subset of their backend roles instead. This gives the model owner access to the models that they create or import into an OpenSearch cluster. An admin-type user will have the ability to reassign the model owner and change the access controls as needed.
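The default-ownership behaviour in step 3 can be sketched in a few lines of Python. This is only an illustration of the rules as described above, not the plugin's implementation: the `User` and `ModelGroup` classes, the `create_model_group` and `reassign_owner` functions, and the example role names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    # Hypothetical stand-in for a security plugin user.
    name: str
    backend_roles: frozenset[str]
    is_admin: bool = False


@dataclass
class ModelGroup:
    # Hypothetical stand-in for a model group and its access metadata.
    name: str
    owner: str
    backend_roles: set[str] = field(default_factory=set)


def create_model_group(name, creator, backend_roles=None):
    """Create a model group owned by `creator`.

    With no explicit selection, all of the creator's backend roles are
    attached. An explicit selection must be a subset of those roles.
    """
    if backend_roles is None:
        selected = set(creator.backend_roles)
    else:
        selected = set(backend_roles)
        if not selected <= creator.backend_roles:
            raise ValueError("backend roles must be a subset of the creator's roles")
    return ModelGroup(name=name, owner=creator.name, backend_roles=selected)


def reassign_owner(group, actor, new_owner):
    """Only an admin-type user may reassign the owner of a model group."""
    if not actor.is_admin:
        raise PermissionError("only an admin can reassign the model owner")
    group.owner = new_owner.name
```

For example, a user with backend roles `data-science` and `research` who creates a group without arguments would get both roles attached, while passing `{"research"}` would attach only that one.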

Once the above steps are set up, a user will be limited to model API operations on the models with which they share a backend role. Specifically, the expected logic used to govern a user’s access is as follows:

  1. A user has access to all the model versions within a model group if they share one or more backend roles with the model group.

  2. A user is authorized to perform actions on a model group based on the aggregate permissions. For instance, if a user shares two backend roles with a model group where one provides search permissions and the other provides deployment permissions, the user will be able to perform both search and deployment actions on the models.

  3. If the user is directly associated with a role that provides permissions to perform ml-commons model API operations, they will also be authorized to perform those actions. For instance, if a user has permission to delete models through a role that is directly associated with that user, they will be able to perform delete operations on any model group that shares a common backend role with the user. In other words, the user is authorized with the aggregate permissions provided by all the roles associated with common model groups and by the roles they’re directly associated with.

In summary, backend roles are used to determine which model groups a user can perform actions on. The actions a user can perform on these model groups are the aggregate permissions from all the associated roles. This logic is consistent with the design used by other plugins. However, there are limitations, which are outlined in the open questions section.
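The three rules above can be read as a single function. The sketch below follows the description literally and is not the plugin's code: the function name `allowed_actions`, its parameters, and the role and action names in the usage note are hypothetical placeholders.

```python
def allowed_actions(user_backend_roles, user_direct_roles, group_backend_roles,
                    backend_role_to_roles, role_permissions):
    """Return the set of actions a user may perform on one model group.

    backend_role_to_roles: maps a backend role to the roles associated with it.
    role_permissions: maps a role to the actions it grants.
    """
    # Rule 1: access requires at least one shared backend role.
    shared = set(user_backend_roles) & set(group_backend_roles)
    if not shared:
        return set()

    # Rule 3: roles attached directly to the user always count.
    roles = set(user_direct_roles)
    # Rule 2: add the roles reached through each shared backend role.
    for backend_role in shared:
        roles |= set(backend_role_to_roles.get(backend_role, ()))

    # The result is the aggregate of the permissions of all those roles.
    actions = set()
    for role in roles:
        actions |= set(role_permissions.get(role, ()))
    return actions
```

With a `team-a` backend role that grants search, a `team-b` backend role that grants deployment, and a directly attached role that grants delete, a user who shares both backend roles with a model group would end up with all three actions on it, and with none on a group that shares no backend role with them.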

Are there breaking changes to the User Experience? Will this change the existing user experience? Will this be a breaking change from a user flow or user experience perspective?

No

What will it take to execute? Are there any assumptions you may be making that could limit scope or add limitations? Are there performance, cost, or technical constraints that may impact the user experience? Does this feature depend on other feature work? What additional risks are there?

Additional security features come at a cost. Additional access control checks will be required, and their overhead can increase API execution time.

Any remaining open questions? What are known enhancements to this feature? Any enhancements that may be out of scope but that we will want to track long term? List any other open questions that may need to be answered before proceeding with an implementation.

As described, a user has access to a model as long as they share one or more backend roles with the model. We’ve selected this policy because it is consistent with other plugins and simple to support. However, we would like to know the demand to support more advanced policies.

We are aware of certain limitations with the current design. For instance, it’s difficult to decrease a user’s scope of access at scale because it becomes difficult for admins to track a large number of roles and permissions and how they’re associated to models and users.

Secondly, it’s difficult to model the access controls for large organizations. For instance, if I wanted to control access based on teams and job responsibilities, the current design requires the admin to manage approximately (number of teams) × (number of job responsibilities) backend roles. For instance, I would need to model backend roles like TeamA-ModelBuilder and TeamA-MLOps.
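To make the growth concrete, here is a small sketch that enumerates the combined backend roles an admin would have to maintain. The `combined_backend_roles` helper and the team and job names beyond those mentioned above are hypothetical.

```python
from itertools import product


def combined_backend_roles(teams, jobs):
    """One backend role per (team, job responsibility) pair."""
    return [f"{team}-{job}" for team, job in product(teams, jobs)]


teams = ["TeamA", "TeamB", "TeamC"]
jobs = ["ModelBuilder", "MLOps"]
roles = combined_backend_roles(teams, jobs)
# len(roles) is len(teams) * len(jobs), so each new team adds one
# backend role per job responsibility.
```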

If we had the ability to attach backend roles to allow or disallow access, we could model the backend roles like Team A, Team B, ModelBuilder and MLOps, and attach some combination of the backend roles associated with teams and job responsibilities to each model group. This would enable large organizations to define backend roles in a more scalable and intuitive way.

If there are users that have more advanced requirements, we would like to know more about them.
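One possible reading of this idea, sketched below, is that each model group carries a set of allowed backend roles and a set of disallowed ones. This is an assumption made for illustration only; the proposal does not define the policy, and the `has_access` function and its parameters are hypothetical.

```python
def has_access(user_backend_roles, allow, deny=frozenset()):
    """Grant access if the user shares an allowed backend role and no denied one."""
    roles = set(user_backend_roles)
    return bool(roles & set(allow)) and not (roles & set(deny))


# A Team A model group that is open to everyone on the team except MLOps.
allow = {"TeamA"}
deny = {"MLOps"}
builder_ok = has_access({"TeamA", "ModelBuilder"}, allow, deny)
mlops_ok = has_access({"TeamA", "MLOps"}, allow, deny)
other_team_ok = has_access({"TeamB", "ModelBuilder"}, allow, deny)
```

Under this reading, the admin maintains one backend role per team plus one per job responsibility, rather than one per pair.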
