-
Couldn't load subscription status.
- Fork 337
Immutable user object #5212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Immutable user object #5212
Conversation
a1b9952 to
e7e5d56
Compare
Benchmark ResultsWe did some initial benchmarks on the new implementation, here are the results. DisclaimerThe usual benchmarking disclaimer: Of course, these figures can only represent very specific scenarios. Other scenarios can look quite different, as there are so many variables which can be different. Yet, especially the "slow parts" of the benchmark results can give one an impression where real performance issues are. Test setup
K8s resources per node
Tested dimensionsUsersWe compared four different users:
The number of internal attributes influences the size of the user object, both unserialized and serialized. Thus, it influences the resources needed for serialization, deserialization and also the network traffic for inter-node communication. It would have been also possible to test with different numbers of roles. Both variables would influence the serialization performance in similar ways. Users with 1000 attributes might be an extreme case. Still, especially users authenticated by Active Directory/LDAP might get (sometimes unknown to administrators) quite a lot of attributes from the LDAP backend. The user authenticated by admin certificate bypasses most parts of the security plugin code (although not the serialization code) and can be thus seen as a upper threshold for what can be achieved. OperationsWe tested four different operations:
Test resultsA commented summary of the results follows in the upcoming sections. The raw test results can be also reviewed at https://docs.google.com/spreadsheets/d/1HeJ8HFOqF5S4FEDiLXZ4d8eniU9UCp_-FJ3XBrojVME/edit?usp=sharing Indexing throughputBulk size 10In this and the following charts, the dashed lines represent OpenSearch with the standard security plugin. The solid lines represent OpenSearch with the security plugin with the optimized privilege evaluation code. The blue line represents requests authenticated by the super admin certificate, which by-passes most of the security plugin. Thus, the blue line forms a kind of "hull curve", it can be seen as a rough theoretical maximum from a security plugin point of view. We see that users with 10 attributes experience a throughput improvement of about 6% with the new implementation. Users with 100 attributes get an 18% improvement. The extreme case where users have 1000 attributes sees a 46% improvement. While the new implementation shows significant improvements, we still see a declining throughput with a growing number of user attributes. This is expected, as we only optimize the process of serialization/deserialization. The increased network load stays unchanged compared to the old implementation. Bulk size 100Bulk requests with more items can be processed more efficiently. Thus, as we are looking at a docs/s throughput, the improvements are a bit smaller in this case. Users with 10 attributes experience a 2.5% improvement with the new implementation. Users with 100 attributes see an 8% improvement. For 1000 attributes, we get a 32% improvement. Search throughputSingle indexIn this benchmark, we compare the operation throughput for search requests performed on a single index. For users with 10 attributes, we only see a small throughput improvement of 1.2%. Users with 100 attributes get an improvement of 13%. Users with 1000 attributes get an improvement of 83%. 20 indicesThis benchmark tests searches operations that search through 20 indices at once (specified by an index pattern). Compared to the single index case, we see much more pronounced improvements. This is likely due to the increased inter-node communication required by searching on several indices with several shards distributed over a cluster. Users with 10 attributes see a 21% improvement. Users with 100 attributes get a throughput that is 39% higher than the on the old version. Users with 1000 attributes see an improvement of 384%. |
5265344 to
91101dd
Compare
|
@cwperks @kkhatua @DarshitChanpura I'd like to hear an opinion from you regarding the following: In the immutable user object implementation, we need to use the following quite inefficient code to create serialized forms of the user object that are backwards compatible. That's especially the wrapping of collections inside security/src/main/java/org/opensearch/security/user/User.java Lines 361 to 384 in 7a49f7c
Due to this, I am leaning towards again using a custom serialization protocol for the immutable user object. Thus, in perspective, we could retire the wrapping inside |
yes, that was the case and led to large duplication when using custom serialization for that data structure: opensearch-project/OpenSearch#11430
+1 to this if you are seeing noticeable improvements with the new format. |
I believe the performance improvements will be noticeable in graphs, but not as pronounced as the improvements we see already now. I rather look at this from a code quality perspective: Without changing the protocol at one point, we have to carry the baggage of doing this ... ... without a real reason to use |
5c33b8c to
0a0f843
Compare
Signed-off-by: Nils Bandener <nils.bandener@eliatra.com>
Signed-off-by: Nils Bandener <nils.bandener@eliatra.com>
f66e43d to
b30c1ac
Compare
Signed-off-by: Nils Bandener <nils.bandener@eliatra.com>
Signed-off-by: Nils Bandener <nils.bandener@eliatra.com>
src/main/java/org/opensearch/security/privileges/PrivilegesEvaluator.java
Show resolved
Hide resolved
Signed-off-by: Nils Bandener <nils.bandener@eliatra.com>
|
@cwperks @kkhatua @DarshitChanpura @shikharj05 @kumargu This should be now ready, I would be curious for your opinions :) Regarding the serialization format discussed in #5212 (comment) : I did not change anything in this regard so far. If we want to change it, we can do it both in this PR or also in a separate PR. |
src/integrationTest/java/org/opensearch/security/privileges/ActionPrivilegesTest.java
Show resolved
Hide resolved
src/main/java/org/opensearch/security/auth/internal/NoOpAuthorizationBackend.java
Show resolved
Hide resolved
src/main/java/org/opensearch/security/auth/internal/InternalAuthenticationBackend.java
Show resolved
Hide resolved
|
I'll start taking a look at this tomorrow. |
|
Thank you, @nibix, for the excellent work. If all comments have been addressed, I'll go ahead and merge it |
Description
Fixes #5168
This PR makes the User object immutable. This brings a number of significant advantages:
An immutable object means that the binary data created by serialization does not change as well. That means, the serialized data can be computed once per user object and then re-used again and again. The existing authentication cache even allows us to re-use the serialized data across different requests by the same user.
Additionally, a cache similar to the cache maintained by BackendRegistry can be utilized to also cut down the number of de-serialization operations to a single one per user per node. Such a cache would map the serialized binary data to the actual user object.
Immutable objects are inherently thread-safe. That makes any synchronization or locking mechanisms on the user object unnecessary.
Issues Resolved
Fixes #5168
Is this a backport? If so, please add backport PR # and/or commits #, and remove
backport-failedlabel from the original PR.Do these changes introduce new permission(s) to be displayed in the static dropdown on the front-end? If so, please open a draft PR in the security dashboards plugin and link the draft PR here
Testing
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.