Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PM-3007] Caching user policies on PolicyService variable #3117

Conversation

r-tome
Copy link
Contributor

@r-tome r-tome commented Jul 18, 2023

Type of change

- [ ] Bug fix
- [ ] New feature development
- [X] Tech debt (refactoring, code cleanup, dependency upgrades, etc)
- [ ] Build/deploy pipeline (DevOps)
- [ ] Other

Objective

Enhance performance and reduce SQL operations by modifying the stored procedure OrganizationUser_ReadByUserIdWithPolicyDetails to retrieve all policies and filter them in memory for each type, thereby minimizing the number of SQL operations required. The filtered policies will be cached and utilized by the scoped service IPolicyService.

Code changes

  • src/Core/Repositories/IOrganizationUserRepository.cs: Removed the parameter PolicyType so that the method returns all policy types
  • src/Core/Services/Implementations/PolicyService.cs: Added a private variable to cache the obtained user policies from the repository
  • src/Infrastructure.Dapper/Repositories/OrganizationUserRepository.cs: Removed the parameter PolicyType so that the method returns all policy types
  • src/Infrastructure.EntityFramework/Repositories/OrganizationUserRepository.cs: Removed the parameter PolicyType so that the method returns all policy types
  • src/Sql/dbo/Stored Procedures/OrganizationUser_ReadByUserIdWithPolicyDetails.sql: Removed the parameter PolicyType so that the method returns all policy types and also replaced the nested query with left joins
  • test/Core.Test/Services/PolicyServiceTests.cs: Updated unit tests
  • test/Infrastructure.EFIntegration.Test/Repositories/OrganizationUserRepositoryTests.cs: Updated unit tests
  • util/Migrator/DbScripts/2023-07-18_00_OrganizationUserReadByUserIdWithPolicyDetails.sql: SQL migration script

Before you submit

  • Please check for formatting errors (dotnet format --verify-no-changes) (required)
  • If making database changes - make sure you also update Entity Framework queries and/or migrations
  • Please add unit tests where it makes sense to do so (encouraged but not required)
  • If this change requires a documentation update - notify the documentation team
  • If this change has particular deployment requirements - notify the DevOps team

@r-tome
Copy link
Contributor Author

r-tome commented Jul 18, 2023

I've placed the cached policies inside PolicyService but maybe it's more suited inside CurrentContext?

@bitwarden-bot
Copy link

bitwarden-bot commented Jul 18, 2023

Logo
Checkmarx One – Scan Summary & Detailsae9a48e1-7521-477a-b6d5-26c4591563e5

New Issues

Severity Issue Source File / Package Checkmarx Insight
MEDIUM CSRF /src/Api/SecretsManager/Controllers/ServiceAccountsController.cs: 60 Attack Vector

@eliykat
Copy link
Member

eliykat commented Jul 18, 2023

I've placed the cached policies inside PolicyService but maybe it's more suited inside CurrentContext?

CurrentContext is already doing too much in my opinion. I think it's much better to have this cached within the domain-specific service, which is exactly what you've done.

I'm curious whether you've compared the query execution plans for the old & new sproc? I'm sure we're going to see some boost from the caching either way, so I'd like to know that we've also improved the sproc itself. If you want any help with this let me know.

@r-tome
Copy link
Contributor Author

r-tome commented Jul 19, 2023

CurrentContext is already doing too much in my opinion. I think it's much better to have this cached within the domain-specific service, which is exactly what you've done.

That is what I thought but I needed validation 😅

I'm curious whether you've compared the query execution plans for the old & new sproc? I'm sure we're going to see some boost from the caching either way, so I'd like to know that we've also improved the sproc itself. If you want any help with this let me know.

I did compare them! Here are both execution plans (old on top):

image

I think its improved by having less nested loops but that my non-expert opinion. Let me know your thoughts.

@r-tome r-tome marked this pull request as ready for review July 19, 2023 10:19
@withinfocus
Copy link
Contributor

How'd you get these query plans by chance? You'll need the actual plans from production.

@r-tome
Copy link
Contributor Author

r-tome commented Jul 19, 2023

How'd you get these query plans by chance? You'll need the actual plans from production.

That is true, who can I ping to get those? These are just from my local dev environment.

@withinfocus
Copy link
Contributor

Heh yeah I was asking the same questions to CloudOps and there wasn't an answer. For your sake I think you need to make an ad hoc request for them to jump in and pull some actual execution plans for you.

@eliykat
Copy link
Member

eliykat commented Jul 20, 2023

@withinfocus I was aware of that limitation, but I thought using local query plans would be the next best thing. But comparing them, they're not even close, so that's good to know.

I guess that means we also won't know of any measurable improvements until the new sproc has been running in prod for a while?

@withinfocus
Copy link
Contributor

In our situation that's pretty much correct. The usual course of action is to snapshot the table or simulate size and fragmentation so that it could be replicated elsewhere, but we are still adopting better database testing practices.

@MGibson1
Copy link
Member

I guess that means we also won't know of any measurable improvements until the new sproc has been running in prod for a while?

It's worth checking qa cloud to see if that matches a little better, but this is probably true.

kspearrin
kspearrin previously approved these changes Jul 31, 2023
Copy link
Member

@kspearrin kspearrin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an acceptable improvement, but there is certainly much more than can be done to optimize the query within the database alone. For example, I see lots of index scanning that have a large cost to the overall execution plan.

…organization-user-read-by-user-id-with-policy-details-sproc

# Conflicts:
#	src/Core/Repositories/IOrganizationUserRepository.cs
@r-tome
Copy link
Contributor Author

r-tome commented Aug 1, 2023

@kspearrin I experimented with adding two new indexes and changing an existing one to include some necessary columns.

CREATE NONCLUSTERED INDEX [IX_ProviderUser_UserIdProviderIdStatus]
        ON [dbo].[ProviderUser]([UserId] ASC, [ProviderId] ASC, [Status] ASC)
        WITH (ONLINE = ON);

CREATE NONCLUSTERED INDEX [IX_ProviderOrganization_ProviderIdOrganizationId]
        ON [dbo].[ProviderOrganization]([ProviderId] ASC, [OrganizationId] ASC);
        WITH (ONLINE = ON);

CREATE NONCLUSTERED INDEX [IX_OrganizationUser_UserIdOrganizationIdStatusEmail]
            ON [dbo].[OrganizationUser]([UserId] ASC, [OrganizationId] ASC, [Status] ASC, [Email] ASC)
            INCLUDE ([AccessAll], [Type], [Permissions])
            WITH (ONLINE = ON);
-- Drop existing index that did not include the [Email] column
DROP INDEX [IX_OrganizationUser_UserIdOrganizationIdStatus] ON [dbo].[OrganizationUser];

Here is the old plan when I ran the query locally:
image

Here is the new plan with the indexes changes:
image

With these changes I managed to reduce the amount of index scans but of course this is only my local environment so comparing the actual numbers is pointless.

Do you think we should try to add the indexes?

@kspearrin
Copy link
Member

@r-tome I think those are good results, however, let's hold off on adding lots of indexes right now until we have the right people available to evaluate our indexing strategy. We can always add them on-demand if this query becomes more of a problem in production.

kspearrin
kspearrin previously approved these changes Aug 1, 2023
@bitwarden-devops-bot bitwarden-devops-bot temporarily deployed to QA Cloud August 3, 2023 14:45 Inactive
WHERE U.[Id] = @UserId AND OU.[Email] = U.[Email] AND OU.[Status] = 0 -- 'Invited' OrgUsers are not linked to a UserId yet, so we have to look up their email
)
END
GO
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ Newline at the end.

OR EXISTS (
SELECT 1
FROM [dbo].[UserView] U
WHERE U.[Id] = @UserId AND OU.[Email] = U.[Email] AND OU.[Status] = 0 -- 'Invited' OrgUsers are not linked to a UserId yet, so we have to look up their email
)
END
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ Newline at the end.

@r-tome r-tome merged commit 78588d0 into master Aug 3, 2023
44 checks passed
@r-tome r-tome deleted the PM-3007-resolve-high-db-resource-usage-by-organization-user-read-by-user-id-with-policy-details-sproc branch August 3, 2023 17:36
eliykat added a commit that referenced this pull request Aug 16, 2023
eliykat added a commit that referenced this pull request Aug 16, 2023
… SQL CPU (#3203)

* Revert "[PM-3007] Caching user policies on PolicyService variable (#3117)"

This reverts commit 78588d0.

* Don't delete old migration script

* Add migration to revert sproc
vgrassia pushed a commit that referenced this pull request Aug 16, 2023
… SQL CPU (#3203)

* Revert "[PM-3007] Caching user policies on PolicyService variable (#3117)"

This reverts commit 78588d0.

* Don't delete old migration script

* Add migration to revert sproc

(cherry picked from commit fc814ff)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants