Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: Implement Cloud communication reliability #32856

Merged
merged 20 commits into from
Oct 17, 2024

Conversation

Gustrb
Copy link
Contributor

@Gustrb Gustrb commented Jul 22, 2024

Proposed changes (including videos or screenshots)

When a Workspace Registers with Cloud it gets a set of credentials it uses to generate access tokens and communicate with Cloud. Can read more about how this works in ADR3.

When a workspace owner needs to scale and deploys multiple Rocket.Chat instances to make up their deployment. Each workspace establishes a change stream request to mongo watching settings. Then updates its collection cache locally.

When a workspace goes to use an access token to talk to a cloud service it first evaluates if the token needs renewed. If it does it request a new one. Then updates a setting containing the access token. Then also another setting containing the expire.

The problem occurs when latency enters the picture. If due to either longer mongo query or network condition the other workspaces don't see both of these changes happen and update its cache. An instance can be using an expired token instead of renewing because the setting cache hasn't been updated.

We've seen many cases of this happen and typically restarting workspace can resolve.

This PR moves all credential related data to the 'workspace_credentials' collection, and stop using settings.get to fetch the configs that can be changed on multiple server instances.
This is a new PR that was created from both #32122 & #31986 in those PRs the migration was separated from the code itself, but that would end up being really hard to get merged properly, since the migration required the code, and the code assumed the migration was already ran, I'd guess we could avoid it by doing some weird PR splitting, but I am not really sure we would benefit a lot from it.

Issue(s)

Steps to test or reproduce

Create a new workspace, on registering, it should create a new record inside the rocketchat_workspace_credentials collection, containing an empty date as expirationDate, an empty string as accessToken and an empty string as scopes.
To test everything else, you must use the cloud functions, not sure how to call them from RC, but you could enter the Marketplace screen, go to paid apps, open one of them, and check if there is a new token in the database.
I would love some suggestions on how we can implement API tests for this feature, I am not sure at all if it is feasible (or if we should add later)

Further comments

This implements: https://adr.rocket.chat/0061

CONN-5

@Gustrb Gustrb added this to the 7.0 milestone Jul 22, 2024
Copy link
Contributor

dionisio-bot bot commented Jul 22, 2024

Looks like this PR is ready to merge! 🎉
If you have any trouble, please check the PR guidelines

Copy link

changeset-bot bot commented Jul 22, 2024

🦋 Changeset detected

Latest commit: 15f9f51

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 35 packages
Name Type
@rocket.chat/meteor Major
@rocket.chat/core-typings Major
@rocket.chat/model-typings Major
@rocket.chat/models Major
@rocket.chat/uikit-playground Patch
@rocket.chat/api-client Patch
@rocket.chat/apps Patch
@rocket.chat/core-services Patch
@rocket.chat/cron Patch
@rocket.chat/ddp-client Patch
@rocket.chat/freeswitch Patch
@rocket.chat/fuselage-ui-kit Major
@rocket.chat/gazzodown Major
@rocket.chat/livechat Patch
@rocket.chat/rest-typings Major
@rocket.chat/ui-contexts Major
@rocket.chat/account-service Patch
@rocket.chat/authorization-service Patch
@rocket.chat/ddp-streamer Patch
@rocket.chat/omnichannel-transcript Patch
@rocket.chat/presence-service Patch
@rocket.chat/queue-worker Patch
@rocket.chat/stream-hub-service Patch
@rocket.chat/license Patch
@rocket.chat/omnichannel-services Patch
@rocket.chat/pdf-worker Patch
@rocket.chat/presence Patch
rocketchat-services Patch
@rocket.chat/instance-status Patch
@rocket.chat/network-broker Patch
@rocket.chat/ui-avatar Major
@rocket.chat/ui-client Major
@rocket.chat/ui-video-conf Major
@rocket.chat/ui-voip Major
@rocket.chat/web-ui-registration Major

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

Copy link

codecov bot commented Jul 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.85%. Comparing base (f63d8e2) to head (15f9f51).
Report is 1 commits behind head on develop.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff            @@
##           develop   #32856   +/-   ##
========================================
  Coverage    74.85%   74.85%           
========================================
  Files          470      470           
  Lines        20743    20743           
  Branches      5294     5294           
========================================
  Hits         15528    15528           
  Misses        4595     4595           
  Partials       620      620           
Flag Coverage Δ
unit 74.85% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

@Gustrb Gustrb marked this pull request as ready for review July 22, 2024 16:29
@Gustrb Gustrb requested review from a team as code owners July 22, 2024 16:29
@Gustrb Gustrb requested review from ggazzo and debdutdeb July 23, 2024 11:51
@ggazzo ggazzo requested a review from a team as a code owner July 25, 2024 02:17
@ggazzo ggazzo force-pushed the release-7.0.0 branch 3 times, most recently from ce44882 to d94b784 Compare July 29, 2024 19:47
@ggazzo ggazzo force-pushed the release-7.0.0 branch 2 times, most recently from 30f6f98 to 8c447cf Compare August 16, 2024 13:09
@ggazzo ggazzo requested a review from a team as a code owner September 5, 2024 14:58
@ggazzo ggazzo force-pushed the release-7.0.0 branch 3 times, most recently from c6874ac to fff2293 Compare September 11, 2024 22:14
@ggazzo ggazzo force-pushed the release-7.0.0 branch 4 times, most recently from 9ba3bd2 to fcd8105 Compare September 16, 2024 01:13
@ggazzo ggazzo force-pushed the release-7.0.0 branch 2 times, most recently from e9bbdbd to cb512e1 Compare September 25, 2024 16:15
@Gustrb Gustrb changed the title feat: Implement Cloud communication reliability feat!: Implement Cloud communication reliability Sep 27, 2024
@Gustrb Gustrb added the stat: QA assured Means it has been tested and approved by a company insider label Oct 17, 2024
@dionisio-bot dionisio-bot bot added the stat: ready to merge PR tested and approved waiting for merge label Oct 17, 2024
@Gustrb Gustrb added stat: QA assured Means it has been tested and approved by a company insider and removed stat: ready to merge PR tested and approved waiting for merge stat: QA assured Means it has been tested and approved by a company insider labels Oct 17, 2024
@dionisio-bot dionisio-bot bot added the stat: ready to merge PR tested and approved waiting for merge label Oct 17, 2024
@kodiakhq kodiakhq bot merged commit b338807 into develop Oct 17, 2024
50 checks passed
@kodiakhq kodiakhq bot deleted the feat/implement-adr-0061 branch October 17, 2024 22:56
This was referenced Oct 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stat: QA assured Means it has been tested and approved by a company insider stat: ready to merge PR tested and approved waiting for merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants