Skip to content

Conversation

@mariajgrimaldi
Copy link
Member

@mariajgrimaldi mariajgrimaldi commented Jun 25, 2025

Description

Add a follow-up ADR with the decisions regarding how user groups are updated throughout their own lifecycle.

This ADR is result from team conversations during a Spike phase, which produced several architectural documents. You can find them here: https://openedx.atlassian.net/wiki/spaces/OEPM/pages/4901404678/User+Groups. This specific ADR is a result of the discussions had in these two documents:

  1. Architectural Design Questions: this was how we started the spike session, laying down all questions about this new unified model.
  2. The result of those discussions were documented here: https://openedx.atlassian.net/wiki/spaces/OEPM/pages/4976115715/User+Group+Consistency+and+Refresh+Framework

For detailed implementation information, see the entire technical approach here: https://openedx.atlassian.net/wiki/spaces/OEPM/pages/4917657604/Unified+User+Group+Model+Technical+Approach

Here is also a proof of concept (POC) designed to test these decisions, along with the other ADRs: #7

⚠️ I use copilot and cursor with Claude Sonnet 4 for writing and structure improvements.

Merge checklist:
Check off if complete or not applicable:

  • Version bumped
  • Changelog record added
  • Documentation updated (not only docstrings)
  • Fixup commits are squashed away
  • Unit tests added/updated
  • Manual testing instructions provided
  • Noted any: Concerns, dependencies, migration issues, deadlines, tickets

@openedx-webhooks openedx-webhooks added the open-source-contribution PR author is not from Axim or 2U label Jun 25, 2025
@openedx-webhooks
Copy link

openedx-webhooks commented Jun 25, 2025

Thanks for the pull request, @mariajgrimaldi!

This repository is currently maintained by @openedx/committers-user-groups.

Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review.

🔘 Get product approval

If you haven't already, check this list to see if your contribution needs to go through the product review process.

  • If it does, you'll need to submit a product proposal for your contribution, and have it reviewed by the Product Working Group.
    • This process (including the steps you'll need to take) is documented here.
  • If it doesn't, simply proceed with the next step.
🔘 Provide context

To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:

  • Dependencies

    This PR must be merged before / after / at the same time as ...

  • Blockers

    This PR is waiting for OEP-1234 to be accepted.

  • Timeline information

    This PR must be merged by XX date because ...

  • Partner information

    This is for a course on edx.org.

  • Supporting documentation
  • Relevant Open edX discussion forum threads
🔘 Get a green build

If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.

🔘 Update the status of your PR

Your PR is currently marked as a draft. After completing the steps above, update its status by clicking "Ready for Review", or removing "WIP" from the title, as appropriate.


Where can I find more information?

If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:

When can I expect my changes to be merged?

Our goal is to get community contributions seen and reviewed as efficiently as possible.

However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:

  • The size and impact of the changes that it introduces
  • The need for product review
  • Maintenance status of the parent repository

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.

@openedx-webhooks openedx-webhooks added the core contributor PR author is a Core Contributor (who may or may not have write access to this repo). label Jun 25, 2025
@github-project-automation github-project-automation bot moved this to Needs Triage in Contributions Jun 25, 2025
@mariajgrimaldi mariajgrimaldi force-pushed the MJG/runtime-architecture-for-evaluation-and-extension-adr branch from 4d2bd76 to aeda019 Compare June 27, 2025 10:16
@mariajgrimaldi mariajgrimaldi force-pushed the MJG/refreshment-and-consistency-adr branch from 66e51dc to d0030ec Compare June 27, 2025 10:40
@mariajgrimaldi mariajgrimaldi force-pushed the MJG/runtime-architecture-for-evaluation-and-extension-adr branch from 8ca6e86 to 080c904 Compare June 27, 2025 12:50
@mariajgrimaldi mariajgrimaldi force-pushed the MJG/refreshment-and-consistency-adr branch from d0030ec to ef50eb8 Compare June 27, 2025 12:50
@mariajgrimaldi mariajgrimaldi force-pushed the MJG/runtime-architecture-for-evaluation-and-extension-adr branch 2 times, most recently from f8775da to 4abee11 Compare June 27, 2025 16:53
@mariajgrimaldi mariajgrimaldi force-pushed the MJG/refreshment-and-consistency-adr branch from ef50eb8 to 5d92f6a Compare June 27, 2025 16:54
Define Exclusivity Domains Through Update Framework
---------------------------------------------------

* **Automatic Exclusivity Domains**: When the criteria of group G1 and group G2 are mutually exclusive (C1, ..., Cn ∩ C'1, ..., C'n = ∅), these groups automatically form a **mutual exclusivity domain** that is managed by the event-based update framework.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a need for a mechanism to enforce this? If all criteria from all relevant groups are evaluated within the same transaction this would seem to be handled automatically. I'm not sure if this section and the following one are just documenting this property of the system or are suggesting the system will try to add additional checks on top of this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mutual exclusivity domains apply to criteria that are naturally mutually exclusive. For example, the IsStaff criterion, a user is either staff or not, but never both at the same time. The reason a user doesn't end up in both groups is because we have a mechanism that evaluates all criteria of the same type at once, in a single transaction.

I don't think we need an additional mechanism beyond that, but I might be missing some edge cases. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that holds up in the ORM case, but I'm not sure about what happens with race conditions once we add in external systems whose data may not update in real time. I guess they might, by necessity, force the system out of a mutual exclusivity domain unless we had another mechanism to enforce it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I'll add this as a note under this section. Thank you!

* **Hybrid Approach**: The combination of Group Collections + refresh & consistency framework guarantees that a user is never in two groups that are mutually exclusive by nature (contradictory), whether the exclusivity is:

* **Natural/Automatic**: Derived from mutually exclusive criteria (handled by update framework)
* **Administrative/Manual**: Defined by course staff or admin users (handled by Group Collections)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the only case where we might need collections, and I'm not sure it's a requirement. If I'm understanding it correctly, this would cover cases like:

  1. Course admin uploads a list of users who are learning hybrid online/in-person and wants that group to be exclusive of student who have taken an online-only entrance exam

  2. Site admin uploads a list of teaching assistants for one group, but doesn't want it to include people who are course admins on any course

I think this is problematic in a couple of ways:

  • In scenario 1 if a student accidentally takes the exam before being put into the manually uploaded sheet they are forever forbidden from joining the correct group
  • In scenario 2 if a TA becomes a course admin after the sheet is uploaded they will be forbidden from joining the "course admin" group with no way to resolve the issue aside from removing the user from the manual list, having them be re-added as a course admin. If user group errors like this bubble up to the upstream event it could cause the user to not be able to be promoted to a course admin at all, I think?

@crathbun428 can you think of a requirement where we would need this case that's less contrived than what I said above?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I think that Event-Based Exclusivity Management should be enough to handle the mutual exclusivity examples we had originally talked about - https://openedx.atlassian.net/wiki/spaces/OEPM/pages/5004591109/Mutual+Exclusivity+in+User+Groups#Examples-of-Mutually-Exclusive-Groups

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the only instance i could think of is if you wanted a group that excluded another group but didn't specify the requirements. like, group1 = 'on-campus users who are not in the at-risk user group (group2)' so instead of specifying all of the at-risk conditions in group1, you would just say group1 and group2 are mutually exclusive

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be useful to list the use cases we want to address with explicitly defined mutually exclusive groups. I initially thought of these examples, but I'm definitely missing others like the ones you shared.

What's straightforward about my example is that the user can only be in one group or the other, so it's obvious where they belong. But in cases like: group1 = on-campus users who are not in the at-risk user group (group2), if a user ends up in both groups and they're mutually exclusive, would that mean the user is removed from both? In this kind of scenario, it might be simpler to just use criteria intersections rather than group collections.

I had initially only considered these types of exclusivity:

  1. Negation of a criterion (C ∩ ¬C = ∅)

    • The group collection comes from the criterion itself.
    • Example: "Final Exam Result" as the collection, with groups like Passed Final Exam and Did Not Pass Final Exam.
  2. Unique criteria

    • The group collection is based on a property that is inherently unique per user.
    • Example: "Country of Residence" (Spain, France, Germany) or "Language Proficiency" (Beginner, Intermediate, Advanced).
  3. Arbitrary or random assignment

    • The group collection is created for controlled assignment.
    • Example: "A/B Test Groups" (Experiment A, Experiment B), where each user is randomly assigned to one group.

Can you think of others?

For your examples (hybrid learners vs. exam takers, or TAs vs. course admins), they fit best into the negation/intersection case. Event-based exclusivity should be enough to handle those without needing explicit collections.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mariajgrimaldi - why do we need a collection for an automatic exclusivity? for your example using enrollment mode, if a user changes from 'audit' to 'honor' then the code should automatically move them from one group to another according to the group definitions, i don't see why we would need a collection for these groups?

i think Ty was specifically talking about manually defined exclusivity groups and if this would work/is necessary. it seems like the only case that a manually defined collection would be needed is the A/B test group but maybe that could be done with a csv upload?


* **External data**: Account type groups ("Free Tier", "Premium", "Enterprise") updated from external billing system daily - all updated together in the same batch operation
* **Missing events**: User skill level groups ("Beginner", "Intermediate", "Advanced") where skill assessment data exists but events aren't implemented yet - updated together via scheduled refresh
* **Performance constraints**: Heavy analytics-based groups that are too expensive to update in real-time - updated together during off-peak hours
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think based on what was said above, these would be triggered by any of the other refresh cadences as well since triggering any criteria in a group triggers an evaluation of all criteria for that group?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True! I'll drop this line


To give operators flexibility in managing the refresh framework, we will:

* **Group Freezing**: Allow freezing updates for a group (stop all refreshes temporarily), useful for operational debugging or data stability. Frozen groups will not be visible to the orchestrator until unfrozen.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where would these overrides be stored? I don't think there is currently a place for them on the group model.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this the same as making a group 'inactive'?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be easier (?) to allow you to make a point-in-time copy of a group (so say you have dynamic-group-a that refreshes daily, and I could make a copy of that group, dynamic-group-a-{timestamp} that is the copy of that group at a specific moment, but that would allow the dynamic group to still exist)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this the same as making a group 'inactive'?

Yes! I was referring to disabling a group, although this would have different consequences, not only stopping updates.

Would it be easier (?) to allow you to make a point-in-time copy of a group (so say you have dynamic-group-a that refreshes daily, and I could make a copy of that group, dynamic-group-a-{timestamp} that is the copy of that group at a specific moment, but that would allow the dynamic group to still exist)

In this case, where a group is misbehaving in some way, if we allow the group to exist, it could negatively impact the student experience if the group was used for content gating, for example.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! I was referring to disabling a group, although this would have different consequences, not only stopping updates.

What other consequences are there?


* **Group Freezing**: Allow freezing updates for a group (stop all refreshes temporarily), useful for operational debugging or data stability. Frozen groups will not be visible to the orchestrator until unfrozen.

* **Frequency Overrides**: Allow operational overrides of refresh frequencies for individual groups or criteria when needed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are allowing overrides at the criteria level we would need a place to store this as well, but I'm not sure if that complicates things too much. Since we've said that if any other criteria is being triggered they should also be triggered as well I'm not sure if this is possible, or if we're saying we would just ignore that requirement in this case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to drop this since it's too much of a requirement and we don't have a specific use case for this at the moment.

@mphilbrick211 mphilbrick211 moved this from Needs Triage to In Eng Review in Contributions Jul 15, 2025

* **Frequency Overrides**: Allow operational overrides of refresh frequencies for individual groups or criteria when needed.

* **Method Restrictions**: Support restricting groups to a single update method (event-only, scheduled-only, or manual-only) when operationally required.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this saying you could force an event-driven criterion to be refreshed on a schedule? wouldnt that involve changing the whole backend of the criterion?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each criterion should support all three update strategies if possible (manual updates don’t make much sense in most cases). However, we can limit a group to update only on specific events or on a scheduled basis. For example, if a group is very large and gets updated too often, we can switch it to scheduled-only updates to avoid putting too much load on the system.

Now this would mean we should enable update strategies overrides at some point, which makes things more complicated.

I think for now we should go with only the update strategy configured in the criteria, nothing else. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes if we only use the update strategy configured in the criteria then that makes sense

* Trigger re-evaluation when any criterion's update frequency threshold is reached (scheduled update). Example: If C1 is event-based and C2 is cached daily, the group is refreshed:

* Immediately on C1 events.
* On scheduled daily refresh for C2 (unless already refreshed by C1 events).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(unless already refreshed by C1 events).

Would this be more difficult to implement than just doing a scheduled refresh regardless of C1 events? If I know that C2 events are processed at 1800 each night so I schedule a mailing after 1800 expecting the freshest data, but a C1 event happened at 1802 the day before, would there actually not be a refresh of the criterion as I had expected?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmmm totally. You're right about this creating an inconsistency regarding fo what the user actually expects. I'll remove the annotation and replace it with regardless of C1 events 1ea8c26

* Immediately on C1 events.
* On scheduled daily refresh for C2 (unless already refreshed by C1 events).

* Set refresh frequency per criterion type based on data volatility and system performance requirements, as outlined in the long-term requirements.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be tunable by site operators? (not sure if necessary just curious)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


To give operators flexibility in managing the refresh framework, we will:

* **Group Freezing**: Allow freezing updates for a group (stop all refreshes temporarily), useful for operational debugging or data stability. Frozen groups will not be visible to the orchestrator until unfrozen.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be easier (?) to allow you to make a point-in-time copy of a group (so say you have dynamic-group-a that refreshes daily, and I could make a copy of that group, dynamic-group-a-{timestamp} that is the copy of that group at a specific moment, but that would allow the dynamic group to still exist)

@mariajgrimaldi mariajgrimaldi force-pushed the MJG/runtime-architecture-for-evaluation-and-extension-adr branch from 4abee11 to c29e7a8 Compare July 24, 2025 13:01
@mariajgrimaldi mariajgrimaldi force-pushed the MJG/refreshment-and-consistency-adr branch from 5d92f6a to e6b794a Compare July 24, 2025 13:59
@mariajgrimaldi mariajgrimaldi requested review from bmtcril and sarina July 30, 2025 11:48
@mariajgrimaldi mariajgrimaldi force-pushed the MJG/refreshment-and-consistency-adr branch from 0b1ca0b to 3e4aa5a Compare July 30, 2025 12:09
@mphilbrick211
Copy link

Friendly ping on this @saraburns1!

@sarina
Copy link

sarina commented Sep 16, 2025

@mariajgrimaldi can you put the FC number in the title of FC prs (this and the other User Groups PRs) to help Michelle triage appropriately?

Also I think we're holding on user groups so maybe these should be converted to draft?

@sarina sarina added the FC Relates to an Axim Funded Contribution project label Sep 16, 2025
@bmtcril bmtcril marked this pull request as draft September 22, 2025 14:15
@mphilbrick211 mphilbrick211 moved this from In Eng Review to Waiting on Author in Contributions Nov 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core contributor PR author is a Core Contributor (who may or may not have write access to this repo). FC Relates to an Axim Funded Contribution project open-source-contribution PR author is not from Axim or 2U

Projects

Status: Waiting on Author

Development

Successfully merging this pull request may close these issues.

7 participants