Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: don't run Celery workers in dev mode #1041

Closed
wants to merge 1 commit into from

Conversation

arbrandes
Copy link

@arbrandes arbrandes commented Apr 16, 2024

Tutor's importing * from devstack.py[1] for the development settings, and that means that we aren't using Celery workers at all in dev mode (see [2]). This removes them from the dev compose file, thus saving everyboding a significant chunk of RAM.

[1] https://github.com/overhangio/tutor/blob/master/tutor/templates/apps/openedx/settings/lms/development.py#L3
[2] https://github.com/openedx/edx-platform/blob/master/lms/envs/devstack.py#L35

To do so, we rely on Docker compose profiles[3].

[3] https://docs.docker.com/compose/profiles/

BREAKING CHANGE: the COMPOSE_PROJECT_STARTED hook signature had to be changed to accomodate profile selection.

@arbrandes arbrandes marked this pull request as draft April 16, 2024 19:32
@arbrandes arbrandes changed the title feat: don't run Celery workers in dev mode feat!: don't run Celery workers in dev mode Apr 16, 2024
@arbrandes arbrandes marked this pull request as ready for review April 16, 2024 21:16
@arbrandes arbrandes requested a review from regisb April 16, 2024 21:19
@@ -0,0 +1,2 @@
- 💥[Feature] Use Docker compose profiles to control services. (by @arbrandes) -->
- [Fix] Don't start Celery workers in dev mode, as they're never used. (by @arbrandes) -->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would need some sort of guidelines to understand how the workers can be setup in dev mode, especially now that we are using docker compose profiles.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a simple tutor dev dc start lms-worker would start the service, for instance, but the workers wouldn't be actually usable without modifying settings.py. Is this what you mean we should document?

Of course, the other question would be: why would anybody want to do this in a dev environment?

Copy link
Contributor

@DawoudSheraz DawoudSheraz Apr 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the command and setting changes, yes. I get the point that most devs might not be using this, we can have the doc in a followup. It is not a blocker.

@@ -171,6 +171,8 @@ services:
{%- endfor %}
depends_on:
- lms
profiles:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't familiar with profiles, to be honest. I like the idea of not running workers in dev, but I'd like to avoid making a breaking change to the filter API, and also changes across many python functions.

Instead, could we simply move the lms-worker and cms-worker declarations from docker-compose.yml to docker-compose.prod.yml?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's going to be that simple. There are other places in the code that assume those service definitions exist. For instance, https://github.com/overhangio/tutor/blob/master/tutor/commands/compose.py#L289. Plus, I suspect profiles might be useful down the road, woudn't you say so?

If you're dead set against it, though, there is another way: we can just set the worker services in the dev compose override file to a profile like "donotstart". We get the same thing this PR provides, but without actually supporting different profiles.

Copy link
Contributor

@DawoudSheraz DawoudSheraz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good to me, though I will keep this open until Régis is back.

@kdmccormick
Copy link
Collaborator

Haha, I have a competing proposal (although I let it languish, because I was never able to confirm or deny that pdb would work on these workers):

My thinking was that it would be good to run these tasks asynchronously, because it would expose developers to the same race conditions that inevitably always happen with Celery tasks in production. But I can see the other side of the argument: async tasks are harder to debug and the workers use resources.

Either way, I think we can all agree that we should either use these workers, or turn them off. So I see four paths forward:

  1. Use this PR, which will disable workers in dev. Open edX contributors who work on Celery tasks should take note to manually test their code using tutor local.
  2. Use my PR, which will enable Celery tasks by default for everybody.
  3. Create a new Tutor ASYNC_CELERY_TASKS configuration flag, enabled by default. Disabling it would remove the Celery workers from docker-compose and configure Django to run the tasks in-process.
  4. Same as (3), but implement it as a tutor-celery plugin. Like tutor-mfe, it would be installed and enabled by default.

My vote is (1), but I would support any of them.

@DawoudSheraz
Copy link
Contributor

Haha, I have a competing proposal (although I let it languish, because I was never able to confirm or deny that pdb would work on these workers):

My thinking was that it would be good to run these tasks asynchronously, because it would expose developers to the same race conditions that inevitably always happen with Celery tasks in production. But I can see the other side of the argument: async tasks are harder to debug and the workers use resources.

Either way, I think we can all agree that we should either use these workers, or turn them off. So I see four paths forward:

  1. Use this PR, which will disable workers in dev. Open edX contributors who work on Celery tasks should take note to manually test their code using tutor local.
  2. Use my PR, which will enable Celery tasks by default for everybody.
  3. Create a new Tutor ASYNC_CELERY_TASKS configuration flag, enabled by default. Disabling it would remove the Celery workers from docker-compose and configure Django to run the tasks in-process.
  4. Same as (3), but implement it as a tutor-celery plugin. Like tutor-mfe, it would be installed and enabled by default.

My vote is (1), but I would support any of them.
My take on all points:

  1. This seems like the most viable solution. However, I also 2nd the fact that devs should be able to run the tasks async if they want to. If they can do so using local, good enough. But it will confuse devs as we suggest dev for primarily development purposes. Switching to local results in images being rebuilt and all, which can be infuriating if you are trying to debug.
  2. Already discussed in 1
  3. Régis has argued to avoid adding new configs and instead add filters/plugins if needed.
  4. How do you envision the plugin would work/would be responsible for within the ecosystem? To relate with 1, if we can use the plugin for this purpose, it would be nice. I suspect we would be using the plugin to override IDA's celery settings and also allow celery configs to be added without modification to the core (sort-of similar situation Régis is describing on feat: Celery worker concurrency setting #1010 (comment))

Tutor's importing * from devstack.py[1] for the development settings,
and that means that we aren't using Celery workers at all in dev mode
(see [2]).  This makes it so they don't start in dev mode, thus saving
everyboding a significant chunk of RAM.

[1] https://github.com/overhangio/tutor/blob/master/tutor/templates/apps/openedx/settings/lms/development.py#L3
[2] https://github.com/openedx/edx-platform/blob/master/lms/envs/devstack.py#L35

To do so, we rely on Docker compose profiles[3].

[3] https://docs.docker.com/compose/profiles/

BREAKING CHANGE: the `COMPOSE_PROJECT_STARTED` hook signature had to be changed to
accomodate profile selection.
@DawoudSheraz
Copy link
Contributor

@arbrandes Hi, what's the plan for this PR? Thanks

@arbrandes
Copy link
Author

@DawoudSheraz, good question. I still used it locally all the time. I figure I could implement the alternative I describe above that has a smaller footprint (setting a fake docker profile for the workers in the dev override docker-compose), which addresses Regis' concerns and gets us the same immediate effect of not having workers in dev. (But we lose the option of using profiles for other things in the future.)

I'll issue a separate PR so we can compare.

regisb added a commit that referenced this pull request Oct 1, 2024
In development, edx-platform runs with CELERY_ALWAYS_EAGER=True, which
means that lms-worker and cms-worker never catch celery tasks!

This change was heavily inspired by: #1041
regisb added a commit that referenced this pull request Oct 1, 2024
In development, edx-platform runs with CELERY_ALWAYS_EAGER=True, which
means that lms-worker and cms-worker never catch celery tasks!

This change was heavily inspired by: #1041
@regisb
Copy link
Contributor

regisb commented Oct 1, 2024

Hey Adolfo, your PR completely fell off my radar, I'm sorry about that. A recent issue (#1126) has drawn my attention on celery again. I found a not-so-hackish way of disabling lms-worker and cms-worker in development, without using profiles. You can check out my PR here: #1128

While I do agree that profiles are super interesting, and we should probably use them in the future instead of multiple docker-compose files, I think it's a much bigger change that needs extensive testing. Getting rid of celery workers in dev is an obvious improvement for everyone, and I don't want to wait longer to merge it.

regisb added a commit that referenced this pull request Oct 8, 2024
In development, edx-platform runs with CELERY_ALWAYS_EAGER=True, which
means that lms-worker and cms-worker never catch celery tasks!

This change was heavily inspired by: #1041
@regisb
Copy link
Contributor

regisb commented Oct 8, 2024

Closing in favour of #1128.

@regisb regisb closed this Oct 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Won't fix
Development

Successfully merging this pull request may close these issues.

4 participants