4.1.0rc1 celery issue - Received unregistered task of type 'reports.scheduler'. #29708

Closed
3 tasks done
padbk opened this issue Jul 26, 2024 · 5 comments · Fixed by #29862
Labels
alert-reports Namespace | Anything related to the Alert & Reports feature

Comments

@padbk
Contributor

padbk commented Jul 26, 2024

Bug description

Getting the following every minute on the worker node:

[2024-07-26 09:46:17,568: ERROR/MainProcess] Received unregistered task of type 'reports.scheduler'.
The message has been ignored and discarded.

Did you remember to import the module containing this task?
Or maybe you're using relative imports?

Please see
https://docs.celeryq.dev/en/latest/internals/protocol.html
for more information.

The full contents of the message body was:
b'[[], {}, {"callbacks": null, "errbacks": null, "chain": null, "chord": null}]' (77b)

The full contents of the message headers:
{'lang': 'py', 'task': 'reports.scheduler', 'id': '2ef01d9c-99ce-47e5-92b5-aeccc773aa66', 'shadow': None, 'eta': None, 'expires': None, 'group': None, 'group_index': None, 'retries': 0, 'timelimit': [None, None], 'root_id': '2ef01d9c-99ce-47e5-92b5-aeccc773aa66', 'parent_id': None, 'argsrepr': '()', 'kwargsrepr': '{}', 'origin': 'gen41@superset-celerybeat-xxxxxx', 'ignore_result': False, 'replaced_task_nesting': 0, 'stamped_headers': None, 'stamps': {}}

The delivery info for this task is:
{'exchange': '', 'routing_key': 'celery'}
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/celery/worker/consumer/consumer.py", line 659, in on_task_received
    strategy = strategies[type_]
KeyError: 'reports.scheduler'

And no reports run at all.

I don't have this issue in 4.0.2 with the same config.
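
For readers unfamiliar with the traceback, here is a minimal sketch of the kind of lookup it shows. This is a simplified model, not Celery's actual code: the `register` decorator and the returned strings are hypothetical, and only the names `strategies`, `type_`, and `on_task_received` are taken from the traceback.

```python
# Simplified model of the lookup in the traceback; not Celery's real code.
strategies = {}  # task name -> handler, filled only for tasks the worker imported


def register(name):
    """Hypothetical decorator standing in for task registration."""
    def decorator(func):
        strategies[name] = func
        return func
    return decorator


@register("reports.prune_log")
def prune_log():
    return "pruned"


def on_task_received(headers):
    """Dispatch on the 'task' header; names missing from the dict are discarded."""
    type_ = headers["task"]
    try:
        strategy = strategies[type_]
    except KeyError:
        return f"Received unregistered task of type {type_!r}."
    return strategy()
```

In this model, a message whose `task` header is `reports.scheduler` is rejected because nothing ever registered that name, while `reports.prune_log` is dispatched normally.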

How to reproduce the bug

Installed 4.1.0rc1-py310 on k8s

class CeleryConfig:
    broker_url = 'rediss://%s:%s/%s?ssl_cert_reqs=CERT_NONE' % (REDIS_HOST, REDIS_PORT, REDIS_CELERY_DB)
    imports = ('superset.sql_lab', "superset.tasks", "superset.tasks.thumbnails", )
    result_backend = 'rediss://%s:%s/%s?ssl_cert_reqs=CERT_NONE' % (REDIS_HOST, REDIS_PORT, REDIS_CELERY_DB)
    worker_log_level = "DEBUG"
    worker_prefetch_multiplier = 4
    task_acks_late = True
    task_annotations = {
        "sql_lab.get_sql_results": {"rate_limit": "100/s"},
        "email_reports.send": {
            "rate_limit": "1/s",
            "time_limit": int(timedelta(seconds=600).total_seconds()),
            "soft_time_limit": int(timedelta(seconds=600).total_seconds()),
            "ignore_result": True,
        },
    }
    beat_schedule = {
        "reports.scheduler": {
            "task": "reports.scheduler",
            "schedule": 59.95,
        },
        "reports.prune_log": {
            "task": "reports.prune_log",
            "schedule": crontab(minute=0, hour=0),
        },
        'cache-warmup-hourly': {
            'task': 'cache-warmup',
            'schedule': crontab(minute=26, hour='*'),  # @hourly
            'kwargs': {
                'strategy_name': 'top_n_dashboards',
                'top_n': 20,
                'since': '7 days ago',
            },
        },
    }

CELERY_CONFIG = CeleryConfig

Screenshots/recordings

No response

Superset version

master / latest-dev

Python version

3.10

Node version

Not applicable

Browser

Not applicable

Additional context

No response

Checklist

  • I have searched Superset docs and Slack and didn't find a solution to my problem.
  • I have searched the GitHub issue tracker and didn't find a similar bug report.
  • I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
@dosubot dosubot bot added the alert-reports label Jul 26, 2024

dosubot bot commented Jul 26, 2024

Hey there, @padbk! 👋 I'm here to assist with any bugs, questions, or contributions you have regarding the repository. Let's tackle this together!

The "Received unregistered task of type 'reports.scheduler'" error that appears every minute on your worker node after installing Superset 4.1.0rc1 on Kubernetes means the worker never imported the module that defines that task, so the task name is missing from its registry. The most likely cause is that the module is not listed in the imports setting of your Celery configuration.

Here is an example of how you can configure your Celery settings to include the 'reports.scheduler' task:

configOverrides:
  celery_conf: |
    from celery.schedules import crontab
    class CeleryConfig:
      broker_url = f"redis://{env('REDIS_HOST')}:{env('REDIS_PORT')}/0"
      imports = (
          "superset.sql_lab",
          "superset.tasks.cache",
          "superset.tasks.scheduler",  # Ensure this line is included
      )
      result_backend = f"redis://{env('REDIS_HOST')}:{env('REDIS_PORT')}/0"
      task_annotations = {
          "sql_lab.get_sql_results": {
              "rate_limit": "100/s",
          },
      }
      beat_schedule = {
          "reports.scheduler": {
              "task": "reports.scheduler",
              "schedule": crontab(minute="*", hour="*"),
          },
          "reports.prune_log": {
              "task": "reports.prune_log",
              'schedule': crontab(minute=0, hour=0),
          },
          'cache-warmup-hourly': {
              "task": "cache-warmup",
              "schedule": crontab(minute="*/30", hour="*"),
              "kwargs": {
                  "strategy_name": "top_n_dashboards",
                  "top_n": 10,
                  "since": "7 days ago",
              },
          }
      }
    CELERY_CONFIG = CeleryConfig

Make sure that the superset.tasks.scheduler module is correctly specified in the imports section. This should resolve the issue of the unregistered task error. Additionally, ensure that the reports.scheduler task is defined in your codebase as shown below:

from superset.extensions import celery_app

@celery_app.task(name="reports.scheduler")
def scheduler() -> None:
    """
    Celery beat main scheduler for reports
    """
    # Task implementation

This should ensure that the task is recognized and properly registered by the Celery worker [1][2].
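
One way to check a configuration for this kind of mismatch is to compare the task names in beat_schedule against the names the worker has registered. The sketch below is illustrative only: `unregistered_beat_tasks` is a hypothetical helper, and the registered set is made up. In a real deployment the names could be read from the Celery app's task registry (`celery_app.tasks`).

```python
def unregistered_beat_tasks(beat_schedule, registered_names):
    """Return task names used by beat entries that the worker does not know, sorted."""
    registered = set(registered_names)
    return sorted(
        entry["task"]
        for entry in beat_schedule.values()
        if entry["task"] not in registered
    )


# Trimmed-down copy of the schedule from the report (schedules omitted).
beat_schedule = {
    "reports.scheduler": {"task": "reports.scheduler"},
    "reports.prune_log": {"task": "reports.prune_log"},
}

# Assumption: a worker that imported some task modules but not the scheduler one.
registered = {"sql_lab.get_sql_results", "reports.prune_log"}

print(unregistered_beat_tasks(beat_schedule, registered))  # ['reports.scheduler']
```

Any name this returns is a task that beat will keep enqueuing and the worker will keep discarding.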


@rusackas
Member

Please feel free to mention this on the official vote thread if you'd like to draw more attention to it, and see if others are experiencing the same. The release is currently in testing/voting, so it'd be appreciated :) Let me know if you're not on the developer mailing list and I can provide more instructions.

@padbk
Contributor Author

padbk commented Jul 29, 2024

Turns out the imports line I was using was out of date and it needs to be changed. Now mine reads:

imports = ("superset.sql_lab", "superset.tasks.scheduler", "superset.tasks.thumbnails", "superset.tasks.cache", )

Not sure what has changed to make "superset.tasks" no longer work in 4.1.0, but the docs seem up to date.
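
The thread does not say why "superset.tasks" stopped working. One possible mechanism, offered as an assumption rather than a confirmed cause, is ordinary Python import behavior: importing a package runs only its `__init__.py`, so tasks defined in a submodule are registered only if something imports that submodule. The sketch below demonstrates this with a throwaway package; `demo_tasks` and `REGISTRY` are hypothetical names and do not reflect Superset's actual layout.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package named demo_tasks (hypothetical, not Superset's layout).
root = Path(tempfile.mkdtemp())
pkg = root / "demo_tasks"
pkg.mkdir()
(pkg / "__init__.py").write_text("REGISTRY = {}\n")
(pkg / "scheduler.py").write_text(
    "from demo_tasks import REGISTRY\n"
    "REGISTRY['reports.scheduler'] = lambda: None\n"
)
sys.path.insert(0, str(root))

# Importing only the package runs __init__.py and nothing else.
demo_tasks = importlib.import_module("demo_tasks")
before = sorted(demo_tasks.REGISTRY)

# Importing the submodule explicitly is what registers the task.
importlib.import_module("demo_tasks.scheduler")
after = sorted(demo_tasks.REGISTRY)

print(before, after)  # [] ['reports.scheduler']
```

Listing each task module explicitly in imports, as in the updated line, avoids depending on what the package's `__init__.py` happens to import.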

@padbk padbk closed this as completed Jul 29, 2024
@sfirke
Member

sfirke commented Jul 29, 2024

I had the same issue and it went away after making the one-line change you suggested. Thanks! I have been using this same config without issues since 2.0.0, so I guess something changed in 4.1.0 that finally made it stop working.

@mistercrunch mistercrunch reopened this Aug 5, 2024
mistercrunch added a commit that referenced this issue Aug 5, 2024
closes #29708

About the change in `superset/utils/json.py`, I was somehow getting a pre-commit hook to trigger
@mistercrunch
Member

I noticed warnings around this in docker-compose last week, and connected this issue when looking at 4.1 blockers.

I think I should be fixing the root cause here -> #29862
