[Bug]: Job data not passed to worker, all queue jobs removed and excessive amount of requests sent to redis server #2763

samundrak · 2024-09-07T10:07:29Z

Version

v3.10.3

Platform

NodeJS

What happened?

I have been facing this issue for a week now. A few changes I made included increasing the delay and adding a limiter. After these changes, I started encountering issues (I'm not sure whether the changes are the cause), such as the job data not being passed to the worker: it was basically empty. To fix this temporarily, I removed a few queues, but another problem arose: all the queue data was suddenly being removed, which is causing serious issues in production.

I couldn't pinpoint the problem, so I switched from AWS ElastiCache to a self-hosted Redis to ensure the settings were configured correctly according to the documentation. It worked well for a few days, but then the issue of the queue being automatically removed started again. I did some debugging, checked the logs using RedisInsight, and discovered that an excessive number of requests were being sent to the Redis server.

Framework: NestJS ^9.0.0

Screen.Recording.2024-09-07.at.19.03.10.mov

How to reproduce.

I am not able to reproduce it in the dev environment or in my local environment.

Relevant log output

No response

Code of Conduct

I agree to follow this project's Code of Conduct

manast · 2024-09-07T11:22:06Z

The number of requests is probably just normal.
It seems you are having several issues and conflating them, which makes them more difficult to solve. I suggest you treat each issue separately. For example, jobs with empty data would be one issue: try to isolate the problem. Most likely it is in your own code, so set debug logs and try to figure out whether you really are setting the data before adding the jobs.
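
One way to follow the debug-logging suggestion is to wrap the call that enqueues jobs, so the payload is logged and empty payloads are rejected before they reach Redis. This is only a minimal sketch: `QueueLike`, `isEmptyPayload`, and `addJobChecked` are hypothetical names introduced for illustration, and `QueueLike` mirrors only the general `(name, data, opts)` shape of BullMQ's `Queue.add` so the snippet runs without the library installed.

```typescript
// Minimal structural type so the sketch runs without BullMQ installed.
// Assumption: the real queue exposes an add(name, data, opts) method of this shape.
interface QueueLike<T> {
  add(name: string, data: T, opts?: Record<string, unknown>): Promise<unknown>;
}

// Returns true when the payload is null/undefined, or an object/array with no own keys.
function isEmptyPayload(data: unknown): boolean {
  if (data === null || data === undefined) return true;
  if (typeof data === 'object') return Object.keys(data as object).length === 0;
  return false;
}

// Hypothetical helper: logs the payload and refuses to enqueue empty data.
async function addJobChecked<T>(
  queue: QueueLike<T>,
  name: string,
  data: T,
  opts?: Record<string, unknown>,
): Promise<unknown> {
  console.debug(`[queue] adding job "${name}" with data:`, JSON.stringify(data));
  if (isEmptyPayload(data)) {
    throw new Error(`Refusing to add job "${name}": payload is empty`);
  }
  return queue.add(name, data, opts);
}
```

If the error never fires in production but workers still see empty data, the producer side can probably be ruled out and the investigation can move to the worker.
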
All queue data being removed also sounds like you have some code that is removing the queues, maybe leftover test/debug code, or maybe you are not configuring the maxmemory policy of your Redis instance appropriately, although this normally would not result in all data being removed.
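
The maxmemory point can be checked mechanically. BullMQ's documentation asks for `maxmemory-policy` to be `noeviction`, because any other policy lets Redis evict queue keys under memory pressure. The sketch below assumes the reply to `CONFIG GET maxmemory-policy` has already been fetched with whatever Redis client is in use and arrives as a flat `[name, value]` array; `evictionPolicyFromReply` and `mayEvictQueueKeys` are hypothetical helper names.

```typescript
// Parses the flat [name, value, ...] array that Redis returns for
// `CONFIG GET maxmemory-policy` (assumed RESP2 reply shape).
function evictionPolicyFromReply(reply: string[]): string | undefined {
  const index = reply.indexOf('maxmemory-policy');
  return index >= 0 ? reply[index + 1] : undefined;
}

// Any policy other than "noeviction" allows Redis to drop keys when memory is full.
// An unknown (undefined) policy is treated as unsafe on purpose.
function mayEvictQueueKeys(policy: string | undefined): boolean {
  return policy !== 'noeviction';
}

// Example with a hard-coded reply; in practice the array comes from the Redis client.
const policy = evictionPolicyFromReply(['maxmemory-policy', 'allkeys-lru']);
if (mayEvictQueueKeys(policy)) {
  console.warn(`maxmemory-policy is "${policy}"; queue keys may be evicted`);
}
```
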

samundrak · 2024-09-07T11:45:37Z

@manast
Thank you for the response.

So far, I don't have any explicit code to remove items from the queue, and the only queue removal setting is the default one, which is done after completion or failure by Bull. The job data being removed might have been resolved after I refactored my code from the NestJS process decorator to an explicit worker class, but the issue of queue data getting removed remains frequent. We can't add any items to the queue, as they are removed immediately. I thought the excessive number of requests could also be the cause, as it was sending many DEL commands to the Redis server when I used throttle/limiter.
Additionally, the amount of requests seems normal when I check my development environment. However, even when there's no load, the number of requests sent to Redis in production is still quite high.

// Request sent to Redis when throttled

// Worker Implementation

// Queue settings

    queue: {
      removeOnComplete: {
        age: 3600 * 12, // keep up to 12 hours
        count: 1000, // keep up to 1000 jobs
      },
      removeOnFail: {
        age: 48 * 3600, // keep up to 48 hours
      },
      delay: 5000,
      attempts: 3,
      backoff: {
        type: 'exponential',
        delay: 60000,
      },
    },
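
The `age` values in `removeOnComplete` and `removeOnFail` are expressed in seconds, so `3600 * 12` is 12 hours and `48 * 3600` is 48 hours. A small helper can keep the numbers and their comments from drifting apart. This is a sketch only; `hours` and `retention` are hypothetical names, and the values are copied from the settings above.

```typescript
const SECONDS_PER_HOUR = 3600;

// Converts a number of hours into seconds, the unit used by the `age` option.
function hours(n: number): number {
  return n * SECONDS_PER_HOUR;
}

// Same retention values as in the posted settings, written with the helper.
const retention = {
  removeOnComplete: {
    age: hours(12), // keep completed jobs for up to 12 hours
    count: 1000, // keep at most 1000 completed jobs
  },
  removeOnFail: {
    age: hours(48), // keep failed jobs for up to 48 hours
  },
};

console.log(retention.removeOnComplete.age, retention.removeOnFail.age);
```
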

manast · 2024-09-08T09:31:18Z

I am quite confident the issue with missing data is not a bug in BullMQ.

roggervalf · 2024-09-08T14:40:07Z

Hi @samundrak, you are using quite an old version. Could you please try the latest one and let us know?

samundrak · 2024-09-09T02:11:26Z

Thank you for the response

@manast I was thinking the same most of the time, but I couldn't find a place in the implementation that could be the reason for all queue data being removed. One thing I did recently was to remove the throttle in the worker settings, and since then it hasn't occurred, but I am still giving it some time to confirm.

@roggervalf
I am not sure about the old version being an issue but I think now I should work on updating it.

I will update you on the status once I update the versions.
Thank you for the help

samundrak added the "bug" label (Something isn't working) on Sep 7, 2024

manast removed the "bug" label (Something isn't working) on Sep 7, 2024
