-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow custom job ids to be specified in the job options #335
Allow custom job ids to be specified in the job options #335
Conversation
@@ -56,6 +56,13 @@ describe('Job', function(){ | |||
expect(storedJob.opts.testOpt).to.be('enabled'); | |||
}); | |||
}); | |||
|
|||
it('should use the custom jobId if one is provided', function() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test case suggestion:
Create a job with a customId and test if it can be processed, in the process
callback, verify that the custom jobId matches the one provided
Add a bit to the readme too |
I have mixed feelings about this functionality. From one side it seems like something that should be nice to give to certain pro-users, but on the other hand, it may also create a new flora of issues if the user is not able to generate unique ids for every job. I haven't understood yet which consequences it would have to add several jobs using the same jobId... Isn't there a different way to achieve what you want without custom ids? |
@manast The general use case is having a concurrency-safe way to add a job if it doesn't already exist, based on some unique value (in our case the user id). In our case, we're dealing with hundreds of thousands of jobs per hour, and the load on Redis has been pretty intense. We needed a way to avoid adding a job to the queue if one already exists for the user. I've deployed our branch running this code to production and the results have been great - we've gone from ~90% CPU utilisation to < 10%. In terms of other possible ways to achieve the same thing: we could do a transaction script which loops through jobs looking for a match, and inserts it it doesn't find one, but that's slow. We could also keep a secondary key as a kind of lock to indicate whether a job with that unique value already exists, but that introduces other issues: what happens if some failure occurs and the lock is left orphaned? We'd need some kind of TTL, and it all gets rather complicated. Being able to override the key seemed the best solution to me. It lets us determine whether a job with that unique value already exists in constant-time, and doesn't have any of the issues around locking. |
I am not sure I understand your use case completely, but what about this: when you call the addJob method you get the unique ID for that given job. If you stored that ID in a redis database using the userId as key, wouldn't you then be able to test in O(1) if the users' job exists already? You will of course need some code to update the database when jobs are completed/failed, removed and so on. |
@manast Unfortunately, that wouldn't be concurrency safe. We'd need to:
Another thread could create or remove the job in between 1 and 2. We'd have the same problem when removing the lock at the end of the job too, plus the possibility that the job processes but some failure interferes with removing the lock, hence preventing all future jobs from processing for that user, requiring us to build a more complex system of TTLs etc. |
ok. But how do you avoid points 1 and 2 using custom ids in bull? |
@manast We wrap 1 and 2 in a transaction. Because we have a known key, we can perform step 1 using an (Note: I've just realised we'll need to execute these within a |
why do you need multi? If the jobId exists already it just returns false or -1 for example, and since it is run in a script it will be atomic. |
@manast Oh, are scripts atomic by default? I thought they needed to be wrapped in a |
}) | ||
|
||
var script = [ | ||
'local jobId = redis.call("INCR", KEYS[5])', | ||
'local jobId = ARGV[2] == "" and redis.call("INCR", KEYS[5]) or ARGV[2]', | ||
'redis.call("HMSET", ARGV[1] .. jobId' + argvs.join('') + ')', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use if/else here for readability ?
@manast Cool, good suggestions. I have implemented them. |
@pricj004 @manast I will be using this feature to make sure duplicate jobs don't get added to my queue. When submitting a new job with the same ID as a previously FAILED job, the new job is not added to the queue. Is this desired behavior for most? I would really like it to retry/rerun the failed job if I submit it again. Can someone explain why the current behavior was chosen? |
@TomKaltz couldn't you just use the retry functionality? or listen to the fail event and delete the job in that case? |
@manast your suggestion would work I guess. I just think it would be a good feature for duplicate submitted jobId to automatically retry if it had previously failed. Currently job is ignored if ID exists and has previously failed. I see more use-case for my feature request than current functionality of ignoring the job in that state. Is there any reason why bull can't adopt this functionality? |
@TomKaltz mostly that it requires some extra work and I guess nobody has time to do it right now... |
No description provided.