Description
First off, thanks for this Gem. We've been running it under a light workload in production for a few months now, and it's been terrific.
Second, a question, possibly looking for advise or to help troubleshoot. Our configuration/stack looks like the following
What | Version |
---|---|
ruby | ruby 2.7.2p137 (2020-10-01 revision 5445e04352) [x86_64-linux-musl] |
rails | 6.0.3.4 |
good_job | 1.3.4 |
postgresql | PostgreSQL 12.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bit |
worker | bundle exec good_job start #1 process, 5 threads |
preserve_job_records | GoodJob.preserve_job_records = true |
connection pool | pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %> #RAILS_MAX_THREADS is set to 5 as well |
Yesterday we began processing a larger volume of jobs (~15K/hour), and a few hours after we started doing that, began to notice the following exception
ActiveRecord::StatementInvalid: PG::OutOfMemory: ERROR: out of shared memory
HINT: You might need to increase max_locks_per_transaction.
We bumped the max_locks_per_transactions
to 1024 (from the default of 64) to buy us some time to troubleshoot.
Question : It appears GoodJob seems to keep advisory locks around for finished jobs. Is that expected? I would think that besides just the storage overhead, preserve_job_records
wouldn't need to hold on to advisory locks.
Appendix
In Rails :
irb(main):001:0> GoodJob::Job.finished.first.advisory_locked?
=> true
irb(main):002:0> GoodJob::Job.finished.advisory_locked.count
=> 247742 #This query took about 3-4 seconds to perform:
Number of locks
database=> select locktype, count(locktype) from pg_locks group by locktype;
locktype | count
---------------+--------
virtualxid | 5
object | 1
advisory | 255268
transactionid | 1
relation | 23
(5 rows)
Number of connections
database => select pid as process_id,
usename as username,
datname as database_name,
client_addr as client_address,
application_name,
backend_start,
state,
state_change
from pg_stat_activity;
process_id | username | database_name | client_address | application_name | backend_start | state | state_change
------------+-------------------+-------------------+----------------+--------------------------------+-------------------------------+---------------------+-------------------------------
26497 | | | | | 2020-12-15 02:21:54.315069+00 | |
26500 | redacted | | | | 2020-12-15 02:21:54.315814+00 | |
26501 | redacted_redacted | redacted_redacted | 192.168.29.124 | /usr/local/bundle/bin/good_job | 2020-12-15 02:21:54.320826+00 | idle | 2020-12-15 02:21:54.394067+00
26502 | redacted_redacted | redacted_redacted | 192.168.29.124 | GoodJob::Notifier | 2020-12-15 02:21:54.401358+00 | idle | 2020-12-15 02:21:54.423085+00
26789 | redacted_redacted | redacted_redacted | 192.168.29.124 | /usr/local/bundle/bin/good_job | 2020-12-15 02:21:55.35472+00 | active | 2020-12-15 18:03:24.603505+00
26803 | redacted_redacted | redacted_redacted | 192.168.29.124 | /usr/local/bundle/bin/good_job | 2020-12-15 02:21:55.547899+00 | active | 2020-12-15 18:03:24.476213+00
26804 | redacted_redacted | redacted_redacted | 192.168.29.124 | /usr/local/bundle/bin/good_job | 2020-12-15 02:21:55.85596+00 | idle in transaction | 2020-12-15 18:03:24.663866+00
26912 | redacted | redacted | 127.0.0.1 | PostgreSQL JDBC Driver | 2020-12-15 02:22:05.433505+00 | idle | 2020-12-15 18:03:19.285195+00
27044 | redacted | redacted | 127.0.0.1 | | 2020-12-15 02:22:07.521112+00 | idle | 2020-12-15 18:03:24.342632+00
5033 | redacted_redacted | redacted_redacted | 192.168.48.39 | psql | 2020-12-15 18:02:47.951493+00 | active | 2020-12-15 18:03:24.623074+00
30076 | redacted_redacted | redacted_redacted | 192.168.29.124 | /usr/local/bundle/bin/good_job | 2020-12-15 02:25:33.759964+00 | active | 2020-12-15 18:03:24.587388+00
26495 | | | | | 2020-12-15 02:21:54.316586+00 | |
26494 | | | | | 2020-12-15 02:21:54.314331+00 | |
26496 | | | | | 2020-12-15 02:21:54.314747+00 | |
(14 rows)
Activity