Skip to content

Update swirl #2269

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 22, 2020
Merged

Update swirl #2269

merged 2 commits into from
Mar 22, 2020

Conversation

sgrif
Copy link
Contributor

@sgrif sgrif commented Mar 10, 2020

This removes some of the boilerplate we had to add last night for
manually managing connection pools with swirl. Swirl now can
automatically manage its own connection pool. It can be given either a
database URL, a pool builder, or a fully built pool (which is used in
tests). If it's given one of the first two, it will configure the
connection pool to have a size of 2x the thread count. I've also let it
go up to the default number of threads, which is the number of CPU
cores. Since this would mean the pool size grows pretty dramatically,
I've configured it to not try to keep any idle connections (meaning it
will try to establish the connection when asked for one, not before).

Additionally, we no longer need the connection pool to live on our
environment, we can take either a connection or a pool as an argument to
a background job, and swirl will automatically take it from the pool. In
the future I'm going to try and have swirl make some more intelligent
decisions on this, so we could do something like have 2 threads only run
jobs that don't need a connection (like publish!)

Heroku doesn't guarantee the number of logical cores on standard dynos,
but it appears to have consistently been 4 for a while. This means we'll
be doubling the number of background workers, and potentially using up
to 8 database connections on worker. Now that we've moved some web
traffic to the replica, we may want to lower the pool size for the
primary db on the web servers to compensate.

This removes some of the boilerplate we had to add last night for
manually managing connection pools with swirl. Swirl now can
automatically manage its own connection pool. It can be given either a
database URL, a pool builder, or a fully built pool (which is used in
tests). If it's given one of the first two, it will configure the
connection pool to have a size of 2x the thread count. I've also let it
go up to the default number of threads, which is the number of CPU
cores. Since this would mean the pool size grows pretty dramatically,
I've configured it to not try to keep any idle connections (meaning it
will try to establish the connection when asked for one, not before).

Additionally, we no longer need the connection pool to live on our
environment, we can take either a connection or a pool as an argument to
a background job, and swirl will automatically take it from the pool. In
the future I'm going to try and have swirl make some more intelligent
decisions on this, so we could do something like have 2 threads only run
jobs that don't need a connection (like publish!)

Heroku doesn't guarantee the number of logical cores on standard dynos,
but it appears to have consistently been 4 for a while. This means we'll
be doubling the number of background workers, and potentially using up
to 8 database connections on worker. Now that we've moved some web
traffic to the replica, we may want to lower the pool size for the
primary db on the web servers to compensate.
@rust-highfive
Copy link

r? @smarnach

(rust_highfive has picked a reviewer for you, use r? to override)

@jtgeibel
Copy link
Member

LGTM

@bors r+

@bors
Copy link
Contributor

bors commented Mar 22, 2020

📌 Commit 93c46d3 has been approved by jtgeibel

@bors
Copy link
Contributor

bors commented Mar 22, 2020

⌛ Testing commit 93c46d3 with merge 1f91212...

bors added a commit that referenced this pull request Mar 22, 2020
Update swirl

This removes some of the boilerplate we had to add last night for
manually managing connection pools with swirl. Swirl now can
automatically manage its own connection pool. It can be given either a
database URL, a pool builder, or a fully built pool (which is used in
tests). If it's given one of the first two, it will configure the
connection pool to have a size of 2x the thread count. I've also let it
go up to the default number of threads, which is the number of CPU
cores. Since this would mean the pool size grows pretty dramatically,
I've configured it to not try to keep any idle connections (meaning it
will try to establish the connection when asked for one, not before).

Additionally, we no longer need the connection pool to live on our
environment, we can take either a connection or a pool as an argument to
a background job, and swirl will automatically take it from the pool. In
the future I'm going to try and have swirl make some more intelligent
decisions on this, so we could do something like have 2 threads only run
jobs that don't need a connection (like publish!)

Heroku doesn't guarantee the number of logical cores on standard dynos,
but it appears to have consistently been 4 for a while. This means we'll
be doubling the number of background workers, and potentially using up
to 8 database connections on worker. Now that we've moved some web
traffic to the replica, we may want to lower the pool size for the
primary db on the web servers to compensate.
@jtgeibel
Copy link
Member

@bors r+

@bors
Copy link
Contributor

bors commented Mar 22, 2020

📌 Commit 893c5e3 has been approved by jtgeibel

@bors
Copy link
Contributor

bors commented Mar 22, 2020

⌛ Testing commit 893c5e3 with merge b63607e...

@bors
Copy link
Contributor

bors commented Mar 22, 2020

☀️ Test successful - checks-travis
Approved by: jtgeibel
Pushing b63607e to master...

@bors bors merged commit b63607e into master Mar 22, 2020
jtgeibel added a commit to jtgeibel/crates.io that referenced this pull request Mar 26, 2020
…-03-10, r=jtgeibel"

This reverts commit b63607e, reversing
changes made to dc18552.
@jtgeibel jtgeibel mentioned this pull request Mar 26, 2020
bors added a commit that referenced this pull request Mar 26, 2020
Revert #2269

This is to make master deployable.  The original PR caused issues with running background jobs.

This reverts commit b63607e, reversing changes made to dc18552.

r? @ghost
cc @sgrif #2269
@Turbo87 Turbo87 deleted the sg-update-swirl-2020-03-10 branch April 1, 2020 19:40
sgrif added a commit to sgrif/crates.io that referenced this pull request Apr 16, 2020
This is a second attempt at rust-lang#2269, which had to be reverted shortly
after being deployed, as it caused jobs to fail to run in production.
This commit only differs from that PR in the revision of swirl used.

The cause of the issue was the change in the signature of
`update_downloads`, which does not need anything from the job
environment, and so it only takes a connection. This changed the
environment type for that job from `background_jobs::Environment` to
`()`, so our runner no longer knew how to run that job.

The smallest fix in crates.io would have been to add an unused
environment argument to `update_downloads`. However, having some jobs
not require a shared environment felt common enough that I fixed this in
swirl instead. sgrif/swirl#23 changed the
behavior so jobs with no environment (a.k.a. the environment type is
`()`) are always run, regardless of the environment type of the runner.

Since `update_downloads` is essentially a cron job, and is not enqueued
from the web server directly, this was not caught by our integration
suite. This case was caught quickly after being deployed, and only
affected a non-critical part of the service, so I've opted not to figure
out how to add this to our integration suite.
@sgrif sgrif mentioned this pull request Apr 16, 2020
bors added a commit that referenced this pull request May 2, 2020
Update Swirl, take two

This is a second attempt at #2269, which had to be reverted shortly
after being deployed, as it caused jobs to fail to run in production.
This commit only differs from that PR in the revision of swirl used.

The cause of the issue was the change in the signature of
`update_downloads`, which does not need anything from the job
environment, and so it only takes a connection. This changed the
environment type for that job from `background_jobs::Environment` to
`()`, so our runner no longer knew how to run that job.

The smallest fix in crates.io would have been to add an unused
environment argument to `update_downloads`. However, having some jobs
not require a shared environment felt common enough that I fixed this in
swirl instead. sgrif/swirl#23 changed the
behavior so jobs with no environment (a.k.a. the environment type is
`()`) are always run, regardless of the environment type of the runner.

Since `update_downloads` is essentially a cron job, and is not enqueued
from the web server directly, this was not caught by our integration
suite. This case was caught quickly after being deployed, and only
affected a non-critical part of the service, so I've opted not to figure
out how to add this to our integration suite.

r? @jtgeibel
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants