.github/workflows: purge Cloudflare cache on deployment #1893

Merged
merged 1 commit into from
Oct 2, 2024

@ttaylorr ttaylorr commented Oct 1, 2024

We use Cloudflare in front of the git-scm.com website in order to cache traffic that would otherwise be going directly to GitHub Pages. However, this cache needs to be purged whenever the actual site is updated, otherwise clients will have to wait out Cloudflare's and their own browser's cache TTL until they see the new content.

Usually this is just fine: for high priority deployments where the new content should be reflected immediately, someone can manually purge the cache by logging into Cloudflare and clicking on the "Purge everything" button.

But sometimes folks aren't around to log into Cloudflare, don't have permissions to, etc. Let's remove the need for anyone to log into Cloudflare and instead purge the cache on deployment, so that the site reflects new content sooner after each deployment.

(Note that this only purges the cache for git-scm.com, not git-scm.org. The former receives the vast majority of traffic, and when purging manually I almost always skipped git-scm.org without complaints. Let's avoid having two API keys and instead let git-scm.org lag slightly behind git-scm.com after a deployment).

To generate the API key, I logged into Cloudflare and created an API token[1] that has permissions limited to "Zone.Cache Purge" for all zones in the Git project's Cloudflare account. The token is stored as a repository secret[2].
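
A freshly created token can be sanity-checked before it is stored as a secret. The sketch below is only an illustration, not something this PR adds: it assumes the token is exported as CLOUDFLARE_TOKEN (the name used by the workflow here), and the function name and the DRY_RUN switch are hypothetical. It calls Cloudflare's token verification endpoint.

```shell
#!/usr/bin/env bash
# Base URL of the Cloudflare v4 REST API.
CF_API="https://api.cloudflare.com/client/v4"

# Hypothetical helper: ask Cloudflare whether the token in
# CLOUDFLARE_TOKEN is valid. With DRY_RUN=1 it only prints the request
# it would make (without the token) and does not touch the network.
verify_cloudflare_token() {
  local token="${CLOUDFLARE_TOKEN:-}"
  if [ -z "$token" ]; then
    echo "CLOUDFLARE_TOKEN is not set" >&2
    return 1
  fi
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "GET $CF_API/user/tokens/verify"
    return 0
  fi
  # --fail makes curl exit non-zero on an HTTP error status.
  curl --silent --show-error --fail \
    --header "Authorization: Bearer $token" \
    "$CF_API/user/tokens/verify"
}
```

Running verify_cloudflare_token with the real token exported prints Cloudflare's JSON response. Verification only confirms the token is usable; it does not prove that the permissions are limited to "Zone.Cache Purge".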

/cc @pedrorijo91 @dscho

[1]: https://developers.cloudflare.com/fundamentals/api/get-started/create-token/
[2]: https://docs.github.com/en/actions/security-for-github-actions/security-guides/using-secrets-in-github-actions

Signed-off-by: Taylor Blau <me@ttaylorr.com>
@ttaylorr ttaylorr merged commit a99bcaf into gh-pages Oct 2, 2024
1 of 2 checks passed
@ttaylorr ttaylorr deleted the ttaylorr/purge-cloudflare-cache branch October 2, 2024 13:14
    uses: jakejarvis/cloudflare-purge-action@v0.3.0
    env:
      CLOUDFLARE_ZONE: ${{ secrets.CLOUDFLARE_ZONE }}
      CLOUDFLARE_TOKEN: ${{ secrets.CLOUDFLARE_TOKEN }}
Member

Sorry, I couldn't review this earlier. Can we also add an if: secrets.CLOUDFLARE_TOKEN != '' so that deployments in forks don't fail?

Member Author

Great idea, I pushed this out here: #1895.

ttaylorr added a commit that referenced this pull request Oct 2, 2024
The recent change in 0d22411 (.github/workflows: purge Cloudflare
cache on deployment, 2024-10-01) will cause deployment workflows run in
forks of git/git-scm.com to fail, since they will likely not have a
Cloudflare token configured in their repository secrets.

Let's make use of a suggestion from Johannes[1] and only run this step
when the repository is configured with a Cloudflare token.

Note that we can't directly write:

    if: ${{ secrets.CLOUDFLARE_TOKEN != '' }}

, because repository secrets cannot be used in job step-level
conditionals[2]. Instead, the GitHub Actions documentation recommends
setting the secret as an environment variable, and then using the
presence (or absence) of that environment variable's contents as a proxy
to determine whether or not the secret is set.

All together, this should un-break deployment workflows in forks of
git/git-scm.com for repositories that have not set up a Cloudflare
token.

[1]: #1893 (review)
[2]: https://docs.github.com/en/actions/writing-workflows/workflow-syntax-for-github-actions#example-using-secrets

Suggested-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
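
The commit above gates the step with a conditional on an environment variable. A related but different technique is to perform the same presence check inside the step's shell script. The sketch below shows that variant only. It assumes the secret has been mapped to an environment variable named CLOUDFLARE_TOKEN, and purge_cache is a hypothetical placeholder for whatever actually performs the purge.

```shell
#!/usr/bin/env bash
# Hypothetical placeholder for the real purge logic.
purge_cache() {
  echo "purging cache"
}

# Skip the purge (successfully) when no token is configured, e.g. in a
# fork. A secret that is not set reaches the step as an empty string,
# so testing for emptiness is enough.
maybe_purge_cache() {
  if [ -z "${CLOUDFLARE_TOKEN:-}" ]; then
    echo "CLOUDFLARE_TOKEN is empty; skipping cache purge"
    return 0
  fi
  purge_cache
}
```

Unlike a step-level conditional, this variant still starts the step, so it does not avoid any set-up cost the step carries. It only keeps the step from failing.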

dscho commented Oct 4, 2024

Hmm. I just noticed that this adds quite a hefty time penalty to the deploy workflow. In this run, it had to build a Docker container, which took 31 seconds, and these 31 seconds blocked the start of even the first step (checking out the repository).

Compare that to the time it took to actually check out, build and deploy the site. I would have loved to link to the corresponding locations in the log, but GitHub Actions' web UI is not responding here after loading but the first page of the log, so I have to give you timestamps:

  • 2024-10-03T06:28:18.5603980Z Syncing repository: git/git-scm.com (10 seconds)
  • 2024-10-03T06:28:28.0731391Z ##[group]Run ./.github/actions/deploy-to-github-pages (total: 5 minutes and 8 seconds)
    • 2024-10-03T06:29:09.9429502Z ##[group]Run hugo config && hugo --baseURL "http://git-scm.com/" (33 seconds)
    • 2024-10-03T06:29:43.2388513Z ##[group]Run npx -y pagefind@1.1.1 --site public (2 minutes and 48 seconds)
    • 2024-10-03T06:32:31.0520083Z [work-around for transitioning from Rails to Hugo, could be dropped now] (<1 second)
    • 2024-10-03T06:32:31.3970176Z ##[group]Run echo ::group::Archive artifact (3 seconds, but tons of log output)
    • 2024-10-03T06:32:34.9699271Z ##[group]Run actions/upload-artifact@v4 (20 seconds)
    • 2024-10-03T06:32:55.0582356Z Fetching artifact metadata for "github-pages" in this workflow run (this is basically waiting for the remote site to become active, 41 seconds)
    • 2024-10-03T06:33:36.5116198Z Reported success!
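
The durations in the list can be reproduced from the timestamps themselves. The following is a small sketch with hypothetical helper names. It assumes both timestamps fall on the same UTC day and drops fractional seconds, so a result can be off by one second.

```shell
#!/usr/bin/env bash
# Convert the time-of-day part of a log timestamp such as
# 2024-10-03T06:28:28.0731391Z into whole seconds since midnight.
seconds_of_day() {
  local t="${1#*T}"   # strip the date:         06:28:28.0731391Z
  t="${t%%.*}"        # strip the fraction:     06:28:28
  local h="${t%%:*}"
  local rest="${t#*:}"
  local m="${rest%%:*}"
  local s="${rest#*:}"
  # Drop one leading zero so that "08" is not read as an octal number.
  echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

# Print the time between two timestamps as <minutes>m<seconds>s.
elapsed() {
  local total=$(( $(seconds_of_day "$2") - $(seconds_of_day "$1") ))
  echo "$(( total / 60 ))m$(( total % 60 ))s"
}

# Start of the composite action until "Reported success!":
elapsed 2024-10-03T06:28:28.0731391Z 2024-10-03T06:33:36.5116198Z   # → 5m8s
```

The same helper gives 2m48s for the pagefind step (06:29:43 to 06:32:31), which matches the figure in the list.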

In total, the deploy to GitHub Pages Composite Action took 6 minutes and 29 seconds. Note that this is atypically long; it typically takes around 4½ minutes (e.g. here). So adding half a minute does make a bit of a difference.

You will note that I tried to be conscious about the need for speed when e.g. a new Git version comes out, and therefore the deployment is performed before the checks are done for broken links, search results etc. The intention is to keep the time window small between the start of the workflow run (which might have been triggered manually) and the time when things are available on the site. That makes the difference between the two above-mentioned deployment runs even more stark: in the fast case, the site was deployed within 4 minutes and 6 seconds including the checkout, quite a bit faster than the 5 minutes and 8 seconds that exclude the checkout and the Docker container building.

Now, this here PR only targets the deploy workflow run, which is typically only run when merging a PR, not when the (typically scheduled) workflow runs check for a new Git version, for updates of the ProGit book or its translations, or the newest pre-built Git version for Windows/macOS.

That half minute may be acceptable in the scenarios where we merge PRs, but I don't think I'd like to pay that penalty for deployments in general.

At the same time I have to wonder about two things:

  1. do we not want the same cache-flushing actually in the deploy to GitHub Pages Composite Action, so that the caches are flushed also when the site is deployed in response to, say, a new Git version?
  2. could we avoid building a Docker container at all and then spinning it up later on, when we could run a curl command instead?

ttaylorr added a commit that referenced this pull request Oct 4, 2024
As pointed out in [1], using the cloudflare-purge-action incurs a ~31
second penalty at the start of the "deploy" action, where time is spent
building a Docker container to run the action.

This is unnecessary, since Cloudflare has a straightforward REST API
that we can use cURL to communicate with directly, without the extra
start-up cost.

Let's do that instead, and move this to run in the
deploy-to-github-pages action, which is run from multiple entry points,
all of which will want to purge the Cloudflare caches upon deployment.

[1]: #1893 (comment)

Signed-off-by: Taylor Blau <me@ttaylorr.com>
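
For illustration, a direct call to Cloudflare's cache purge endpoint might look like the sketch below. This is not the code from the commit above. It assumes the zone ID and token are available as CLOUDFLARE_ZONE and CLOUDFLARE_TOKEN (the names used earlier in this PR); the function name and the DRY_RUN switch are hypothetical.

```shell
#!/usr/bin/env bash
# Base URL of the Cloudflare v4 REST API.
CF_API="https://api.cloudflare.com/client/v4"

# Hypothetical helper: purge everything cached for one zone. It skips
# quietly when the credentials are missing (e.g. in a fork). With
# DRY_RUN=1 it prints the request instead of sending it.
purge_cloudflare_cache() {
  local zone="${CLOUDFLARE_ZONE:-}"
  local token="${CLOUDFLARE_TOKEN:-}"
  local body='{"purge_everything":true}'
  if [ -z "$zone" ] || [ -z "$token" ]; then
    echo "CLOUDFLARE_ZONE or CLOUDFLARE_TOKEN is empty; skipping purge" >&2
    return 0
  fi
  local url="$CF_API/zones/$zone/purge_cache"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf 'POST %s %s\n' "$url" "$body"
    return 0
  fi
  # --fail makes curl exit non-zero on an HTTP error status, so a
  # rejected purge fails the step instead of passing silently.
  curl --silent --show-error --fail \
    --request POST \
    --header "Authorization: Bearer $token" \
    --header "Content-Type: application/json" \
    --data "$body" \
    "$url"
}
```

Since curl is typically already present on GitHub-hosted runners, a call like this needs no container build, which is the start-up cost the commit message sets out to avoid.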

ttaylorr commented Oct 4, 2024

> In this run, it had to build a Docker container, which took 31 seconds, and these 31 seconds blocked the start of even the first step (checking out the repository).

Yuck, this is definitely sub-optimal. I think that even if we're 31 seconds slower than otherwise, users are probably still seeing the content faster than they would without this patch, since Cloudflare's own caches likely have a TTL which is longer than 31 seconds.

But I dislike the idea that we're spending that much CPU time to build a Docker container which could be replaced with a simple cURL invocation. So I opened up #1896, which should implement your helpful suggestion. Thank you! 🙇‍♂️
