Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cache node_modules to reduce build time #858

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

Conversation

adriangohjw
Copy link
Contributor

@adriangohjw adriangohjw commented Nov 5, 2024

Problem

Reduce build time and cost by caching

partially closes https://linear.app/ogp/issue/ISOM-1661/optimise-site-build-times

Solution

Breaking Changes

  • Yes - this PR contains breaking changes
    • Details ...
  • No - this PR is backwards compatible

Improvements:

TL;DR - cache node_modules and save ~2+ mins per build

approach

  • storing cached files (based on release tag version) in S3 as there's no guarantee local cache will persist between builds given the low frequency of builds and the on-demand setup of our codebuild instances
  • since releases are not very often (~1-2 times weekly), this reduces build time for every build across sites
  • e.g. MOH-corp now build at ~14 mins when testing with PROD data

how:

  • cache node_modules (isomer)
    • before: between 80 to 120 seconds to install
    • after: ~6 seconds to fetch from S3
    • if cache miss, takes ~6 seconds (up to 40 seconds) to upload to S3 too after installing
  • cache all the node_modules (tooling/template)
    • before: ~50 (9 + 33 + 8) seconds to install
    • after: ~0 seconds to fetch from S3
    • if cache miss, takes ~1 seconds (up to 17seconds)to upload to S3 too after installing

note

  • did not cache .next/cache because it is too big and does not really speed up next build much. In fact, it's actually slower if we have to download the cache from S3 than actually building it on the spot
  • while ideally we use codebuild in-built cache mechanism in buildspec, but:
    • it's easier to maintain everything in the publisher script
    • cache are tagged to dynamic keys like ISOMER_BUILD_REPO_BRANCH which are fetch on build-time, which can't be done easily since we are using pulumi to define the stack

@adriangohjw adriangohjw added the enhancement New feature or request label Nov 5, 2024
@adriangohjw adriangohjw self-assigned this Nov 5, 2024
Copy link

vercel bot commented Nov 5, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
isomer-studio ✅ Ready (Inspect) Visit Preview 💬 Add feedback Nov 5, 2024 5:09pm

@datadog-opengovsg
Copy link

datadog-opengovsg bot commented Nov 5, 2024

Datadog Report

Branch report: build-cache-next
Commit report: eb513b4
Test service: isomer-studio

✅ 0 Failed, 167 Passed, 34 Skipped, 47.17s Total Time
➡️ Test Sessions change in coverage: 1 no change

@adriangohjw
Copy link
Contributor Author

adriangohjw commented Nov 5, 2024

note - to be merged after closing https://github.com/opengovsg/isomer-next-infra/pull/27 and pulumi up

@adriangohjw adriangohjw changed the title cache node_modules and .next/cache to reduce build time cache node_modules to reduce build time Nov 5, 2024
Copy link

linear bot commented Nov 7, 2024

Comment on lines +41 to +43
echo "Fetching cached node_modules..."
NODE_MODULES_CACHE_PATH="s3://$S3_CACHE_BUCKET_NAME/$ISOMER_BUILD_REPO_BRANCH/isomer/node_modules.tar.gz"
aws s3 cp $NODE_MODULES_CACHE_PATH node_modules.tar.gz || true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue: I think this will break our assumption that it takes at least 1 minute to reach the "fetching from database" step. Should we move that step further down after the "prebuilding" step and before the "building" step instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dcshzj thanks for catching this!

But moving it between prebuilding and building wouldn't help too since prebuilding is cached just like node_modules

With this PR, if cache is found (which should be all builds except the first one after we publish a newer package version), then the time to reach "fetching from database" right after the 1) queuing of build + 2) provisioning of instances. A safe min. time needed from observing past build is 20s (provisioning of instances)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants