Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up builds by fixing github action cache and cargo build cache #332

Merged
merged 1 commit into from
Feb 4, 2022

Conversation

epgts
Copy link
Contributor

@epgts epgts commented Jan 27, 2022

  • Changes

** Add comments throughout ci.yml explaining how it works.

** Correct path in CARGO_HOME cache.

We were attempting to cache two non-existent directories
/home/postgres/.cargo (~/.cargo). Our actual CARGO_HOME is
/usr/local/cargo, so instead we'll cache those.

** Comment out restore-keys and leave TODO for the plan.

It's not clear why we had so many. The github action cache is write
ONCE, which means over time it becomes colder and colder. We need to
increment the cache keys periodically to warm it back up. When we do
that, we can avoid dropping all the way down to cold cache performance
by moving the previous cacche keys to restore-keys, but it shouldn't
make sense for restore-keys to list more than one previous key.

** Move all commands to new tools/build script.

This both reduces duplication across all the blocks in ci.yml as well as
allowing us to build and test the same way locally.

** Replace sudo with complex and unnecessary flags with su.

Before:
sudo -HEsu postgres sh -c "..."

  • -E - preserve environment - su does this already
  • -H - force HOME to target user's home directory - su does this already
  • -s - run command through shell rather than directly. And then we ran
    ANOTHER shell under that. Both were unnecessary.
  • -u postgres - have to specify the user

After:
su postgres -c ...'

** Specify PGVERSION in each job's environment to reduce duplication.

** Run crates tests only once and separate from pgx tests.

These tests do not depend on postgresql so running them more than once
buys us nothing.

Worse, not running the pgx tests under the extension subdirectory causes
pgx to thrash cargo's build cache and build the world twice. We don't
fully understand this, but document what we do know in tools/build.

** TODO Add questions about many of the environment variables.

** TODO Consider running post-install / doc tests separately.

  • Timings

Cold cache: 18m20s
https://github.com/timescale/timescaledb-toolkit/runs/5007577229?check_suite_focus=true
Warm cache: 9m14s
https://github.com/timescale/timescaledb-toolkit/runs/5007862209?check_suite_focus=true

@epgts epgts force-pushed the eg/ci-improvements branch 30 times, most recently from 4deef2a to 3c43c8f Compare January 28, 2022 21:38
@epgts epgts force-pushed the eg/ci-improvements branch 13 times, most recently from a76336d to db08695 Compare January 31, 2022 16:58
@epgts epgts changed the title DO NOT MERGE Speed up builds by fixing github action cache and cargo build cache Jan 31, 2022
@epgts epgts requested review from JLockerman and WireBaron January 31, 2022 17:07
@epgts epgts force-pushed the eg/ci-improvements branch from db08695 to e683586 Compare January 31, 2022 17:36
Comment on lines +116 to +125
# TODO post-install + doc test adds a good 90s. We could probably put
# them together run serially in a separate job to shave that 90s off the
# long pole.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

post-install is testing update scripts, which should be safe to just do once. doc tests are mostly testing continuous aggregates, which I think we want to do on every version 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked into the post-install scripts yet. Are you saying they're the same as "update tests"? If so, I misunderstood the conversation yesterday and may have written some incorrect details in #335. In any case, can you add some more details about the post-install, update, and upgrade tests to #335?

Comment on lines +55 to +56
# CARGO_TARGET_DIR_NAME allows me to call my target directory '.target'.
# TODO Can we simplify and just force .target here? It works fine, but I didn't want to surprise people...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mean ./target? If so, I think that should be safe to use. If we really wanted to be principled, I guess we could try to extract target_directory it from cargo metadata 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(users can override the default target directory using a .cargo/config.toml)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean .target. I always prefix my build directories with . so I don't trip over them all the time.

I experimented with the setting in ~/.cargo/config.toml and it doesn't seem to be very useful: it overrides for ALL projects and JUST ONE DIRECTORY and resolves it relative to the home directory! Setting build.target-dir = "target" causes /home/epg/target to be used for all Rust builds which works about as well as you'd expect :)

@epgts epgts force-pushed the eg/ci-improvements branch from 04e2ea1 to 8253bd3 Compare February 4, 2022 18:27
* Changes

** Add comments throughout ci.yml explaining how it works.

** Correct path in CARGO_HOME cache.

We were attempting to cache two non-existent directories
`/home/postgres/.cargo` (`~/.cargo`).  Our actual CARGO_HOME is
`/usr/local/cargo`, so instead we'll cache those.

** Comment out restore-keys and leave TODO for the plan.

It's not clear why we had so many.  The github action cache is write
ONCE, which means over time it becomes colder and colder.  We need to
increment the cache keys periodically to warm it back up.  When we do
that, we can avoid dropping all the way down to cold cache performance
by moving the previous cacche keys to restore-keys, but it shouldn't
make sense for restore-keys to list more than one previous key.

** Move all commands to new tools/build script.

This both reduces duplication across all the blocks in ci.yml as well as
allowing us to build and test the same way locally.

** Replace sudo with complex and unnecessary flags with su.

Before:
sudo -HEsu postgres sh -c "..."
- -E - preserve environment - su does this already
- -H - force HOME to target user's home directory - su does this already
- -s - run command through shell rather than directly.  And then we ran
  ANOTHER shell under that.  Both were unnecessary.
- -u postgres - have to specify the user

After:
su postgres -c ...'

** Specify PGVERSION in each job's environment to reduce duplication.

** Run crates tests only once and separate from pgx tests.

These tests do not depend on postgresql so running them more than once
buys us nothing.

Worse, not running the pgx tests under the extension subdirectory causes
pgx to thrash cargo's build cache and build the world twice.  We don't
fully understand this, but document what we do know in tools/build.

** TODO Add questions about many of the environment variables.

** TODO Consider running post-install / doc tests separately.

* Timings

Cold cache: 18m20s
https://github.com/timescale/timescaledb-toolkit/runs/5007577229?check_suite_focus=true
Warm cache:  9m14s
https://github.com/timescale/timescaledb-toolkit/runs/5007862209?check_suite_focus=true
@epgts epgts force-pushed the eg/ci-improvements branch from 8253bd3 to df93251 Compare February 4, 2022 21:09
@epgts epgts marked this pull request as ready for review February 4, 2022 21:24
@epgts
Copy link
Contributor Author

epgts commented Feb 4, 2022

bors r+

@bors
Copy link
Contributor

bors bot commented Feb 4, 2022

Build succeeded:

@bors bors bot merged commit 36ea1a4 into main Feb 4, 2022
@bors bors bot deleted the eg/ci-improvements branch February 4, 2022 21:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants