
Addition of generic pre-build and post-build hooks #9892


Describe the feature request
The ability to call shell scripts just before and just after each package is built.

At IOG, our use case for these is build caching, particularly in CI. Since the pre- and post-build hooks are just shell scripts, there are probably other uses for them as well.

Additional context
At IOG we have a number of very large Haskell projects with deep dependency trees that can take a long time to build in CI.

The obvious answer to long build times is caching of build products. A previous attempt at such caching was made, but that solution was not satisfactory, because the cache was keyed on ${CPU}-${OS}-${GHC_VERSION}-${hash-of-dependencies}. The first three components are obvious. The problem is hash-of-dependencies: if a single high-level dependency changes, there is no cache hit and everything is built from scratch. As it turns out, this is the most common scenario.
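
For concreteness, such a key could be computed roughly as follows (a sketch only; I am assuming here that the dependency hash is taken over cabal's install plan in dist-newstyle/cache/plan.json, which may not match the original implementation exactly):

#!/usr/bin/env bash

CPU=$(uname -m)
OS=$(uname -s)
GHC_VERSION=$(ghc --numeric-version)
# Hash the whole install plan, so a change to any single dependency
# changes the key and invalidates the entire cache.
DEPS_HASH=$(sha256sum dist-newstyle/cache/plan.json | cut -d ' ' -f 1)
echo "${CPU}-${OS}-${GHC_VERSION}-${DEPS_HASH}"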

A better caching solution is one where the caching is done on individual dependencies rather than on all the dependencies as one huge blob. Caching individual dependencies means that when a high-level dependency changes, there is a very high likelihood that all the lower-level dependencies will still be found in the cache.

My initial implementation of this package-level caching was a simple wrapper around cabal that used rsync to fetch and save the cache over ssh to another machine. This proved highly effective: I was able to populate the cache from one machine and use it from another (both machines running Debian Linux).
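
That wrapper looked roughly like the following sketch (cache.example.com is a placeholder for the ssh-reachable cache machine; the real wrapper did more careful error handling):

#!/usr/bin/env bash
set -e

CACHE_HOST="cache.example.com"
STORE_DIR="$HOME/.cabal/store"

# Fetch whatever is already cached; tolerate a cold (empty) cache.
rsync -az "$CACHE_HOST:cabal-cache/store/" "$STORE_DIR/" || true

cabal build "$@"

# Push newly built packages back so other machines get cache hits.
rsync -az "$STORE_DIR/" "$CACHE_HOST:cabal-cache/store/"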

However, @angerman came up with an even better solution, which required adding the ability to run shell scripts before and after the build of each individual package. Using this feature (we have rough patches against cabal HEAD and version 3.10.3.0), we are able to use our own Amazon S3 storage for the cache. We do not propose to make this S3 storage public (for obvious security reasons), but any organization like ours, or any individual, could use their own S3 storage. I also have a working pair of pre- and post-build hooks that use ssh to a different machine as the storage backend.

The naive patch against HEAD (error handling could be improved, and the hook names could perhaps be made configurable) is:

diff --git a/cabal-install/src/Distribution/Client/ProjectBuilding/UnpackedPackage.hs b/cabal-install/src/Distribution/Client/ProjectBuilding/UnpackedPackage.hs
index 065334d5c..570c9b18c 100644
--- a/cabal-install/src/Distribution/Client/ProjectBuilding/UnpackedPackage.hs
+++ b/cabal-install/src/Distribution/Client/ProjectBuilding/UnpackedPackage.hs
@@ -678,7 +678,22 @@ buildAndInstallUnpackedPackage
           runConfigure
         PBBuildPhase{runBuild} -> do
           noticeProgress ProgressBuilding
-          runBuild
+          -- Run preBuildHook. If it exits with 0, we assume the build products
+          -- were restored from the cache and skip the build. If not, run the build.
+          code <- rawSystemExitCode verbosity (Just srcdir) "preBuildHook" [
+              (unUnitId $ installedUnitId rpkg)
+            , (getSymbolicPath srcdir)
+            , (getSymbolicPath builddir)
+            ] `catchIO` (\_ -> return (ExitFailure 10))
+          when (code /= ExitSuccess) $ do
+            runBuild
+            -- Not sure whether we should care about a failed postBuildHook.
+            void $ rawSystemExitCode verbosity (Just srcdir) "postBuildHook" [
+                (unUnitId $ installedUnitId rpkg)
+              , (getSymbolicPath srcdir)
+              , (getSymbolicPath builddir)
+              ] `catchIO` (\_ -> return (ExitFailure 10))
+
         PBHaddockPhase{runHaddock} -> do
           noticeProgress ProgressHaddock
           runHaddock
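
With this patch, a hook is simply an executable named preBuildHook or postBuildHook found on PATH, run from the package's source directory with three arguments. A minimal stub that just records what it is given (the argument order follows the rawSystemExitCode calls above):

#!/usr/bin/env bash

# Arguments passed by the patched cabal:
#   $1 - unit id of the package being built
#   $2 - package source directory
#   $3 - package build directory
echo "hook: unit=$1 srcdir=$2 builddir=$3" >&2
# From preBuildHook, a non-zero exit tells cabal to build the package
# itself; the exit code of postBuildHook is ignored.
exit 1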

An example of our preBuildHook script (kept in ~/.cabal/iog-hooks, with the S3 credentials pulled from s3-credentials.bash; the aws executable comes from the awscli package) is as follows:

#!/usr/bin/env bash

# shellcheck disable=SC1091,SC2046
. $(dirname "${BASH_SOURCE[0]}")/s3-credentials.bash

CACHE_KEY="$1.tar.gz"

# Check whether the artifact exists in S3 (with endpoint and credentials).
if aws s3 ls s3://"$CACHE_BUCKET"/"$CACHE_KEY" --endpoint-url "$AWS_ENDPOINT" > /dev/null 2>&1; then
  echo "S3 hit       $1"
  # Stream the artifact from S3 and unpack it into the current directory
  # (cabal runs the hook from the package's source directory).
  aws s3 cp s3://"$CACHE_BUCKET"/"$CACHE_KEY" - --endpoint-url "$AWS_ENDPOINT" | tar -xz
else
  echo "S3 miss      $1"
  exit 1
fi
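
The corresponding postBuildHook is essentially the inverse: pack up the build products and upload them under the same key. A sketch, assuming the same s3-credentials.bash convention and that the whole working directory (which the pre-build hook above also extracts into) is archived:

#!/usr/bin/env bash

# shellcheck disable=SC1091,SC2046
. $(dirname "${BASH_SOURCE[0]}")/s3-credentials.bash

CACHE_KEY="$1.tar.gz"

echo "S3 push      $1"
# Archive the build products and stream them straight to S3.
tar -cz . | aws s3 cp - s3://"$CACHE_BUCKET"/"$CACHE_KEY" --endpoint-url "$AWS_ENDPOINT"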

To use the cache, I run cabal with the hooks directory on PATH:

PATH=$PATH:$HOME/.cabal/iog-hooks cabal 
