Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce incremental Docker build times #27998

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

maxstanley
Copy link

Reduce incremental Docker build times and unnecessary duplication


The intent for this PR was to aid my local development on a seperate feature. I am currently running on a slow internet connection where downloading the go modules takes ~45 minutes. In it's current state, any code changes to Gitea results in the Dockerfile re-downloading all go modules.

This PR reduces incremental build times to enable the use of Docker as a development environment. This is achieved by copying in files such as go.mod and package-lock.json and downloading dependencies, before copying in gitea source files.

I will admit to a drive by refactor in this change, which combines Dockerfile and Dockerfile.rootless to further aid in reducing build times whilst using Docker.

Happy to make any changes!

@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Nov 11, 2023
@pull-request-size pull-request-size bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Nov 11, 2023
@maxstanley maxstanley changed the title Docker cache Reduce incremental Docker build times Nov 11, 2023
Dockerfile Outdated Show resolved Hide resolved

RUN addgroup -S -g 1000 git

FROM gitea-base AS gitea-rootless
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this build work with the two tags in the same file? Would target: gitea in docker/build-push-action skip over these steps for gitea-rootless? I assume not, right?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can see in the docker-dryrun workflow, that the first build-push-action build gitea-base and gitea. Then in the next build-push-action you can see the gitea-base layers have all been cached and only the gitea-rootless layers are run.

If not specifying the target when building, docker build will export the final image "as the one to run".

Dockerfile Outdated

RUN make deps-frontend

COPY . .
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could even further optimize by explicitly copying only what is needed here, which would enable us to remove .dockerignore as well. It's a bit risky though in case someone adds new root files/folders and forgets to add them here, but overall I'm in favor of it. Wonder what others think.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you would still want to keep the .dockerignore as this stops the files being moved into the docker build context which is created before the first COPY command, which helps to reduce docker build times.

I use explicit COPYing elsewhere to help reduce build times in other projects. Especially nice to separate frontend and backend compilation, for an even higher chance of caching.

As for the risk of forgetting to add new files/folders to the Dockerfile, I don't think this would be too bad as the docker build workflows should flag this issue pretty quickly.

Dockerfile Outdated

# Checkout version if set
COPY ./.git ./.git
RUN if [ -n "${GITEA_VERSION}" ]; then git checkout "${GITEA_VERSION}"; fi
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on removing this line, can see it has been untouched in this file for 5 years so guessing it might be a part of a workflow?

Disadavntage of including these lines is that any time a maintainer commits to git, they will be re-running the COPY ./.git ./.git step. Currently this happend transparently.

To replicate the functionality you can use this feature of docker build

docker build https://github.com/go-gitea/gitea.git#v1.20.5

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is due to the makefile using git calls to determine the version information. This could (and tbh should) be pulled out for the reasons you mentioned that whenever git repo changes it invalidates the cache (even if you are attempting to build the same tag). We do some magic in the makefile to calculate the version, and this logic is used for more than just docker (eg xgo crossarch builds) which means it's kinda load bearing and changes to it need a closer eye when reviewing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the failed build, the .git folder isn't present during CI runs - I think this is more to do with the dyrun workflow needing to do a full checkout once it has detected a file has changed?

In the Dockerfile GITEA_VERSION is an ARG, from what I can tell these are only accessible within the Dockerfile, and aren't in the build as environment variables, so should be safe to take out?

I think in the GitHub release workflows, it is using the .git directory to determine version information.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if we were to go down the route of not relying on .git for Docker builds, we may want to go down the route of something like....

ARG GITEA_VERSION
ENV GITEA_VERSION=${GITEA_VERSION:-develop}
RUN echo "${GITEA_VERSION}" > VERSION

This way, when developing you will get a nominal "develop" value for your Gitea version, but when compiling for release, it can be specified as a build argument?

Dockerfile Outdated
Comment on lines 79 to 85
ARG GITHUB_REF_NAME
ARG GITHUB_REF_TYPE
ARG DOCKER_GITEA_VERSION

ENV GITHUB_REF_NAME=${GITHUB_REF_NAME:-docker-develop}
ENV GITHUB_REF_TYPE=${GITHUB_REF_TYPE:-branch}
ENV DOCKER_GITEA_VERSION=${DOCKER_GITEA_VERSION-${GITHUB_REF_NAME}}
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the main change to highlight from the last couple commits.

We are forwarding the information provided in the GitHub Action envrionment to the docker build so that it can be utilised in the Makefile in place of the .git/ directory.

For tags alone, the Makefile could have remained untouched. However, due to the way the nightly version is constructed, the git describe result has to be passed in and handled seperately to the currenlty available mechanisms.

@silverwind silverwind added the topic/build PR changes how Gitea is built, i.e. regarding Docker or the Makefile label Nov 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. modifies/internal size/L Denotes a PR that changes 100-499 lines, ignoring generated files. topic/build PR changes how Gitea is built, i.e. regarding Docker or the Makefile
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants