
Add cache for Docker-Based Projects in CI #2588

Open
Andrew-Chen-Wang opened this issue May 5, 2020 · 6 comments

Comments

@Andrew-Chen-Wang (Contributor)

Description

CI takes a really long time to build the Docker containers — approximately 4 minutes just on building, given how many requirements there are. Although CI providers like Travis say in their docs that they don't support caching, since each build is provisioned on a fresh machine, we can take advantage of GitHub Packages by pushing Docker images there using GitHub Actions.

An example is detailed here: https://testdriven.io/blog/faster-ci-builds-with-docker-cache/

The other option would be to rewrite a bit of config/settings/test.py and rewrite the CI configuration files to run without Docker, instead just "installing" dependencies and running tests like a normal project.

Rationale

CI takes a really long time to build for Docker projects.

Use case(s) / visualization(s)

Faster docker builds during CI or, in general, faster CI runs. Plus, less whining.

@Andrew-Chen-Wang (Contributor, Author) commented May 11, 2020

Hm, so it seems doable. Run the pip install on local.txt. That should cut down a large chunk of time: I believe about a third of the time is spent installing the Python packages and another third on installing Postgres and other services, so using the built-in Travis cache and services (e.g. the Redis and PostgreSQL addons, which I learned are most suitable on Xenial) will help.
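As a sketch of that idea (not the final PR config — the Python version, Postgres version, and database name below are assumptions for illustration), a Travis setup using the pip cache plus the Redis and PostgreSQL addons might look like:

```yaml
# Hypothetical .travis.yml sketch: cache pip downloads and use
# Travis-provided services instead of building Docker containers.
dist: xenial
language: python
python: "3.8"
cache: pip            # re-use downloaded wheels between builds
services:
  - redis-server      # Redis service addon
addons:
  postgresql: "10"    # PostgreSQL addon available on Xenial
install:
  - pip install -r requirements/local.txt
before_script:
  - psql -c 'CREATE DATABASE test_db;' -U postgres
script:
  - pytest
```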

Celery can be executed like so:

pre_script:
    - celery multi start worker1 -A config.celery_app --pool=solo
    - celery multi start worker2 -A config.celery_app --pool=solo
post_script:
    - celery multi stop worker1 worker2

I can start a PR soon.

Edit: coming soon... my CI time decreased by 1 minute, though more testing is needed: my pytest is 5.4.0, which uninstalls the default Travis pytest on Xenial (which I have to use), which then increases the time spent collecting packages — and thus the time "downloading" packages even when they're already in the cache...

@Andrew-Chen-Wang (Contributor, Author)

@browniebroke Would this also allow for a decrease in the number of Docker images generated? I'm not quite acquainted with this celery multi start functionality (albeit it's already implemented in my Travis config), and I even think the "worker1 worker2" part is not needed — only the worker and beat parts are.

Currently, there are 3 Docker images that take up 1 GB each by default if Celery is used, since the compose file utilizes anchors. Perhaps using these multi start commands could eliminate the need for those anchors? I guess the problem, which I haven't tested yet, is the logging...
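For reference, a hedged sketch of the worker-plus-beat idea: a single Celery worker with the beat scheduler embedded via -B could stand in for separate worker and beat processes in CI (assuming the app module is config.celery_app, as in the generated project; embedded beat is not recommended for production):

```
# Run one Celery worker with beat embedded (-B).
# --pool=solo keeps it single-process, which is enough for CI.
celery -A config.celery_app worker -B --pool=solo --loglevel=info
```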

@arnav13081994 (Contributor)

@Andrew-Chen-Wang I have experimented with using a cache for GitHub Actions, and it seems the best (and easiest) solution to decrease the time, regardless of the CI tool used, would be to use a Docker image registry like Docker Hub or the GitHub Container Registry. This, however, would require adding the following to the build directive of the Compose services in both local.yml and production.yml.

Example:

Original:

services:
  django:{% if cookiecutter.use_celery == 'y' %} &django{% endif %}
    build:
      context: .
      dockerfile: ./compose/production/django/Dockerfile
    image: {{ cookiecutter.project_slug }}_production_django
    depends_on:
      - postgres
      - redis
    env_file:
      - ./.envs/.production/.django
      - ./.envs/.production/.postgres
    command: /start

Proposed:

services:
  django:{% if cookiecutter.use_celery == 'y' %} &django{% endif %}
    build:
      context: .
      dockerfile: ./compose/production/django/Dockerfile
      cache_from:
        - [COOKIECUTTER_DOCKERHUB_REPO OR COOKIECUTTER_GITHUBREGISTRY_REPO]:[IMAGE_TAG_OR_NAME]
    image: {{ cookiecutter.project_slug }}_production_django
    depends_on:
      - postgres
      - redis
    env_file:
      - ./.envs/.production/.django
      - ./.envs/.production/.postgres
    command: /start

And then we would need to add a corresponding step to log in to the chosen registry on all CI tools, before the step that executes docker-compose build.
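As a sketch of that login step on GitHub Actions (ghcr.io and the credential names are illustrative assumptions; Docker Hub would use the same action with its own registry and secrets):

```yaml
# Hypothetical workflow steps: authenticate to the registry, then build.
- name: Log in to GitHub Container Registry
  uses: docker/login-action@v2
  with:
    registry: ghcr.io
    username: ${{ github.actor }}
    password: ${{ secrets.GITHUB_TOKEN }}
- name: Build images
  run: docker-compose -f production.yml build
```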

The reason this reduces the time is that, before the images are built, Docker Compose first checks the image registry to see which image layers can be reused. This is because of cache_from, which defaults to the local image store if not set: it uses the cache manifest from the specified registry and rebuilds only from the "busted" layers onward.

Also, in the current form of the compose files, only 1 image is built for django, celeryworker, celerybeat, and flower. If you run docker image ls, all 4 images have the same ID; the 4 entries are just Docker's way of saying that the same image is referenced in more than one 'repository'.

Another option would be to use docker buildx build to build django and delegate building the other images to Compose. This would not require any changes to the compose file. The build command, however, would need to cache-from and cache-to either the registry or the local CI tool's cache.
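A hedged sketch of what that buildx invocation could look like (the OWNER/repository names and tags are placeholders, and exporting a cache to a registry requires the docker-container buildx driver):

```
# Build the django image with a registry-backed layer cache.
docker buildx build \
  --cache-from=type=registry,ref=ghcr.io/OWNER/myproject_production_django:cache \
  --cache-to=type=registry,ref=ghcr.io/OWNER/myproject_production_django:cache,mode=max \
  -f ./compose/production/django/Dockerfile \
  -t myproject_production_django .
```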

@Andrew-Chen-Wang (Contributor, Author) commented Sep 23, 2020

@arnav13081994 I didn't actually want to use Docker at all, since most of the time was spent installing OS/distro packages and pip installing (remember that downloading the whole Docker image and then using it as a cache still takes time — and if you updated your dependencies, you'll end up having to pip install from PyPI again anyway).

So what ends up happening is:

  1. Download your image (which requires an external service — GitHub Packages doesn't give you much space, and Docker Hub makes you pay for more than a few private repos, IIRC). This means waiting 15-20 seconds to download ~500 MB of data.
  2. Start the Docker build: install apt packages if there were any changes. That's another 20 seconds if anything changed — not often, but it happens.
  3. Pip install the Python packages. If dependencies changed, you're waiting approx. 90 seconds.

Now that there's a docs service, that's another REALLY long build. So...

Ref #2637 or https://github.com/Donate-Anything/Donate-Anything for a successful implementation of completely disregarding Docker. The Celery script is copied straight from Celery's Travis CI, which Drew helpfully pointed out to me. When you use a CI's cache, the download is really quick; then, when pip installing, it uses that cache rather than installing from PyPI — which would otherwise take about 45 seconds, since some packages need to build (e.g. psycopg2).

@arnav13081994 (Contributor) commented Sep 24, 2020

@Andrew-Chen-Wang Very good point! I experienced the same issue with GitHub's cache action, which installs the cache locally — and GitHub allows one repo up to 5 GB of cache. Funny enough, all the time saved in image building was lost exporting the cache after the build succeeded! Which is why I ultimately decided to use a registry, and it still takes at least 3 minutes.

Thanks for pointing to that resource. I'll check it out.

@Andrew-Chen-Wang (Contributor, Author)

@arnav13081994 No problem! Yeah, take a look at the PR and my repo. Tests for me take only 1.5 minutes total, so the PR definitely helps.
