Skip to content

RFC: TensorFlow Dockerfile Assembler #8

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Aug 23, 2018
Merged

Conversation

angerson
Copy link
Contributor

@angerson angerson commented Jul 31, 2018

Review is now closed for comments

TensorFlow Dockerfile Assembler

Status Accepted
Author(s) Austin Anderson (angerson@google.com)
Sponsor Gunhan Gulsoy (gunan@google.com)
Updated 2018-07-31

Summary

This document describes a new way to manage TensorFlow's dockerfiles. Instead of handling complexity via an on-demand build script, Dockerfile maintainers manage re-usable chunks called partials which are assembled into documented, standard, committed-to-repo Dockerfiles that don't need extra scripts to build. It is also decoupled from the system that builds and uploads the Docker images, which can be safely handled by separate CI scripts.

Important: This document is slim. The real meat of the design has already
been implemented in this PR to tensorflow/tensorflow.


| Status | Proposed |
:-------------- |:---------------------------------------------------- |
| **ID** | <this will be allocated on approval> |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ID line can be deleted, it's left in the template in error. The ID is actually the filename of the RFC. I've fixed the template now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks!

@ewilderj ewilderj added the RFC: Proposed RFC Design Document label Aug 1, 2018
@ewilderj ewilderj changed the title Add assembled Dockerfiles based on partial files and spec RFC: TensorFlow Dockerfile Assembler Aug 1, 2018
@ewilderj
Copy link
Contributor

ewilderj commented Aug 1, 2018

cc @gunan

@flx42
Copy link

flx42 commented Aug 1, 2018

It would be nice to mention:

  • The current list of tags, with image sizes, and current limitations (e.g. the pip package making its way inside the image).
  • The new list of tags, with image sizes

...which means that you can dynamically set multiple FROM images. My first
draft used ARGs and FROMs in a single Dockerfile to manipulate build stages.
[The resulting
Dockerfile](https://gist.github.com/angersson/3d2b5ae6a01de4064b1c3fe7a56e3821)
Copy link

@flx42 flx42 Aug 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Dockerfile in this gist is truncated.
Also, I'm not sure why ARGs and multi-stage builds need to work together.

ARGs is for templating, allowing you to have a generic Dockerfile.
Multi-stage build is to avoid shipping with your build dependencies.

So I'm confused, for me they are totally different concepts.

Actually, when I took a quick look at the TF Dockerfiles, I thought multi-stage builds were not a good fit because:

  • devel images need to maintain those build dependencies
  • non-devel images already install from pip packages and don't have build dependencies (python deps excluded)
  • many dependencies are installed through the package manager (apt-get) which doesn't work with multi-stage builds anyway

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I may have been in the middle of updating the file when I decided to drop that approach. I fixed the line.

I've revised this entire section since your comment (which illustrated that this portion of the doc was poorly explained, thanks) to clarify why I included this explanation. Regular multi-stage builds wouldn't be very useful at all, but the support for multiple FROMs that they added would let us do very evil things that don't work very well in the long run. I included the gist to show how bad such a Dockerfile can get (it's not meant to be useful aside from that).


"Multi-stage Building" is a powerful new Dockerfile feature that supports
multiple FROM statements in one Dockerfile. It is meant to be used for creating
artifacts with one image before using those artifacts in another image, but you
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why you say "but" here. Is the intended meaning "in addition"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a mixture of both -- I revised this whole section, so it should be better now. Thanks!

@flx42
Copy link

flx42 commented Aug 2, 2018

Is it a requirement of this proposal that the end-user sees full Dockerfiles and can do a single docker build? Is it a common scenario? Do many users rebuild the docker images?

@angerson
Copy link
Contributor Author

angerson commented Aug 2, 2018

@flx42 Thanks for mentioning the tags. This design is meant to be a foundational change that lets us better approach discussions about what-tags-should-we-have; the images I've started with mirror the existing Dockerfiles and aren't trying to make great strides yet.

I've updated the doc to better explain this. I've also revised some sections that you commented on.

Is it a requirement of this proposal that the end-user sees full Dockerfiles and can do a single docker build? Is it a common scenario? Do many users rebuild the docker images?

I'm not sure how many users want this, but I've seen at least a handful of GitHub issues related to Dockerfile build issues -- because of that, and my own preferences working with Docker (I really like to see a simple Dockerfile, because it's easier to understand if all I want to do is build or learn), committing concrete files like this seems like a pretty good way to handle things.

@flx42
Copy link

flx42 commented Aug 2, 2018

Thanks! Why I'm asking about the "single Dockerfile requirement" in #8 (comment) is because you can achieve the same kind of composition that you have with multiple small Dockerfiles.

For instance, if you have a generic FROM (templated with an ARG), the Jupyter Dockerfile becomes:

ARG from
FROM ${from}

ARG PIP
RUN ${PIP} install jupyter

This is similar to your approach, you have a generic component that adds Jupyter to an existing image. But the composition of these Dockerfiles is expressed through docker build calls instead of a custom yaml:

docker build -t nvidia-devel -f Dockerfile.nvidia-devel .
docker build -t nvidia-devel-jupyter-py3 --build-arg from=nvidia-devel --build-arg pip=pip3 -f Dockerfile.jupyter .

At the CI level, this can be templated too, for the python version for instance.

This doesn't introduce any other format/scripting on top of Docker tools, but you don't have this single generated Dockerfile.

You can still achieve layer sharing by doing the builds in the right order, with a "tree" of Dockerfiles were only the leaves are pushed tags.

base
├── dev
│   ├── cpu-devel
│   └── nvidia-devel
└── nodev
    ├── cpu
    └── nvidia

@angerson
Copy link
Contributor Author

angerson commented Aug 2, 2018

Ah, right, now I understand. I didn't really think about using multiple files with multiple FROMs. I'll add a new section to the doc that describes that method and some of the tradeoffs for it.

One of the obvious downsides I can see with any process that offloads build complexity to the docker build call is that, by definition, it complicates the process of assembling all of those images. For a developer, that means reading a README, which is fine. For us, internally, it means we'd need to design a build script that pieces everything together for whatever images we want to build, or else have a large list of very similar build chains, which may grow into a helper script anyway. The design I have here front-loads that process and isolates the "what stuff is in this image" work, which I really like.

The two end results would be quite similar, only different in where complexity lives (and it also looks like the multi-stage builds you've described might build a bit faster because of definitive re-use of stages compared to the implied re-use here). I've also already written all of the logic for this design, so only maintenance costs factor in to the extra effort.

Thanks for suggesting this!

@ewilderj ewilderj added RFC: Accepted RFC Design Document: Accepted by Review and removed RFC: Proposed RFC Design Document labels Aug 23, 2018
@ewilderj
Copy link
Contributor

Thanks everyone for your review and contribution!

@ewilderj ewilderj merged commit a8f3bae into tensorflow:master Aug 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
RFC: Accepted RFC Design Document: Accepted by Review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants