RFC: TensorFlow Dockerfile Assembler #8

angerson · 2018-07-31T23:52:22Z

Review is now closed for comments

TensorFlow Dockerfile Assembler

Status	Accepted
Author(s)	Austin Anderson (angerson@google.com)
Sponsor	Gunhan Gulsoy (gunan@google.com)
Updated	2018-07-31

Summary

This document describes a new way to manage TensorFlow's Dockerfiles. Instead of handling complexity via an on-demand build script, Dockerfile maintainers manage reusable chunks called partials, which are assembled into documented, standard, committed-to-repo Dockerfiles that don't need extra scripts to build. The assembler is also decoupled from the system that builds and uploads the Docker images, which can be safely handled by separate CI scripts.

Important: This document is slim. The real meat of the design has already
been implemented in this PR to tensorflow/tensorflow.
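
As a rough illustration of the partials idea in the summary, here is a minimal Python sketch that concatenates named partials into complete Dockerfile text according to a spec. It is not the assembler from the linked PR: the partial names, their contents, the spec layout, and the `assemble` / `assemble_all` helpers are all hypothetical and exist only to show the shape of the approach.

```python
# Hypothetical partials: names and contents are illustrative only,
# not the ones used in tensorflow/tensorflow.
PARTIALS = {
    "ubuntu": "FROM ubuntu:16.04\n",
    "python": "RUN apt-get update && apt-get install -y python3 python3-pip\n",
    "jupyter": "RUN pip3 install jupyter\n",
}

# Hypothetical spec: each output Dockerfile lists its partials in order.
SPEC = {
    "cpu.Dockerfile": ["ubuntu", "python"],
    "cpu-jupyter.Dockerfile": ["ubuntu", "python", "jupyter"],
}


def assemble(partial_names, partials):
    """Concatenate the named partials, in order, into one Dockerfile string."""
    missing = [name for name in partial_names if name not in partials]
    if missing:
        raise KeyError(f"unknown partials: {missing}")
    header = "# Generated file; edit the partials instead.\n"
    return header + "\n".join(partials[name] for name in partial_names)


def assemble_all(spec, partials):
    """Map every output filename in the spec to its assembled contents."""
    return {name: assemble(names, partials) for name, names in spec.items()}


if __name__ == "__main__":
    for filename, contents in assemble_all(SPEC, PARTIALS).items():
        print(f"--- {filename} ---")
        print(contents)
```

The output of a step like this would be committed to the repository, so that anyone can run a plain `docker build` on the result without needing the assembler.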

ewilderj · 2018-08-01T19:35:42Z

rfcs/20180731-dockerfile-assembler.md

+
+| Status        | Proposed       |
+:-------------- |:---------------------------------------------------- |
+| **ID**        | <this will be allocated on approval>                 |


This ID line can be deleted; it was left in the template in error. The ID is actually the filename of the RFC. I've fixed the template now.

Fixed, thanks!

ewilderj · 2018-08-01T21:38:07Z

cc @gunan

flx42 · 2018-08-01T23:31:59Z

It would be nice to mention:

- The current list of tags, with image sizes, and current limitations (e.g. the pip package making its way inside the image).
- The new list of tags, with image sizes.

flx42 · 2018-08-01T23:38:31Z

rfcs/20180731-dockerfile-assembler.md

+...which means that you can dynamically set multiple FROM images. My first
+draft used ARGs and FROMs in a single Dockerfile to manipulate build stages.
+[The resulting
+Dockerfile](https://gist.github.com/angersson/3d2b5ae6a01de4064b1c3fe7a56e3821)


The Dockerfile in this gist is truncated.
Also, I'm not sure why ARGs and multi-stage builds need to work together.

ARGs are for templating, allowing you to have a generic Dockerfile.
Multi-stage builds are for avoiding shipping your build dependencies.

So I'm confused; to me they are totally different concepts.

Actually, when I took a quick look at the TF Dockerfiles, I thought multi-stage builds were not a good fit because:

- devel images need to maintain those build dependencies
- non-devel images already install from pip packages and don't have build dependencies (python deps excluded)
- many dependencies are installed through the package manager (apt-get), which doesn't work with multi-stage builds anyway

I think I may have been in the middle of updating the file when I decided to drop that approach. I fixed the line.

I've revised this entire section since your comment (which illustrated that this portion of the doc was poorly explained, thanks) to clarify why I included this explanation. Regular multi-stage builds wouldn't be very useful at all, but the support for multiple FROMs that they added would let us do very evil things that don't work very well in the long run. I included the gist to show how bad such a Dockerfile can get (it's not meant to be useful aside from that).

flx42 · 2018-08-01T23:38:37Z

rfcs/20180731-dockerfile-assembler.md

+
+"Multi-stage Building" is a powerful new Dockerfile feature that supports
+multiple FROM statements in one Dockerfile. It is meant to be used for creating
+artifacts with one image before using those artifacts in another image, but you


I don't understand why you say "but" here. Is the intended meaning "in addition"?

It's a mixture of both -- I revised this whole section, so it should be better now. Thanks!

flx42 · 2018-08-02T02:28:41Z

Is it a requirement of this proposal that the end-user sees full Dockerfiles and can do a single docker build? Is it a common scenario? Do many users rebuild the docker images?

angerson · 2018-08-02T21:52:48Z

@flx42 Thanks for mentioning the tags. This design is meant to be a foundational change that lets us better approach discussions about what-tags-should-we-have; the images I've started with mirror the existing Dockerfiles and aren't trying to make great strides yet.

I've updated the doc to better explain this. I've also revised some sections that you commented on.

> Is it a requirement of this proposal that the end-user sees full Dockerfiles and can do a single docker build? Is it a common scenario? Do many users rebuild the docker images?

I'm not sure how many users want this, but I've seen at least a handful of GitHub issues related to Dockerfile build problems. Because of that, and because of my own preferences when working with Docker (I really like to see a simple Dockerfile, since it's easier to understand if all I want to do is build or learn), committing concrete files like this seems like a pretty good way to handle things.

flx42 · 2018-08-02T21:56:46Z

Thanks! The reason I asked about the "single Dockerfile requirement" in #8 (comment) is that you can achieve the same kind of composition you have now by using multiple small Dockerfiles.

For instance, if you have a generic FROM (templated with an ARG), the Jupyter Dockerfile becomes:

ARG from
FROM ${from}

ARG PIP
RUN ${PIP} install jupyter

This is similar to your approach: you have a generic component that adds Jupyter to an existing image. But the composition of these Dockerfiles is expressed through docker build calls instead of a custom YAML file:

docker build -t nvidia-devel -f Dockerfile.nvidia-devel .
docker build -t nvidia-devel-jupyter-py3 --build-arg from=nvidia-devel --build-arg PIP=pip3 -f Dockerfile.jupyter .

At the CI level, this can be templated too, for instance over the Python version.
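
One way the CI-level templating mentioned above might look is sketched below in Python. It only generates the `docker build` argument lists and does not run Docker. The function name `jupyter_build_commands` and the `py2`/`py3` tag suffixes are assumptions made for illustration; the build-arg names (`from`, `PIP`) follow the `ARG` declarations in the Jupyter Dockerfile above, since `ARG` names are case-sensitive.

```python
import shlex


def jupyter_build_commands(base_tag, base_dockerfile, pip_by_python):
    """Return docker build argv lists: the base image first, then one
    Jupyter image per Python version layered on top of it."""
    commands = [["docker", "build", "-t", base_tag, "-f", base_dockerfile, "."]]
    for python_suffix, pip in pip_by_python.items():
        commands.append([
            "docker", "build",
            "-t", f"{base_tag}-jupyter-{python_suffix}",
            # Build-arg names must match the ARG names in Dockerfile.jupyter.
            "--build-arg", f"from={base_tag}",
            "--build-arg", f"PIP={pip}",
            "-f", "Dockerfile.jupyter",
            ".",
        ])
    return commands


if __name__ == "__main__":
    plan = jupyter_build_commands(
        "nvidia-devel",
        "Dockerfile.nvidia-devel",
        {"py2": "pip", "py3": "pip3"},  # hypothetical suffix -> pip executable
    )
    for argv in plan:
        print(shlex.join(argv))
```

A CI job could pass each list to its own process runner; keeping the commands as data makes the build order explicit and easy to review.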

This doesn't introduce any other format/scripting on top of Docker tools, but you don't have this single generated Dockerfile.

You can still achieve layer sharing by doing the builds in the right order, with a "tree" of Dockerfiles where only the leaves are pushed tags.

base
├── dev
│   ├── cpu-devel
│   └── nvidia-devel
└── nodev
    ├── cpu
    └── nvidia
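
To make "the right order" concrete, the following Python sketch walks the tree above so that every parent image is built before its children, and collects the leaves as the only tags to push. The `TREE` mapping and the `build_plan` helper are hypothetical names for this illustration and are not part of any existing tooling.

```python
# Parent -> children, mirroring the tree above.
TREE = {
    "base": ["dev", "nodev"],
    "dev": ["cpu-devel", "nvidia-devel"],
    "nodev": ["cpu", "nvidia"],
}


def build_plan(tree, root):
    """Walk the tree depth-first, visiting parents before children.

    Returns (build_order, leaves): every image in an order where a parent
    always precedes its children, plus the leaf images that would be pushed.
    """
    build_order, leaves = [], []
    stack = [root]
    while stack:
        node = stack.pop()
        build_order.append(node)
        children = tree.get(node, [])
        if not children:
            leaves.append(node)
        # Reverse so children are visited in their listed order.
        stack.extend(reversed(children))
    return build_order, leaves


if __name__ == "__main__":
    order, pushed = build_plan(TREE, "base")
    print("build order:", order)
    print("pushed tags:", pushed)
```

Intermediate images such as `dev` and `nodev` are built only so that their layers are shared by the leaves.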

angerson · 2018-08-02T22:14:43Z

Ah, right, now I understand. I didn't really think about using multiple files with multiple FROMs. I'll add a new section to the doc that describes that method and some of the tradeoffs for it.

One of the obvious downsides I can see with any process that offloads build complexity to the docker build call is that, by definition, it complicates the process of assembling all of those images. For a developer, that means reading a README, which is fine. For us, internally, it means we'd need to design a build script that pieces everything together for whatever images we want to build, or else have a large list of very similar build chains, which may grow into a helper script anyway. The design I have here front-loads that process and isolates the "what stuff is in this image" work, which I really like.

The two end results would be quite similar, differing only in where the complexity lives (it also looks like the multi-stage builds you've described might build a bit faster, because stages are definitively re-used rather than implicitly re-used as they are here). I've also already written all of the logic for this design, so only maintenance costs factor into the extra effort.

Thanks for suggesting this!

ewilderj · 2018-08-23T17:48:43Z

Thanks everyone for your review and contribution!

Initial version of design

c012600

angerson requested review from ewilderj and martinwicke as code owners July 31, 2018 23:52

angerson mentioned this pull request Jul 31, 2018

Add new Dockerfile assembler based on partials tensorflow/tensorflow#21291

Merged

Cleanup proposal

f3bec26

ewilderj reviewed Aug 1, 2018

Cleanup ID field

27520a5

ewilderj added the RFC: Proposed RFC Design Document label Aug 1, 2018

ewilderj changed the title ~~Add assembled Dockerfiles based on partial files and spec~~ RFC: TensorFlow Dockerfile Assembler Aug 1, 2018

flx42 reviewed Aug 1, 2018

Update info about tags

c251fed

angerson added 2 commits August 2, 2018 14:49

Update with feedback from flx42

3e3302b

Note intent of design

64b9dff

Add discussion on alternate use of staged builds

fc1a5fe

angerson mentioned this pull request Aug 7, 2018

Support python3 on Docker image tensorflow/tensorflow:latest tensorflow/tensorflow#10179

Closed

angerson mentioned this pull request Aug 10, 2018

add debian stretch Dockerfile (CPU) tensorflow/tensorflow#21285

Closed

Marked as accepted

c348a3c

ewilderj added RFC: Accepted RFC Design Document: Accepted by Review and removed RFC: Proposed RFC Design Document labels Aug 23, 2018

ewilderj merged commit a8f3bae into tensorflow:master Aug 23, 2018
