Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce size of Docker containers #1667

Merged
merged 1 commit into from
Mar 7, 2018
Merged

Reduce size of Docker containers #1667

merged 1 commit into from
Mar 7, 2018

Conversation

jameslamb
Copy link
Contributor

@jameslamb jameslamb commented Mar 7, 2018

What do these changes do?

Hello,

Thank you for this excellent project!

I've been reading through this repo trying to familiarize myself with the project, and saw something I may be able to help with. In this PR, I propose a few changes to the Dockerfiles in this repo to reduce container size and build time.

This PR reduces the size of base-deps by about 70 MB.

Proposals related to build time / size:

  • chaining together run commands
    • less build layers --> smaller image
  • removing vim from the dependencies
    • Users who want to run vim in the container to mess with files are welcome to add it in their own spin-off dockerfiles, but I believe the project should make the published container used to run Ray as small as possible

Proposals related to style and administration:

  • Putting dependency installation (e.g. with apt-get) on one line makes the git blame more usable and the Dockerfile more human-readable

Testing

I can confirm I was able to build the deploy and examples containers.

To test, I built the base-deps container as follows and then changed the deploy container to look at ray_new_deps (which it found locally):

docker build -t ray_deps_new -f Dockerfile .

I built deploy and examples using install-docker.sh at the repo root.

Future PR

I also want to propose removing git from these containers (since it isn't used in the build process, AFAICT), but didn't want to put that in this PR since it would involve changing some of the example documentation. If you are open to that change, I will make it in a follow-up PR.

Thank you for your consideration,

-James

Related issue number

This does not relate to any open issues.

@@ -1,14 +1,30 @@
# The deploy Docker image build a self-contained Ray instance suitable
# for end users.
# The base-deps Docker image installs main libraries needed to run Ray
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looked like the old comment on this file had been copy-pasted from the deploy container

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch.

@robertnishihara
Copy link
Collaborator

Thanks @jameslamb this looks good to me!

I think we do use git in the build process, e.g., we git clone Arrow and then checkout specific commits.

cc @jssmith in case you want to take a look.

@jameslamb
Copy link
Contributor Author

@robertnishihara ahhhhhh ok I get it now, I see build.sh (which I know will drop down to build_arrow and build_parquet and other stuff using git) is getting run when I pip install . inside deploy (

subprocess.check_call(["../build.sh", sys.executable])
)

that's a cool trick. Glad I decided to leave git out of this PR!

@robertnishihara
Copy link
Collaborator

Yeah, it feels a bit messy in some ways... inside of setup.py we're running the build script, which in turn runs setup.py for pyarrow..

Anyway, thanks for the help!

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4166/
Test PASSed.

Copy link
Contributor

@jssmith jssmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes all look like sensible cleanup and enhancements me.

@@ -1,14 +1,30 @@
# The deploy Docker image build a self-contained Ray instance suitable
# for end users.
# The base-deps Docker image installs main libraries needed to run Ray
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch.

@robertnishihara robertnishihara merged commit 6dbf4f6 into ray-project:master Mar 7, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants