Exporting an environment #45

@limx0

Is it worth adding a couple of util scripts that would export the user's current environment (conda or virtualenv) and build a Docker image on top of daskdev/dask:latest?

Being able to pass a list of conda/pip packages is fine for relatively simple environments and prototyping, but I can see value in something slightly more stable. Building a new image (which shouldn't need to be done too often) will increase the connection time to the KubeCluster, but will reduce the worker start-up time.

I have a basic POC of this that I am currently using. The workflow looks something like this:

  • In JupyterLab, build my conda env and validate it locally in a notebook
  • Export the environment
  • Build a Docker image on top of daskdev/dask:latest, using something like:
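import pathlib
import tempfile

import docker
from dask_kubernetes import KubeCluster

DOCKER_HUB_REPO = 'myrepo'  # placeholder: your Docker Hub user or organisation
client = docker.from_env()

dockerfile_template = (
    'FROM daskdev/dask:latest\n'
    'ADD {environment_file} /opt/app/environment.yml\n'
    'RUN /opt/conda/bin/conda env update -n dask -f /opt/app/environment.yml && \\\n'
    '    conda clean -tipsy'
)


def build_publish_dockerfile(context_dir, dockerfile_txt, tag):
    # Write the Dockerfile into the build context so ADD can find the env file
    pathlib.Path(context_dir).joinpath('dockerfile').write_text(dockerfile_txt)
    repository = '%s/%s' % (DOCKER_HUB_REPO, tag)
    client.images.build(
        path=str(context_dir), dockerfile='dockerfile', tag=repository, nocache=True
    )
    # Publish the image so the cluster can pull it
    client.images.push(repository)
    return repository


def image_from_conda_env(env_name, tag, conda_bin='conda'):
    with tempfile.TemporaryDirectory() as tmp_dir:
        env_file = pathlib.Path(tmp_dir).joinpath('environment.yml')
        export_conda_env(env_name, env_file, conda_bin)  # helper defined elsewhere
        # ADD paths are relative to the build context, so pass only the file name
        dockerfile = dockerfile_template.format(environment_file=env_file.name)
        return build_publish_dockerfile(tmp_dir, dockerfile_txt=dockerfile, tag=tag)


image = image_from_conda_env('myenv', 'dask-worker-myenv')

k = KubeCluster(image=image)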
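
The `export_conda_env` helper called above is not shown in the snippet. A minimal sketch of what it might look like, assuming it simply shells out to `conda env export -n <name>` and writes the result to a file. The `run` parameter and the stripping of the `prefix:` line are my own additions for illustration, not part of the POC:

```python
import pathlib
import subprocess


def export_conda_env(env_name, env_file, conda_bin='conda', run=subprocess.run):
    """Write the named conda environment to env_file as YAML.

    `run` is a hypothetical hook (defaulting to subprocess.run) so the
    function can be exercised without conda installed.
    """
    # `conda env export -n NAME` prints the environment spec to stdout
    result = run(
        [conda_bin, 'env', 'export', '-n', env_name],
        check=True,
        capture_output=True,
        text=True,
    )
    # Drop the machine-specific `prefix:` line; it has no meaning inside the image
    lines = [ln for ln in result.stdout.splitlines() if not ln.startswith('prefix:')]
    env_path = pathlib.Path(env_file)
    env_path.write_text('\n'.join(lines) + '\n')
    return env_path

# Example (requires conda on PATH):
#   export_conda_env('myenv', 'environment.yml')
```

Note that an exported environment pins platform-specific builds, so an export taken on macOS or Windows may not resolve inside the Linux image; `conda env export --no-builds` loosens this somewhat.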

Is this in the works? Or any thoughts on the above?
