Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: remove docker-compose from det deploy local [MLG-37] #6386

Merged

Conversation

azhou-determined
Copy link
Contributor

@azhou-determined azhou-determined commented Mar 29, 2023

Description

replace docker-compose dependency (in det deploy local) with docker run, which is already used to spin up agents.

Test Plan

[ ] det deploy local cluster-up/down
[ ] det deploy local master-up/down

Migration path:
[ ] det deploy local cluster-up before this change, then det deploy local cluster-up after this change. DB should be persisted unless delete-db was specified.

Commentary (optional)

Checklist

  • Changes have been manually QA'd
  • User-facing API changes need the "User-facing API Change" label.
  • Release notes should be added as a separate file under docs/release-notes/.
    See Release Note for details.
  • Licenses should be included for new code which was copied and/or modified from any external code.

Ticket

harness/setup.py Outdated
"tqdm",
"appdirs",
# docker-compose has a requirement not properly propagated with semi-old pip installations;
# so we expose that requirement here.
"websocket-client<1",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you might consider the the whole comment and associated requirement, rather than just the line containing the text docker-compose lol

@azhou-determined azhou-determined force-pushed the remove-docker-compose branch 2 times, most recently from 14aa83b to af02758 Compare March 29, 2023 23:55
Comment on lines +256 to +261
Arg(
"--image-repo-prefix",
type=str,
default="determinedai",
help="prefix for the master image",
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of adding this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. cluster-up and agent-up both have these, and 2) the method this CLI calls handle_master_up actually takes in args.image_repo_prefix even though we never let it be specified.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this cli is a public interface.

Do we want to offer an --image-repo-prefix? Why not just offer e.g. a --master-image and an --agent-image or something more direct? Or not add it to the public interface if we haven't thought out fully?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, i don't disagree that master-image and agent-image might be better. my reasoning for this was simply to make it more consistent with what the API already does. we already take in image-repo-prefix for the other CLI calls, and it seemed like a bug that we didn't have this here. before, we still technically accepted the argument, we would just default inside of the method, making the defaults not visible and not configurable by the user. i can remove it if you think it's not an overall gain.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll trust your judgement on this.

harness/setup.py Outdated
"lomond>=0.3.3",
"pathspec>=0.6.0",
"pyyaml",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is this from? Don't we always use ruamel.yaml?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope. do a global search for import yaml, you'll find a ton of references to pyyaml (yaml).

Copy link
Contributor

@rb-determined-ai rb-determined-ai Apr 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you just fix those instead? The pyyaml library has issues, and we set up a better solution that is much more widely used:

rb@bearasaurusrex ~/code/determined/harness/determined (git)-[master]
% git grep 'common.*import.*yaml'
cli/cli.py:from determined.common import api, yaml
cli/command.py:from determined.common import api, context, util, yaml
cli/experiment.py:from determined.common import api, context, set_logger, util, yaml
cli/master.py:from determined.common import api, yaml
cli/render.py:from determined.common import util, yaml
cli/template.py:from determined.common import api, util, yaml
common/experimental/determined.py:from determined.common import api, context, util, yaml
common/util.py:from determined.common import yaml
deploy/gke/cli.py:from determined.common import yaml
deploy/gke/cli.py:from determined.common.util import safe_load_yaml_with_exceptions
deploy/local/cluster_utils.py:from determined.common import yaml

vs

rb@bearasaurusrex ~/code/determined/harness/determined (git)-[master]
% git grep '^import yaml'
cli/job.py:import yaml
deploy/aws/gen_vcpu_mapping.py:import yaml
deploy/aws/preflight.py:import yaml

Copy link
Contributor

@rb-determined-ai rb-determined-ai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! New det deploy local operates perfectly on old-det deploy local-deployed cluster. Users are unlikely to even notice.

I'm still suspicious of that pyyaml dependency though.

@@ -29,6 +29,7 @@
"filelock",
"google-cloud-storage",
"hdfs>=2.2.2",
"jsonschema",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so to be clear, this is something we already depended on, indirectly, only now we are less strict about the version overall, and also it's a dependency we're going to remove in the nearish future, right?

Copy link
Contributor

@rb-determined-ai rb-determined-ai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so long overdue!

@azhou-determined azhou-determined changed the title chore: remove docker-compose dependency in det deploy local [MLG-37] chore: remove docker-compose from det deploy local [MLG-37] Apr 5, 2023
@azhou-determined azhou-determined merged commit ae5b933 into determined-ai:master Apr 5, 2023
azhou-determined added a commit that referenced this pull request Apr 5, 2023
@dannysauer dannysauer added this to the 0.21.1 milestone Feb 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants