Support job submission #2786

@zhijunfu

  • Currently, Ray requires users to log into the cluster to submit their code. It would be more convenient to let them manage their jobs through a web portal.
  • Remote execution of actors/tasks defined in separate files sometimes fails because their dependencies are not carried over. A general way to support this is to deploy the user code and its dependencies to all the nodes in the cluster. There's a previous discussion on this here.

We propose to let users submit, cancel, and get the status of a job from a web portal, and to automatically deploy the code and dependencies for a job at submission time. To implement this, we plan to add an HTTP server that runs inside the Ray cluster and an agent that runs on each node. Some of the issues have already been discussed in this issue.

More details are listed in this proposal.
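
To make the submit/cancel/status surface concrete, here is a minimal sketch of how the HTTP server's request handling might look. It is not taken from the proposal: the route layout (`/jobs`, `/jobs/<id>`), the `JobRegistry` and `handle_request` names, the `code_uri` and `dependencies` fields, and the job states are all hypothetical, and the in-memory table stands in for whatever storage and per-node agent communication the real design would use.

```python
import itertools
from enum import Enum


class JobState(str, Enum):
    # Hypothetical job states; the proposal does not enumerate them.
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    CANCELLED = "CANCELLED"
    FINISHED = "FINISHED"


class JobRegistry:
    """In-memory job table (assumption: a real server would persist this
    and ask the per-node agents to deploy code and dependencies)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._jobs = {}

    def submit(self, code_uri, dependencies=()):
        # Record the job; deployment to the nodes is out of scope here.
        job_id = f"job-{next(self._ids)}"
        self._jobs[job_id] = {
            "code_uri": code_uri,
            "dependencies": list(dependencies),
            "state": JobState.PENDING,
        }
        return job_id

    def status(self, job_id):
        # Raises KeyError for an unknown job id.
        return self._jobs[job_id]["state"]

    def cancel(self, job_id):
        # Only jobs that have not finished can be cancelled.
        job = self._jobs[job_id]
        if job["state"] in (JobState.PENDING, JobState.RUNNING):
            job["state"] = JobState.CANCELLED
        return job["state"]


def handle_request(registry, method, path, body=None):
    """Map an HTTP-style request to a registry call.

    Returns a (status_code, payload) pair so it can be wired into any
    HTTP framework, or tested without opening a socket.
    """
    parts = [p for p in path.split("/") if p]
    if method == "POST" and parts == ["jobs"]:
        if not body or "code_uri" not in body:
            return 400, {"error": "code_uri is required"}
        job_id = registry.submit(body["code_uri"], body.get("dependencies", ()))
        return 201, {"job_id": job_id}
    if len(parts) == 2 and parts[0] == "jobs" and method in ("GET", "DELETE"):
        try:
            if method == "GET":
                state = registry.status(parts[1])
            else:
                state = registry.cancel(parts[1])
        except KeyError:
            return 404, {"error": "unknown job"}
        return 200, {"state": state.value}
    return 404, {"error": "no such route"}
```

Keeping the routing in a plain function separates the job lifecycle from the transport, so the same logic could sit behind a web portal or a command-line client.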
