Ray Executor

In addition to accelerator support (e.g. via #304), Cubed could benefit ML users by providing a [Ray](https://github.com/ray-project/ray) executor. See the Ray Core walkthrough: https://docs.ray.io/en/latest/ray-core/walkthrough.html

Since Cubed follows a serverless model, I bet it could get away with using only [Tasks/remote functions](https://docs.ray.io/en/latest/ray-core/tasks.html#ray-remote-functions).
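
To make the "tasks only" idea concrete, here is a rough sketch of running one stage of independent tasks as Ray remote functions. It is not Cubed's executor interface: `process_chunk` and `run_stage_with_ray` are hypothetical names made up for illustration, and the sketch assumes each task depends only on its own input, as in a serverless model.

```python
from typing import Callable, Iterable, List


def process_chunk(chunk_id: int) -> int:
    """Hypothetical stand-in for one task (e.g. read a chunk, compute, write it back)."""
    return chunk_id * chunk_id


def run_stage_with_ray(func: Callable[[int], int], inputs: Iterable[int]) -> List[int]:
    """Run one stage of independent tasks as Ray remote functions (assumed design)."""
    import ray  # imported lazily so this module loads even without Ray installed

    # Start a local Ray instance, or reuse one that is already running.
    ray.init(ignore_reinit_error=True)

    # Equivalent to decorating func with @ray.remote.
    remote_func = ray.remote(func)

    # Submit one stateless task per input; each call returns a future immediately.
    futures = [remote_func.remote(i) for i in inputs]

    # Block until every task in the stage has finished, preserving input order.
    return ray.get(futures)
```

With Ray installed, `run_stage_with_ray(process_chunk, range(4))` would submit four tasks and collect their results. No actors or shared state are involved, which is why plain remote functions might be enough here.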

From talking with @cromwellian a bit, my hope is that Cubed could provide memory bounds when trying to saturate GPUs during model training. I'm not sure what a training loop with Cubed would look like. For reference, here's how Ray integrates with PyTorch: https://docs.ray.io/en/latest/train/api/doc/ray.train.torch.TorchTrainer.html#ray.train.torch.TorchTrainer

@shoyer once pointed out to me that GPU OOM errors occur while taking the gradient of a function graph, not necessarily on the forward pass. I'm not sure yet whether Cubed is in fact a good fit for tackling this problem, only that the potential is exciting.


Ray Executor #488
