Homepage: https://asplos-conference.org/2024/
- SpotServe: Serving Generative Large Language Models on Preemptible Instances [Personal Notes] [Paper] [Code]
  - Distributed LLM serving system on preemptible/spot instances
  - Techniques
    - Dynamically adapt the LLM parallelization configuration
    - Minimize the cost of migrating instances for dynamic re-parallelization
    - Stateful inference recovery
- Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling [Paper]
  - UMass-Amherst & Nokia Bell Labs
- Heet: Accelerating Elastic Training in Heterogeneous Deep Learning Clusters
  - UMacau