ASPLOS 2024

Meta Info

Homepage: https://asplos-conference.org/2024/

Papers

LLM Inference

  • SpotServe: Serving Generative Large Language Models on Preemptible Instances [Personal Notes] [Paper] [Code]
    • CMU & PKU & CUHK
    • Distributed LLM serving system on preemptible/spot instances
    • Techniques (see the sketch after this entry)
      • Dynamically adapt the LLM parallelization configuration
      • Minimize the cost of migrating instances for dynamic re-parallelization
      • Stateful inference recovery
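
As a rough illustration of the first two techniques, the sketch below re-plans the parallel configuration after a preemption and reuses surviving shard placements to keep migration cheap. It is a minimal sketch under assumed helpers and a toy cost heuristic (choose_config, plan_migration, greedy shard reuse); these names and the candidate degrees are illustrative, not SpotServe's actual algorithm or API.

```python
# Minimal, simplified sketch of dynamic re-parallelization on spot instances.
# choose_config, plan_migration, and the greedy reuse heuristic are
# illustrative assumptions, not SpotServe's actual implementation.
from itertools import product


def choose_config(num_gpus, candidate_degrees=(1, 2, 4, 8)):
    """Pick (pipeline, tensor) parallel degrees that fit the surviving GPUs.

    The serving system re-evaluates its parallelization plan whenever spot
    instances are added or preempted; here we simply take the feasible
    configuration that uses the most GPUs, preferring wider tensor parallelism.
    """
    best = None
    for pp, tp in product(candidate_degrees, repeat=2):
        if pp * tp <= num_gpus:
            key = (pp * tp, tp)  # maximize utilization, then tensor width
            if best is None or key > best[0]:
                best = (key, (pp, tp))
    return best[1] if best else None


def plan_migration(old_placement, new_gpus, num_shards):
    """Greedily keep shards on GPUs that already hold them to cut migration cost.

    old_placement: dict shard_id -> gpu_id from the previous configuration.
    Returns dict shard_id -> gpu_id for the new configuration; a greedy reuse
    heuristic stands in for the paper's cost-minimizing assignment.
    Assumes num_shards <= len(new_gpus).
    """
    placement, free_gpus = {}, list(new_gpus)
    # First pass: keep a shard in place if its old GPU survived.
    for shard in range(num_shards):
        gpu = old_placement.get(shard)
        if gpu in free_gpus:
            placement[shard] = gpu
            free_gpus.remove(gpu)
    # Second pass: assign the remaining shards to any free GPU (data moves).
    for shard in range(num_shards):
        if shard not in placement:
            placement[shard] = free_gpus.pop()
    return placement


if __name__ == "__main__":
    # 8 GPUs, one gets preempted -> re-plan for 7 and reuse surviving shards.
    old = {s: s for s in range(8)}               # shard s lived on GPU s
    surviving = [g for g in range(8) if g != 3]  # GPU 3 was preempted
    pp, tp = choose_config(len(surviving))
    print("new (pipeline, tensor) degrees:", (pp, tp))
    print("migration plan:", plan_migration(old, surviving, pp * tp))
```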

Model Serving

  • Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling [Paper]
    • UMass-Amherst & Nokia Bell Labs

Elastic Training

  • Heet: Accelerating Elastic Training in Heterogeneous Deep Learning Clusters
    • UMacau