
What is the reason for using only one GPU when integrating with vLLM's LLM? #3114

Closed
@spencergotowork

Description

At this line of the code, a single GPU device is specified when vLLM is used. In practice, however, it is quite common to run a single vLLM instance across multiple GPUs.

  1. Why is the code designed to select only a single GPU?
  2. Where does the 'device' parameter of this LLM interface eventually get passed to? When I stepped into the function, I couldn't find where this parameter is handled (this might be a very basic question).
  3. When I replaced the 'device' parameter with tensor_parallel_size (and also set world_size and the other related parameters), an error occurred; see the sketch after this list for the kind of usage I mean.
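For reference, here is a minimal sketch of the multi-GPU usage I have in mind, written against vLLM's public LLM API. The model name, the GPU count, and the commented-out single-device line are only placeholders for illustration, not the actual code in this repository:

```python
from vllm import LLM, SamplingParams

# Roughly what I understand the current code to do: pin the whole engine
# to a single device (a placeholder guess, not the repository's actual call).
# llm = LLM(model="Qwen/Qwen2.5-7B-Instruct", device="cuda:0")

# What I would like to do instead: one vLLM instance sharded across
# several GPUs via tensor parallelism; vLLM handles placement itself.
llm = LLM(
    model="Qwen/Qwen2.5-7B-Instruct",  # placeholder model
    tensor_parallel_size=2,            # shard the model across 2 GPUs
)

outputs = llm.generate(
    ["Why is only one GPU used here?"],
    SamplingParams(max_tokens=32),
)
print(outputs[0].outputs[0].text)
```

With tensor_parallel_size set, vLLM decides the device placement itself, so a single per-instance device string no longer describes where the model lives.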

I've noticed that some other PRs have modified how vLLM uses multiple GPUs, but not at the interface where LLM is constructed. I'm curious about the reasons behind this.

If anyone is willing to answer me, I would be very grateful.

    Labels

    ❓ question: Seeking clarification or more information
    🏋 GRPO: Related to GRPO
