400-bug report.yml
Name: 🐛 Bug report
About: Raise an issue here if you find a bug.
Labels: bug
Assignees: (none)

Before submitting an issue, please make sure it hasn't already been addressed by searching through the existing and past issues.

Please run the following and paste the output below.

```bash
wget https://raw.githubusercontent.com/vllm-project/vllm/main/collect_env.py
# For security purposes, please feel free to check the contents of collect_env.py before running it.
python collect_env.py
```

Please download and run the latest version of the script, as vLLM may frequently update the diagnostic information needed to respond to issues accurately and quickly.

If you are seeing crashes due to illegal memory access or other issues with model execution, vLLM may dump the problematic model input. In that case, you will see the message Error in model execution (input dumped to /tmp/err_xxx.pkl). If you see this message, please zip the file (GitHub doesn't support the .pkl file format) and upload it here. This will help us reproduce the issue and speed up debugging.
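For example, the dump could be packaged like this (a minimal sketch; /tmp/err_xxx.pkl stands in for the exact path printed in your own error message, and the archive name is arbitrary):

```bash
# Zip the dumped model input so it can be attached to the GitHub issue.
# Replace /tmp/err_xxx.pkl with the exact path from your error message.
zip err_model_input.zip /tmp/err_xxx.pkl
```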

Please provide a clear and concise description of what the bug is.

If relevant, add a minimal example so that we can reproduce the error by running the code. It is very important that the snippet be as succinct (minimal) as possible, so please take the time to trim any irrelevant code to help us debug efficiently. We will copy-paste your code and expect to get the same result as you did: avoid any external data and include all relevant imports. For example:

```python
from vllm import LLM, SamplingParams

prompts = [
    "Hello, my name is",
    "The president of the United States is",
    "The capital of France is",
    "The future of AI is",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

llm = LLM(model="facebook/opt-125m")

outputs = llm.generate(prompts, sampling_params)

# Print the outputs.
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
```

If the code is too long (hopefully, it isn't), feel free to put it in a public gist and link it in the issue: https://gist.github.com.

Please also paste or describe the results you observe instead of the expected results. If you observe an error, please paste the error message, including the full traceback of the exception. It is helpful to wrap error messages in triple-backtick code blocks (```).

Please set the environment variable export VLLM_LOGGING_LEVEL=DEBUG to turn on more logging to help debug potential issues.

If you experience crashes or hangs, it is also helpful to run vLLM with export VLLM_TRACE_FUNCTION=1. All function calls in vLLM will then be recorded; inspect these log files and report which function crashes or hangs.
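For example, both debugging switches can be enabled before reproducing the issue (a minimal sketch; the last command is a placeholder for however you normally launch vLLM, whether a server or an offline script):

```bash
# Turn on verbose logging and record all vLLM function calls for this shell session.
export VLLM_LOGGING_LEVEL=DEBUG
export VLLM_TRACE_FUNCTION=1
# Placeholder: start vLLM the way you normally do and reproduce the issue.
python your_vllm_script.py
```

Since function tracing records every call, it is best enabled only while reproducing the issue.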

⚠️ Please separate bugs in the Transformers implementation or usage from bugs in vLLM. If you think something is wrong with a model's output:

  • Try the Transformers counterpart first. If the error also appears there, please report it in the Transformers issue tracker.
  • If the error only appears in vLLM, please provide a detailed script showing how you run both Transformers and vLLM, highlight the difference, and describe what output you expect (a minimal comparison sketch follows this list).
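For example, a side-by-side comparison could look like the following (a minimal sketch assuming facebook/opt-125m and greedy decoding; swap in the model, prompt, and settings you are actually reporting):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from vllm import LLM, SamplingParams

prompt = "The capital of France is"
model_name = "facebook/opt-125m"  # placeholder: use the model you are reporting

# Reference output from Hugging Face Transformers (greedy decoding).
tokenizer = AutoTokenizer.from_pretrained(model_name)
hf_model = AutoModelForCausalLM.from_pretrained(model_name)
hf_ids = hf_model.generate(**tokenizer(prompt, return_tensors="pt"),
                           max_new_tokens=16, do_sample=False)
hf_text = tokenizer.decode(hf_ids[0], skip_special_tokens=True)

# Output from vLLM with equivalent settings (temperature=0 gives greedy decoding).
llm = LLM(model=model_name)
vllm_out = llm.generate([prompt], SamplingParams(temperature=0, max_tokens=16))
vllm_text = prompt + vllm_out[0].outputs[0].text

# Highlight the difference between the two outputs in your report.
print(f"Transformers: {hf_text!r}")
print(f"vLLM:         {vllm_text!r}")
```

If the Transformers output already looks wrong, the bug most likely belongs in the Transformers issue tracker.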
Thanks for contributing 🎉!
Before submitting a new issue...