Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Misc] Various simplifications and typing fixes #5368

Merged
merged 3 commits into from
Jun 11, 2024

Conversation

njhill
Copy link
Member

@njhill njhill commented Jun 9, 2024

Noticed while working on other features, thought it would be cleaner to split into a separate PR.

Noticed while working on other features.
dim=0,
dtype=query_start_loc.dtype,
out=query_start_loc[1:])

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tensors aren't used in the flashinfer case

Copy link
Member

@DarkLight1337 DarkLight1337 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I approved too early. Looks like you broke the speculative decoding code.

@njhill
Copy link
Member Author

njhill commented Jun 10, 2024

Thanks @DarkLight1337! It was a small mistake, have now pushed a fix.

@DarkLight1337 DarkLight1337 merged commit a008629 into vllm-project:main Jun 11, 2024
103 checks passed
@njhill njhill deleted the some-cleanup branch June 11, 2024 02:30
tjohnson31415 added a commit to tjohnson31415/vllm that referenced this pull request Jun 11, 2024
* upstream/main: (126 commits)
  [Bugfix][Frontend] Cleanup "fix chat logprobs" (vllm-project#5026)
  [Bugfix] OpenAI entrypoint limits logprobs while ignoring server defined --max-logprobs (vllm-project#5312)
  [Misc] Various simplifications and typing fixes (vllm-project#5368)
  [ci] Fix Buildkite agent path (vllm-project#5392)
  [Doc] Add documentation for FP8 W8A8 (vllm-project#5388)
  Bump version to v0.5.0 (vllm-project#5384)
  [Docs] Alphabetically sort sponsors (vllm-project#5386)
  [Docs] Add Docs on Limitations of VLM Support (vllm-project#5383)
  [ci] Mount buildkite agent on Docker container to upload benchmark results (vllm-project#5330)
  [ci] Use small_cpu_queue for doc build (vllm-project#5331)
  [Bugfix] Fix LLaVA-NeXT (vllm-project#5380)
  [Feature][Frontend]:  Continued `stream_options` implementation also in CompletionRequest (vllm-project#5319)
  [Model] Initial support for LLaVA-NeXT (vllm-project#4199)
  [Misc] Improve error message when LoRA parsing fails (vllm-project#5194)
  [misc][typo] fix typo (vllm-project#5372)
  [Frontend][Misc] Enforce Pixel Values as Input Type for VLMs in API Server (vllm-project#5374)
  [Misc] Update to comply with the new `compressed-tensors` config (vllm-project#5350)
  [Bugfix] Fix KeyError: 1 When Using LoRA adapters (vllm-project#5164)
  [Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (vllm-project#5047)
  [mis][ci/test] fix flaky test in test_sharded_state_loader.py (vllm-project#5361)
  ...
joerunde pushed a commit to joerunde/vllm that referenced this pull request Jun 17, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Jun 27, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Jul 8, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Jul 24, 2024
Temirulan pushed a commit to Temirulan/vllm-whisper that referenced this pull request Sep 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants