Project Limit Request: vllm-logits - 1 GB #6542

Open
@pshlego

Description

Project URL

https://pypi.org/project/vllm-logits

Does this project already exist?

  • Yes

New limit

1 GB

Update issue title

  • I have updated the title.

Which indexes

PyPI

About the project

vllm-logits is a Python package for high-throughput and memory-efficient inference of large language models (LLMs), extending the vLLM engine with additional support for extracting and manipulating logits during inference.

This package is intended for research use in large-scale multi-GPU settings, and includes custom CUDA/C++ extensions. The project is under active development at POSTECH and used internally for model analysis and token-level logit inspection tasks.

We expect the release size to grow modestly as we support more architectures and optional runtime extensions.

How large is each release?

One Linux wheel is approximately 486 MiB, mainly due to:

  • Compiled CUDA extensions
  • Integrated C++ inference kernels
  • Optional runtime components for token-wise logit output

We have already minimized the package by removing tests, examples, and unused data via MANIFEST.in, but further reductions would break required functionality.
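The pruning described above can be sketched roughly as follows; the directory and file names here are illustrative, not taken from the actual repository:

```
# Illustrative MANIFEST.in: drop non-essential files from the distribution
prune tests
prune examples
recursive-exclude * *.pyc
recursive-exclude * __pycache__
exclude .gitignore
```

Note that MANIFEST.in primarily shapes the source distribution; for wheels, equivalent exclusions would typically live in the package-data configuration of setup.py or pyproject.toml.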

No source tarball or other wheels are included at this point.

How frequently do you make a release?

Roughly once every 3 to 4 months, depending on LLM engine updates.

Code of Conduct

  • I agree to follow the PSF Code of Conduct
