Project Limit Request: vllm-logits - 1 GB #6542

Open
@pshlego

Description

Project URL

https://pypi.org/project/vllm-logits

Does this project already exist?

  • Yes

New limit

1 GB

Update issue title

  • I have updated the title.

Which indexes

PyPI

About the project

vllm-logits is a Python package for high-throughput and memory-efficient inference of large language models (LLMs), extending the vLLM engine with additional support for extracting and manipulating logits during inference.

This package is intended for research use in large-scale multi-GPU settings, and includes custom CUDA/C++ extensions. The project is under active development at POSTECH and used internally for model analysis and token-level logit inspection tasks.

We expect the release size to grow modestly as we support more architectures and optional runtime extensions.

How large is each release?

One Linux wheel is approximately 486 MiB, mainly due to:

  • Compiled CUDA extensions
  • Integrated C++ inference kernels
  • Optional runtime components for token-wise logit output

We have already minimized the package by removing tests, examples, and unused data via MANIFEST.in, but further reductions would break required functionality.
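The pruning described above can be sketched roughly as follows; the directory and file names here are illustrative, not taken from the actual repository:

```
# Illustrative MANIFEST.in: drop non-essential files from the distribution
prune tests
prune examples
recursive-exclude * *.pyc
recursive-exclude * __pycache__
exclude .gitignore
```

Note that MANIFEST.in primarily shapes the source distribution; for wheels, equivalent exclusions would typically live in the package-data configuration of setup.py or pyproject.toml.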

No source tarball or other wheels are included at this point.

How frequently do you make a release?

Roughly once every 3 to 4 months, depending on LLM engine updates.

Code of Conduct

  • I agree to follow the PSF Code of Conduct
