This repository has been archived by the owner on Oct 11, 2024. It is now read-only.

Update arg_utils.py with semi_structured_sparse_w16a16 #45

Merged

@mgoin (Member) commented Feb 22, 2024

Tested by checking the help message of the OpenAI-compatible API server:

python -m vllm.entrypoints.openai.api_server --help

Before:

  --sparsity {sparse_w16a16,None}, -s {sparse_w16a16,None}
                        Method used to compress sparse weights. If None, we first check the `sparsity_config`
                        attribute in the model config file. If that is None we assume the model weights are dense

After:

  --sparsity {None,sparse_w16a16,semi_structured_sparse_w16a16}, -s {None,sparse_w16a16,semi_structured_sparse_w16a16}
                        Method used to compress sparse weights. If None, we first check the `sparsity_config`
                        attribute in the model config file. If that is None we assume the model weights are dense
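
The diff to arg_utils.py is not shown in this conversation, so the following is only a minimal sketch of how an `argparse` option could produce the "After" help text above. The `SPARSITY_METHODS` list and the `build_parser` helper are hypothetical names introduced for illustration; only the flag names, the choice values, and the help string come from the output quoted above.

```python
import argparse

# Hypothetical list of supported sparsity methods. The real arg_utils.py may
# derive these from a registry instead of a hard-coded list.
SPARSITY_METHODS = ["sparse_w16a16", "semi_structured_sparse_w16a16"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing only the --sparsity / -s option (illustrative)."""
    parser = argparse.ArgumentParser(prog="api_server")
    parser.add_argument(
        "--sparsity",
        "-s",
        type=str,
        # None is listed as a choice so the help text shows it as the default.
        choices=[None, *SPARSITY_METHODS],
        default=None,
        help="Method used to compress sparse weights. If None, we first check "
        "the `sparsity_config` attribute in the model config file. If that is "
        "None we assume the model weights are dense",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--sparsity", "semi_structured_sparse_w16a16"]
    )
    print(args.sparsity)
```

With a parser like this, passing a value outside the listed choices makes `argparse` exit with an "invalid choice" error, which is why a new method has to be added to the choices before it can be selected on the command line.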

@mgoin changed the base branch from main to rs/bump-main-to-v0.3.2 February 23, 2024 20:29
@mgoin merged commit 757e48a into rs/bump-main-to-v0.3.2 Feb 23, 2024
2 checks passed
@mgoin deleted the arg-utils-semi_structured_sparse_w16a16 branch February 23, 2024 20:29