add internlm model #528

Merged: 7 commits into vllm-project:main on Aug 8, 2023

Conversation

gqjia (Contributor) commented on Jul 20, 2023:

No description provided.

zhuohan123 (Member) left a comment:

Thank you for your contribution! Can you add your models to README.md and docs/source/models/supported_models.rst? Also, have you made sure that your implementation matches the official implementation? For example, do the greedy sampling results from this PR match the official implementation?
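
A minimal sketch of the kind of greedy-decoding parity check being asked for here, assuming the Hugging Face transformers implementation as the reference; the prompt and generation length are illustrative and not taken from this PR:

```python
# Sketch only: compare greedy outputs from vLLM against the official
# Hugging Face implementation. Prompt and max tokens are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer
from vllm import LLM, SamplingParams

model_id = "internlm/internlm-chat-7b"
prompt = "The capital of France is"

# Greedy decoding in vLLM: temperature=0.0 makes sampling deterministic.
llm = LLM(model=model_id, trust_remote_code=True)
vllm_out = llm.generate([prompt], SamplingParams(temperature=0.0, max_tokens=32))
vllm_text = vllm_out[0].outputs[0].text

# Greedy decoding with the reference implementation (do_sample=False).
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
hf_model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True).cuda()
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.cuda()
hf_ids = hf_model.generate(input_ids, do_sample=False, max_new_tokens=32)
hf_text = tokenizer.decode(hf_ids[0, input_ids.shape[1]:], skip_special_tokens=True)

print("vLLM:", vllm_text)
print("HF:  ", hf_text)
assert vllm_text.strip() == hf_text.strip(), "greedy outputs diverge"
```

Exact string equality can be too strict when detokenization differs slightly, so comparing the generated token ids is often the more robust check.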

zhuohan123 (Member) left a comment:

Thank you for your contribution! I tested internlm/internlm-chat-7b and it works pretty well!

zhuohan123 merged commit 735ecff into vllm-project:main on Aug 8, 2023.
beyondguo commented:

Hi, why do I still get: ValueError: Model architectures ['InternLMForCausalLM'] are not supported for now.

vllm version: 0.1.3

script:

```python
from vllm import LLM
model = LLM(model='internlm/internlm-chat-7b', trust_remote_code=True)
```
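
This PR only landed in main on Aug 8, 2023, so a build that predates the merge would not recognize InternLMForCausalLM. A minimal, hedged check of the installed version; whether a given release actually contains commit 735ecff is not stated in this thread:

```python
# Sketch only: print the installed vLLM version to judge whether the build
# could include the Aug 8, 2023 InternLM merge (commit 735ecff). Which
# released version first shipped it is not confirmed in this thread.
from importlib.metadata import version

print("installed vllm:", version("vllm"))

# If the build predates the merge, the usual remedy is to upgrade
# (pip install -U vllm) or install from a source checkout that contains
# commit 735ecff (pip install -e . in a clone of vllm-project/vllm).
```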

hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request on Feb 13, 2024.
rickyyx pushed a commit to rickyyx/vllm that referenced this pull request on Oct 7, 2024:
NOTE: This includes a couple of import-order changes, because I moved the vllm.anyscale packages to the bottom to avoid merge conflicts.

- Allow building via pip install -e .
- Basic integration with an env var ANYSCALE_USE_SCRATCH=1 (see the sketch after this list)
- Working with Llama 7B
- Basic testing
- Batching works (but Scratch only allows a small batch size for now, and Scratch doesn't have efficient batching yet)
- Works with both the Scratch sampler and the vLLM sampler
- Sessions are cleaned up based on an LRU cache. This will be fixed in a couple of weeks.
- Supports prompt logprobs and some sampler features (except beam search)
- Async execution like torch kernels
- Llama 3 + Llama 2 work
- Do input config validation
- More thorough testing

Future TODO:

- Preemption not working (future work)
- It currently doesn't use the KV cache allocated by vLLM (not a strict requirement).
- The PR needs cleanup before merging.
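
A minimal sketch of how the ANYSCALE_USE_SCRATCH=1 toggle described above might be exercised; the model name and the assumption that the flag is read at engine construction time are illustrative and not confirmed by the commit note:

```python
# Sketch only: set the Scratch toggle named in the commit note before
# building the engine. How the referenced fork actually consumes the flag
# is an assumption; the model name is illustrative ("Working with Llama 7B").
import os

os.environ["ANYSCALE_USE_SCRATCH"] = "1"  # flag named in the commit note

from vllm import LLM, SamplingParams  # import after setting the flag

llm = LLM(model="meta-llama/Llama-2-7b-hf")
outputs = llm.generate(["Hello"], SamplingParams(temperature=0.0, max_tokens=16))
print(outputs[0].outputs[0].text)
```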