
Support for chatglm-6b #231

Closed
datalee opened this issue Jun 25, 2023 · 7 comments
Labels
new model Requests to new models

Comments

@datalee

datalee commented Jun 25, 2023

It would be great if you could support chatglm-6b. It's a popular Chinese model.
https://huggingface.co/THUDM/chatglm-6b

@WoosukKwon WoosukKwon added the new model Requests to new models label Jun 25, 2023
@zhaoying9105

mark

@HJT9328

HJT9328 commented Jul 2, 2023

mark

@binarrii

mark

@Jeffwan
Contributor

Jeffwan commented Jul 31, 2023

If anyone is familiar with the chatGLM model architecture, feel free to help on #625. I am new to the transformer architecture and not sure if my changes are correct.

@cabbagetalk

> If anyone is familiar with the chatGLM model architecture, feel free to help on #625. I am new to the transformer architecture and not sure if my changes are correct.

Can vLLM support chatglm-6b now?

@simon-mo
Collaborator

simon-mo commented Nov 2, 2023

If anyone has bandwidth to help us implement ChatGLM3 support, please leave a comment and coordinate here: #1552

@hmellor
Collaborator

hmellor commented Mar 6, 2024

ChatGLM is supported as of #1261.
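
For anyone finding this thread later: below is a minimal sketch of loading a ChatGLM checkpoint with vLLM's offline LLM API, assuming a vLLM build that includes the support merged in #1261. The checkpoint name (THUDM/chatglm3-6b) and the sampling settings are illustrative, not prescribed by this issue.

```python
# Minimal sketch: running a ChatGLM checkpoint with vLLM's offline API.
# Assumes a vLLM version that includes the ChatGLM support from #1261;
# the checkpoint name and sampling values below are illustrative.
from vllm import LLM, SamplingParams

# ChatGLM ships custom modeling/tokenizer code on the Hugging Face Hub,
# so trust_remote_code is required for the config and tokenizer to load.
llm = LLM(model="THUDM/chatglm3-6b", trust_remote_code=True)

sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=256)
outputs = llm.generate(["你好，请介绍一下你自己。"], sampling_params)

for output in outputs:
    print(output.outputs[0].text)
```

Once the model loads offline like this, it should also work through vLLM's OpenAI-compatible API server in the same way as other supported models.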

@hmellor hmellor closed this as completed Mar 6, 2024