
[Tokenizer] Add tokenizer mode #298


Merged
merged 1 commit into from
Jun 28, 2023

Conversation

WoosukKwon
Collaborator

Closes #281

This PR adds a `tokenizer_mode` argument, which can be either `auto` or `slow`. When set to `slow`, vLLM uses the slow tokenizer even if a fast tokenizer is available. This is required for some popular models, e.g., OpenLLaMA.
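The mode-to-flag mapping can be sketched as follows. This is an illustrative sketch, not the actual vLLM implementation; `resolve_use_fast` is a hypothetical helper name, and the assumption is that the resulting boolean feeds the `use_fast` parameter of Hugging Face's `AutoTokenizer.from_pretrained`:

```python
def resolve_use_fast(tokenizer_mode: str) -> bool:
    """Map a tokenizer_mode string to a use_fast flag.

    'auto' prefers the fast (Rust-based) tokenizer when one exists;
    'slow' forces the pure-Python tokenizer, which some models
    (e.g., OpenLLaMA) need to produce correct output.
    """
    if tokenizer_mode == "auto":
        return True
    if tokenizer_mode == "slow":
        return False
    raise ValueError(
        f"Unknown tokenizer mode: {tokenizer_mode!r} "
        "(expected 'auto' or 'slow')")
```

Under that assumption, loading would look roughly like `AutoTokenizer.from_pretrained(model_name, use_fast=resolve_use_fast(tokenizer_mode))`.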

@WoosukKwon WoosukKwon requested a review from zhuohan123 June 28, 2023 18:48
Member

@zhuohan123 zhuohan123 left a comment


LGTM! Thanks

@WoosukKwon WoosukKwon merged commit 998d9d1 into main Jun 28, 2023
@WoosukKwon WoosukKwon deleted the tokenizer-mode branch June 28, 2023 21:19
michaelfeil pushed a commit to michaelfeil/vllm that referenced this pull request Jul 1, 2023
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
yukavio pushed a commit to yukavio/vllm that referenced this pull request Jul 3, 2024
Upstream sync 2024 06 11
(neuralmagic#288)

SUMMARY:

* Merge commits from
vllm-project@1197e02
to
vllm-project@114332b
* Our GCP test instances do not have gcc or clang installed. The Triton kernels rely on gcc or clang for JIT compilation, so those tests are still disabled (cc @andy-neuma). All are marked with:
```python
@pytest.mark.skip("C compiler not installed in NM automation. "
                  "This codepath follows a triton pathway, which "
                  "JITs using clang or gcc. Since neither are installed "
                  "in our test instances, we need to skip this for now.")
```
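A possible refinement (illustrative only, not part of this merge) is to detect the compiler at test time instead of skipping unconditionally, so the tests re-enable themselves once a compiler is installed. `has_c_compiler` is a hypothetical helper:

```python
import shutil


def has_c_compiler() -> bool:
    # Triton JIT-compiles kernels with a host C compiler, so these
    # tests only make sense when gcc or clang is on the PATH.
    return any(shutil.which(cc) is not None for cc in ("gcc", "clang"))


# With pytest, the blanket skip above could then become conditional:
# @pytest.mark.skipif(not has_c_compiler(),
#                     reason="Triton JIT needs gcc or clang")
```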

Note that
vllm-project@1197e02
is NOT included in this merge.

COMPARE vs UPSTREAM:


https://github.com/neuralmagic/nm-vllm/compare/upstream-sync-2024-06-11..vllm-project:vllm:v0.5.0

---------

Signed-off-by: Ye Cao <caoye.cao@alibaba-inc.com>
Signed-off-by: kevin <kevin@anyscale.com>
Co-authored-by: Daniele <d.trifiro@me.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Varun Sundar Rabindranath <varunsundar08@gmail.com>
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
Co-authored-by: Ye Cao <952129620@qq.com>
Co-authored-by: Nadav Shmayovits <45605409+NadavShmayo@users.noreply.github.com>
Co-authored-by: chenqianfzh <51831990+chenqianfzh@users.noreply.github.com>
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
Co-authored-by: Daniil Arapov <59310708+Delviet@users.noreply.github.com>
Co-authored-by: mgoin <michael@neuralmagic.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
Co-authored-by: Avinash Raj <avistylein3105@gmail.com>
Co-authored-by: Divakar Verma <137818590+divakar-amd@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
Co-authored-by: Yuan <yuan.zhou@intel.com>
Co-authored-by: Kaiyang Chen <48289729+Kaiyang-Chen@users.noreply.github.com>
Co-authored-by: Kevin H. Luu <kevin@anyscale.com>
Co-authored-by: Breno Faria <breno@veltefaria.de>
Co-authored-by: Toshiki Kataoka <tos.lunar@gmail.com>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: afeldman-nm <156691304+afeldman-nm@users.noreply.github.com>
Co-authored-by: zifeitong <zifei.tong@parasail.io>
Co-authored-by: Jie Fu (傅杰) <fujie_email@sina.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: tomeras91 <57313761+tomeras91@users.noreply.github.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
Co-authored-by: DriverSong <31926998+DriverSong@users.noreply.github.com>
Co-authored-by: qiujiawei9 <qiujiawei9@jd.com>
Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: Alex Wu <alexanderwu@berkeley.edu>
Co-authored-by: Breno Faria <breno.faria@intrafind.com>
Co-authored-by: liuyhwangyh <liuyhwangyh@163.com>
Co-authored-by: mulin.lyh <mulin.lyh@taobao.com>
Co-authored-by: Matthew Goldey <matthew.goldey@gmail.com>
Co-authored-by: Jie Fu (傅杰) <jiefu@tencent.com>
Co-authored-by: Itay Etelis <92247226+Etelis@users.noreply.github.com>
Co-authored-by: limingshu <61349199+JamesLim-sy@users.noreply.github.com>
Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: Calvinn Ng <39899397+Calvinnncy97@users.noreply.github.com>
Co-authored-by: team <calvinn.ng@ahrefs.com>
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Benjamin Kitor <bkitor@gmail.com>
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
Co-authored-by: bnellnm <49004751+bnellnm@users.noreply.github.com>
Co-authored-by: Bla_ckB <50193121+BlackBird-Coding@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
wuhuikx pushed a commit to wuhuikx/vllm that referenced this pull request Mar 27, 2025

### What this PR does / why we need it?

Triton doesn't work on Ascend, so we should make sure it is uninstalled in the Dockerfile.
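Such a step could be sketched as the following Dockerfile fragment. This is a hypothetical illustration, not the exact line from the patch; `|| true` keeps the build from failing when triton was never installed:

```dockerfile
# Ensure triton is absent: it is pulled in as a transitive dependency
# on some platforms but does not work on Ascend hardware.
RUN pip uninstall -y triton || true
```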

Backport: vllm-project/vllm-ascend#298
Closes: vllm-project/vllm-ascend#291

### Does this PR introduce _any_ user-facing change?
NO

### How was this patch tested?
CI passed

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
Successfully merging this pull request may close these issues.

garbage output from h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-13b