
add xpu triton in dockerfile, or will show "Could not import Flash Attention enabled models: No module named 'triton'" #2702

Merged: 1 commit merged into huggingface:main on Oct 30, 2024

Conversation

sywangyi (Contributor)

add xpu triton in dockerfile, or will show "Could not import Flash Attention enabled models: No module named 'triton'"
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

@Narsil
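The change itself is a one-line addition to the XPU Dockerfile so that the triton Python package is present in the image at runtime. A minimal sketch of the shape of that change; the unpinned package name below is an assumption (the merged commit may install a specific Intel XPU build of triton instead), not the exact line from this PR:

# Sketch only: install an XPU-capable triton wheel so the
# flash-attention import path works at runtime. The actual
# package name and version pin may differ in the merged commit.
RUN pip install --no-cache-dir triton

In practice, pinning the wheel to the version the XPU PyTorch/IPEX stack was built against is the safer choice.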
@sywangyi (Contributor, Author) commented:
2024-10-28T08:34:35.936624Z INFO text_generation_launcher: Using attention paged - Prefix caching 0
2024-10-28T08:34:35.936631Z INFO text_generation_launcher: Default max_batch_prefill_tokens to 4096
2024-10-28T08:34:35.936747Z INFO download: text_generation_launcher: Starting check and download process for mistralai/Mistral-7B-v0.1
2024-10-28T08:34:39.809841Z INFO text_generation_launcher: Files are already present on the host. Skipping download.
2024-10-28T08:34:40.547887Z INFO download: text_generation_launcher: Successfully downloaded weights for mistralai/Mistral-7B-v0.1
2024-10-28T08:34:40.548403Z INFO shard-manager: text_generation_launcher: Starting shard rank=0
2024-10-28T08:34:43.836780Z INFO text_generation_launcher: Using prefix caching = False
2024-10-28T08:34:43.836810Z INFO text_generation_launcher: Using Attention = paged
2024-10-28T08:34:43.886426Z WARN text_generation_launcher: Could not import Flash Attention enabled models: No module named 'triton'
2024-10-28T08:34:43.886914Z WARN text_generation_launcher: Could not import Mamba: No module named 'mamba_ssm'
2024-10-28T08:34:46.605594Z INFO text_generation_launcher: Using experimental prefill chunking = False
2024-10-28T08:34:46.610208Z INFO text_generation_launcher: Server started at unix:///tmp/text-generation-server-0
2024-10-28T08:34:46.664621Z INFO shard-manager: text_generation_launcher: Shard ready in 6.107895071s rank=0
2024-10-28T08:34:46.753204Z INFO text_generation_launcher: Starting Webserver
2024-10-28T08:34:46.788056Z INFO text_generation_router_v3: backends/v3/src/lib.rs:125: Warming up model
2024-10-28T08:34:47.027548Z ERROR text_generation_launcher: Method Warmup encountered an error.
Traceback (most recent call last):
File "/opt/conda/bin/text-generation-server", line 8, in
sys.exit(app())
File "/opt/conda/lib/python3.11/site-packages/typer/main.py", line 311, in call
return get_command(self)(*args, **kwargs)
File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 1157, in call
return self.main(*args, **kwargs)
File "/opt/conda/lib/python3.11/site-packages/typer/core.py", line 778, in main
return _main(
File "/opt/conda/lib/python3.11/site-packages/typer/core.py", line 216, in _main
rv = self.invoke(ctx)
File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/opt/conda/lib/python3.11/site-packages/typer/main.py", line 683, in wrapper
return callback(**use_params) # type: ignore
File "/opt/conda/lib/python3.11/site-packages/text_generation_server/cli.py", line 116, in serve
server.serve(
File "/opt/conda/lib/python3.11/site-packages/text_generation_server/server.py", line 315, in serve
asyncio.run(
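
Why this surfaces as a warning followed by a warmup crash: TGI guards its flash-attention model imports, so a missing triton only produces the WARN lines above at startup, and the hard failure appears later, when warmup exercises a code path that needs those kernels. A minimal sketch of that guarded-import pattern, with illustrative names rather than TGI's exact code:

import logging

logger = logging.getLogger("text_generation_launcher")

try:
    # Importing the flash-attention model classes pulls in triton kernels;
    # the bare import here is a stand-in for those model imports.
    import triton  # noqa: F401
    HAS_FLASH_ATTENTION = True
except ImportError as e:
    # Startup continues; only a warning is emitted here, so on a device
    # with no non-flash fallback the real failure shows up at warmup.
    logger.warning("Could not import Flash Attention enabled models: %s", e)
    HAS_FLASH_ATTENTION = False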

@sywangyi (Contributor, Author) commented:

The missing triton module breaks TGI on XPU.
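
A quick way to check an image for this regression is to run the import inside the container before launching the server, e.g. docker run --rm <tgi-xpu-image> python -c "import triton" (hypothetical image tag), where the check is simply:

# Should print a version string instead of raising ImportError
# once the wheel is baked into the image.
import triton
print(triton.__version__)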

Narsil merged commit 46aeb08 into huggingface:main on Oct 30, 2024
@Narsil (Collaborator) commented on Oct 30, 2024:

Thanks, LGTM! Sorry for this!
