ImportError: cannot import name 'flash_attn_unpadded_qkvpacked_func' from 'flash_attn.flash_attn_interface'. Why can't I import it? #1280

Open
YANGTUOMAO opened this issue Oct 16, 2024 · 0 comments


Traceback (most recent call last):
  File "/data1/yanzh/project/ml-mgie/LLaVA/llava/train/train_mem.py", line 6, in <module>
    from llava.train.llama_flash_attn_monkey_patch import replace_llama_attn_with_flash_attn
  File "/data1/yanzh/project/ml-mgie/LLaVA/llava/train/llama_flash_attn_monkey_patch.py", line 12, in <module>
    from flash_attn.flash_attn_interface import flash_attn_unpadded_qkvpacked_func
ImportError: cannot import name 'flash_attn_unpadded_qkvpacked_func' from 'flash_attn.flash_attn_interface' (/data1/yanzh/.conda/envs/mgie/lib/python3.10/site-packages/flash_attn/flash_attn_interface.py)
Traceback (most recent call last):
  File "/data1/yanzh/project/ml-mgie/LLaVA/llava/train/train_mem.py", line 6, in <module>
    from llava.train.llama_flash_attn_monkey_patch import replace_llama_attn_with_flash_attn
  File "/data1/yanzh/project/ml-mgie/LLaVA/llava/train/llama_flash_attn_monkey_patch.py", line 12, in <module>
    from flash_attn.flash_attn_interface import flash_attn_unpadded_qkvpacked_func
ImportError: cannot import name 'flash_attn_unpadded_qkvpacked_func' from 'flash_attn.flash_attn_interface' (/data1/yanzh/.conda/envs/mgie/lib/python3.10/site-packages/flash_attn/flash_attn_interface.py)
Traceback (most recent call last):
  File "/data1/yanzh/project/ml-mgie/LLaVA/llava/train/train_mem.py", line 6, in <module>
    from llava.train.llama_flash_attn_monkey_patch import replace_llama_attn_with_flash_attn
  File "/data1/yanzh/project/ml-mgie/LLaVA/llava/train/llama_flash_attn_monkey_patch.py", line 12, in <module>
    from flash_attn.flash_attn_interface import flash_attn_unpadded_qkvpacked_func
ImportError: cannot import name 'flash_attn_unpadded_qkvpacked_func' from 'flash_attn.flash_attn_interface' (/data1/yanzh/.conda/envs/mgie/lib/python3.10/site-packages/flash_attn/flash_attn_interface.py)
W1016 02:57:37.220000 140512597451840 torch/distributed/elastic/multiprocessing/api.py:858] Sending process 2606919 closing signal SIGTERM
W1016 02:57:37.221000 140512597451840 torch/distributed/elastic/multiprocessing/api.py:858] Sending process 2606920 closing signal SIGTERM
W1016 02:57:37.221000 140512597451840 torch/distributed/elastic/multiprocessing/api.py:858] Sending process 2606921 closing signal SIGTERM
W1016 02:57:37.221000 140512597451840 torch/distributed/elastic/multiprocessing/api.py:858] Sending process 2606922 closing signal SIGTERM
W1016 02:57:37.221000 140512597451840 torch/distributed/elastic/multiprocessing/api.py:858] Sending process 2606925 closing signal SIGTERM
W1016 02:57:37.221000 140512597451840 torch/distributed/elastic/multiprocessing/api.py:858] Sending process 2606926 closing signal SIGTERM
E1016 02:57:37.271000 140512597451840 torch/distributed/elastic/multiprocessing/api.py:833] failed (exitcode: 1) local_rank: 4 (pid: 2606923) of binary: /data1/yanzh/.conda/envs/mgie/bin/python
Traceback (most recent call last):
  File "/data1/yanzh/.conda/envs/mgie/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/data1/yanzh/.conda/envs/mgie/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 348, in wrapper
    return f(*args, **kwargs)
  File "/data1/yanzh/.conda/envs/mgie/lib/python3.10/site-packages/torch/distributed/run.py", line 901, in main
    run(args)
  File "/data1/yanzh/.conda/envs/mgie/lib/python3.10/site-packages/torch/distributed/run.py", line 892, in run
    elastic_launch(
  File "/data1/yanzh/.conda/envs/mgie/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 133, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/data1/yanzh/.conda/envs/mgie/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

LLaVA/llava/train/train_mem.py FAILED

Failures:
[1]:
time : 2024-10-16_02:57:37
host : node28
rank : 5 (local_rank: 5)
exitcode : 1 (pid: 2606924)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Root Cause (first observed failure):
[0]:
time : 2024-10-16_02:57:37
host : node28
rank : 4 (local_rank: 4)
exitcode : 1 (pid: 2606923)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
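
A possible explanation, offered as a hedged note and not as a maintainer answer: the flash-attn 2.x changelog lists the `flash_attn_unpadded_*` functions as renamed to `flash_attn_varlen_*`, so an installed 2.x package would no longer export `flash_attn_unpadded_qkvpacked_func`. The sketch below shows a version-tolerant lookup that tries the newer name first and falls back to the older one. The helper names `import_first_available` and `load_flash_attn_qkvpacked` are hypothetical, and whether the call sites in `llama_flash_attn_monkey_patch.py` need further changes is not verified here.

```python
import importlib


def import_first_available(module_name, candidate_names):
    """Return (name, attribute) for the first candidate the module exports.

    Hypothetical helper for illustration; it is not part of flash-attn or LLaVA.
    """
    module = importlib.import_module(module_name)
    for name in candidate_names:
        if hasattr(module, name):
            return name, getattr(module, name)
    raise ImportError(
        f"none of {list(candidate_names)} found in {module_name!r}"
    )


# Candidate names, newest first. The varlen name is the 2.x rename noted in
# the flash-attn changelog; the unpadded name is the one the traceback fails on.
FLASH_ATTN_QKVPACKED_NAMES = (
    "flash_attn_varlen_qkvpacked_func",
    "flash_attn_unpadded_qkvpacked_func",
)


def load_flash_attn_qkvpacked():
    # Assumption: flash-attn is installed in the active environment.
    # Raises ImportError if neither name is exported.
    return import_first_available(
        "flash_attn.flash_attn_interface", FLASH_ATTN_QKVPACKED_NAMES
    )
```

Calling `load_flash_attn_qkvpacked()` in the `mgie` environment would report which of the two names the installed package provides. The other route is to install a flash-attn release that matches what the LLaVA patch file expects.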
