As I was trying to run deepseek-ai's Janus in a Colab notebook, I ran into some flash-attention errors, including the one you mentioned in installing-flash-attention.md:

```
NameError: name '_flash_supports_window_size' is not defined
```
I couldn't resolve this specific error, but I managed to get the model working by disabling flash-attention entirely (rough sketch below).
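For reference, here's roughly how I did that. This is a minimal sketch: `attn_implementation` is a standard `transformers` `from_pretrained` kwarg, and I'm assuming the model's remote code honors it.

```python
import torch
from transformers import AutoModelForCausalLM

# Minimal sketch: force the plain "eager" attention backend so the
# flash-attention code path is never taken. Assumes the model's
# trust_remote_code implementation respects this standard kwarg.
vl_gpt = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/Janus-1.3B",
    trust_remote_code=True,
    attn_implementation="eager",
)
vl_gpt = vl_gpt.to(torch.bfloat16).cuda().eval()
```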
Fortunately, the folks at Xenova had already addressed this and opened a PR on Janus's Hugging Face Hub repository. You can use it by specifying the revision `refs/pr/7` when downloading the pretrained model. For example:
```python
import torch
from transformers import AutoModelForCausalLM

from janus.models import MultiModalityCausalLM, VLChatProcessor
from janus.utils.io import load_pil_images  # used later in the demo to load input images

# specify the model path and the PR revision containing the fix
revision_id = "refs/pr/7"
model_path = "deepseek-ai/Janus-1.3B"

vl_chat_processor: VLChatProcessor = VLChatProcessor.from_pretrained(model_path)
tokenizer = vl_chat_processor.tokenizer

# load the model code from the PR revision instead of main
vl_gpt: MultiModalityCausalLM = AutoModelForCausalLM.from_pretrained(
    model_path, trust_remote_code=True, revision=revision_id
)
vl_gpt = vl_gpt.to(torch.bfloat16).cuda().eval()
```
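In case it helps, here is the inference step I ran afterwards, lightly adapted from the Janus README. Treat it as a sketch of the demo flow rather than verbatim demo code; the image path is a placeholder you'd replace with your own.

```python
# Continuing from the snippet above: a single image-understanding turn.
conversation = [
    {
        "role": "User",
        "content": "<image_placeholder>\nDescribe this image.",
        "images": ["./images/example.png"],  # placeholder path; point at your own image
    },
    {"role": "Assistant", "content": ""},
]

# load the images and prepare batched multimodal inputs
pil_images = load_pil_images(conversation)
prepare_inputs = vl_chat_processor(
    conversations=conversation, images=pil_images, force_batchify=True
).to(vl_gpt.device)

# embed the inputs and generate a response with the underlying language model
inputs_embeds = vl_gpt.prepare_inputs_embeds(**prepare_inputs)
outputs = vl_gpt.language_model.generate(
    inputs_embeds=inputs_embeds,
    attention_mask=prepare_inputs.attention_mask,
    pad_token_id=tokenizer.eos_token_id,
    bos_token_id=tokenizer.bos_token_id,
    eos_token_id=tokenizer.eos_token_id,
    max_new_tokens=512,
    do_sample=False,
    use_cache=True,
)
print(tokenizer.decode(outputs[0].cpu().tolist(), skip_special_tokens=True))
```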
I just tried it with the official Janus Colab demo, and it worked like a charm! I thought you might appreciate this.