ClipScore errors on captions with more than 77 tokens #2000

Closed
schopra8 opened this issue Aug 16, 2023 · 1 comment · Fixed by #2001
Labels
bug / fix, help wanted, topic: Image, v1.0.x

schopra8 commented Aug 16, 2023

🐛 Bug

If you run CLIPScore between an image and a caption that has more than 77 tokens (longer than the maximum sequence length CLIP can process), the metric raises an error.

To Reproduce

Compute CLIPScore between an image and a caption with more than 77 tokens.

Code sample
from torchmetrics.multimodal.clip_score import CLIPScore

metric = CLIPScore(model_name_or_path="openai/clip-vit-base-patch32")
metric.to('cuda')
clip_score = metric(image_tensor, caption)  # caption has more than 77 CLIP tokens
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/user/scripts/compute_clip_scores.py", line 125, in <module>
    compute_clip_scores(response=response,
  File "/home/user/scripts/compute_clip_scores.py", line 87, in compute_clip_scores
    clip_score = metric(image_tensor, caption)
  File "/home/user/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/torchmetrics/metric.py", line 288, in forward
    self._forward_cache = self._forward_full_state_update(*args, **kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/torchmetrics/metric.py", line 302, in _forward_full_state_update
    self.update(*args, **kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/torchmetrics/metric.py", line 456, in wrapped_func
    raise err
  File "/home/user/.local/lib/python3.8/site-packages/torchmetrics/metric.py", line 446, in wrapped_func
    update(*args, **kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/torchmetrics/multimodal/clip_score.py", line 123, in update
    score, n_samples = _clip_score_update(images, text, self.model, self.processor)
  File "/home/user/.local/lib/python3.8/site-packages/torchmetrics/functional/multimodal/clip_score.py", line 69, in _clip_score_update
    txt_features = model.get_text_features(
  File "/home/user/.local/lib/python3.8/site-packages/transformers/models/clip/modeling_clip.py", line 1017, in get_text_features
    text_outputs = self.text_model(
  File "/home/user/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/transformers/models/clip/modeling_clip.py", line 730, in forward
    hidden_states = self.embeddings(input_ids=input_ids, position_ids=position_ids)
  File "/home/user/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/transformers/models/clip/modeling_clip.py", line 230, in forward
    embeddings = inputs_embeds + position_embeddings
RuntimeError: The size of tensor a (138) must match the size of tensor b (77) at non-singleton dimension 1
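
For context, a minimal sketch (assuming the Hugging Face transformers CLIPTokenizer and the hypothetical caption variable from the code sample above) of how to check whether a caption exceeds CLIP's 77-token context before calling the metric:

from transformers import CLIPTokenizer

# Tokenizer matching the checkpoint passed to CLIPScore
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

# `caption` is the (hypothetical) long caption string from the code sample above
num_tokens = len(tokenizer(caption)["input_ids"])
print(num_tokens)  # anything above 77 triggers the size-mismatch error above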

Expected behavior

Present a warning to the user and truncate the caption, so that the metric can be computed on the first 77 tokens of the provided caption.
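
Until such a fix lands, one possible user-side workaround (a sketch only, assuming the Hugging Face CLIPTokenizer; this is not the torchmetrics fix itself) is to truncate the caption to CLIP's context length before passing it to the metric:

from transformers import CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

# Encode with truncation to CLIP's 77-token context, then decode back to a plain string
ids = tokenizer(caption, truncation=True, max_length=77)["input_ids"]
truncated_caption = tokenizer.decode(ids, skip_special_tokens=True)

clip_score = metric(image_tensor, truncated_caption)

Note that decoding and re-encoding can shift token boundaries slightly, so this only approximates cutting the caption at exactly 77 tokens.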

Environment

  • TorchMetrics version (and how you installed TM): 1.0.3, installed via pip
  • Python & PyTorch version: Python 3.8.10, PyTorch 2.0.1+cu118
  • OS: Linux
schopra8 added the bug / fix and help wanted labels on Aug 16, 2023
github-actions bot commented

Hi! Thanks for your contribution, great first issue!
