
[Bugfix] Remove incorrect torchvision requirement from PIL backend image processors #45045

Merged
ArthurZucker merged 30 commits into huggingface:main from Lidang-Jiang:fix/pil-backend-torchvision-requirement
Mar 30, 2026


@Lidang-Jiang
Contributor

@Lidang-Jiang commented Mar 27, 2026

Isolate dependencies: make the PIL backend independent of the torchvision backend

Fixes #45042

PR #45029 added @requires(backends=("vision", "torch", "torchvision")) to 67 PIL backend image_processing_pil_*.py files. This causes PIL backend classes to become dummy objects when torchvision is not installed, making AutoImageProcessor unable to find any working processor — even though the PIL backend's purpose is to work without torchvision.

Fixes:

  • remove the duplicate Kwargs import (PIL files imported Kwargs from the equivalent torchvision file)
  • fix modular to inline Kwargs when they are used by both files
  • use explicit imports where it makes sense
  • add explicit protection with requires(backends=("torch",)) only when a file actually needs it
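
To illustrate the failure mode described above, here is a minimal sketch of class-level backend gating. It is a toy model and not the transformers implementation: `requires`, `DummyObject`, and `backend_available` are simplified stand-ins written for this example, and `no_such_backend_xyz` plays the role of a torchvision that is not installed.

```python
# Toy model of class-level backend gating (hypothetical, not the transformers code).
import importlib.util


def backend_available(module_name: str) -> bool:
    """Return True if the top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


class DummyObject:
    """Placeholder that replaces a class whose required backends are missing."""

    def __init__(self, name: str, missing: list[str]):
        self.name = name
        self.missing = missing

    def __call__(self, *args, **kwargs):
        raise ImportError(f"{self.name} requires missing backends: {self.missing}")


def requires(backends: tuple[str, ...]):
    """Class decorator: swap the class for a dummy if any listed backend is missing."""

    def decorator(cls):
        missing = [b for b in backends if not backend_available(b)]
        return DummyObject(cls.__name__, missing) if missing else cls

    return decorator


# Over-gated: lists a backend the class never uses, so it becomes a dummy.
@requires(backends=("json", "no_such_backend_xyz"))
class OverGatedPilProcessor:
    pass


# Correctly gated: lists only what the class needs, so it stays usable.
@requires(backends=("json",))
class CorrectlyGatedPilProcessor:
    pass
```

In this sketch, any lookup that skips dummy objects would find no usable class for `OverGatedPilProcessor`, which mirrors how an unnecessary backend in the requirement list hides an otherwise working processor.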

…age processors

PR huggingface#45029 added @requires(backends=("vision", "torch", "torchvision")) to 67
PIL backend image_processing_pil_*.py files. This causes PIL backend classes
to become dummy objects when torchvision is not installed, making
AutoImageProcessor unable to find any working processor.

Fix: set @requires to ("vision",) for files that only need PIL, and
("vision", "torch") for files that also use torch directly. Also fix
5 modular source files so make fix-repo preserves the correct backends.

Fixes huggingface#45042

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ArthurZucker
Collaborator

I'm having a look

…ckends

Per reviewer feedback: the vision-only @requires decorator is redundant
for PIL backend classes since PilBackend base class already handles this.

- Remove @requires(backends=("vision",)) from 43 PIL backend files
- Remove unused `requires` import from 38 files (Category A)
- Keep @requires(backends=("vision", "torch")) on method-level decorators (Category B: 5 files)
@Lidang-Jiang
Contributor Author

CI Failure Analysis

The 3 test failures in tests_non_model are unrelated to this PR.

Failed tests

| Test | File |
| --- | --- |
| TestGetDecoder::test_vision_language_model | tests/utils/test_modeling_utils.py |
| ModelUtilsTest::test_use_safetensors | tests/utils/test_modeling_utils.py |
| ModelUtilsTest::test_model_from_pretrained_dtype | tests/utils/test_modeling_utils.py |

Root cause

All 3 failures share the same stack trace: a KeyError raised in the _can_set_experts_implementation() / _grouped_mm_can_dispatch() path during model __init__:

modeling_utils.py:1263 → _check_and_adjust_experts_implementation
modeling_utils.py:1885 → get_correct_experts_implementation
modeling_utils.py:1938 → _grouped_mm_can_dispatch
modeling_utils.py:1733 → _can_set_experts_implementation  ← crash

This is a known pre-existing issue on main (tracked in #45003, fix in #45043). This PR only modifies @requires decorators in PIL backend image_processing_pil_*.py files and does not touch modeling_utils.py.

@ArthurZucker
Collaborator

Thank you agent, let me take over!

Comment on lines 34 to +35
 if is_torch_available():
-    import torch
+    pass
Member

and dangling imports 😄

apart from it, looks fine to me, checked a few models

Collaborator

yeah might be modular

Member

@LysandreJik left a comment

love it

@Lidang-Jiang
Contributor Author

Thanks for taking over and cleaning up the imports further! I appreciate it 🙏 Just to clarify — I'm a real human, not an agent 😄 I used Claude Code as an assistive tool, but the analysis and decisions were mine. Happy to help with anything else if needed!

@ArthurZucker
Collaborator

Oh sorry haha you were so fast I thought it was automated

@Lidang-Jiang
Contributor Author

Oh sorry haha you were so fast I thought it was automated

The Codex is really fast! Codex always rejects Claude Code's commits. I'm planning to use Codex now. Then it would be fully automated and I wouldn't need to repeatedly prompt the AI. (In other PRs, I was often rejected by Codex several times and only got merged in the end.)

@ArthurZucker added the "for patch" label (tag issues / labels that should be included in the next patch) on Mar 27, 2026
Collaborator

@ArthurZucker left a comment

Other reviewers should not have to go through it all, just check my comments and tell me!

Comment on lines +43 to +44
# Adapted from transformers.models.fuyu.image_processing_fuyu.FuyuBatchFeature
class FuyuBatchFeature(BatchFeature):
Collaborator

No longer needed; remove from modular for both.

Collaborator

cc reviewer: this is the only place where I have a doubt. Maybe add a requires for pytesseract?

Comment on lines +272 to 273
@requires(backends=("torch",))
class Mask2FormerImageProcessorPil(PilBackend):
Collaborator

cc @yonigozlan 2 or 3 of them require torch -> why even have a PIL when torch is needed anyway?

Collaborator

Weird one; will see if it can be made cleaner.

Comment on lines +190 to 191
@requires(backends=("torch",))
class Owlv2ImageProcessorPil(PilBackend):
Collaborator

same comment, does not make sense if the entire processor needs torch -> remove it?


 @auto_docstring
-@requires(backends=("vision", "torch", "torchvision"))
+@requires(backends=("torch",))
Collaborator

same


 @auto_docstring
-@requires(backends=("vision", "torch", "torchvision"))
+@requires(backends=("torch",))
Collaborator

same comment



 @auto_docstring
-@requires(backends=("vision", "torch", "torchvision"))
+@requires(backends=("torch",))
Collaborator

same comment about PIL needed at all?

Comment on lines +1751 to +1752
# Exception: for image_processing_pil files, image_processing modular classes must be inlined (not excluded),
# because these two files must never import from each other.
Collaborator

important change @yonigozlan and @Cyrilvallez
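
The rule in the code comment above (the two files must never import from each other) could be checked mechanically. The following is a rough sketch under assumptions: `imports_sibling` is a hypothetical helper and not part of utils/modular_model_converter.py, and the `mymodel` module and class names are made up for illustration.

```python
import ast


def imports_sibling(source: str, sibling: str) -> bool:
    """Return True if `source` contains an import of a module named `sibling`.

    Only the last dotted component is compared, so relative and absolute
    imports of the sibling module are both detected.
    """
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # `from . import x` has no module name, so fall back to the aliases.
            names = [node.module] if node.module else [alias.name for alias in node.names]
        else:
            continue
        if any(name.split(".")[-1] == sibling for name in names):
            return True
    return False


# Hypothetical PIL file that reuses kwargs from its torchvision sibling (the coupling to avoid).
coupled_pil_source = "from .image_processing_mymodel import MyModelImageProcessorKwargs\n"

# Hypothetical PIL file with the kwargs class inlined instead.
inlined_pil_source = (
    "from ...image_processing_backends import PilBackend\n"
    "\n"
    "class MyModelImageProcessorKwargs:\n"
    "    pass\n"
)

print(imports_sibling(coupled_pil_source, "image_processing_mymodel"))
print(imports_sibling(inlined_pil_source, "image_processing_mymodel"))
```

Because the source is only parsed and never executed, a check like this can run in an environment where neither backend is installed.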

Contributor

Copilot AI left a comment

Pull request overview

This PR fixes an unintended dependency coupling where many PIL backend image processors were marked as requiring torchvision, causing them to become unavailable (dummy objects) in environments without torchvision, which in turn breaks AutoImageProcessor/AutoProcessor resolution.

Changes:

  • Removes/relaxes torchvision backend requirements on PIL image processor implementations, and adds narrower @requires(backends=("torch",)) guards only where PIL code truly needs PyTorch.
  • Inlines/shared-copies kwargs types and helper utilities into PIL modules to avoid cross-imports between image_processing_* and image_processing_pil_*.
  • Updates modular conversion + docstring generation logic to handle these dependency edges more safely.
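
The "narrower guards only where PIL code truly needs PyTorch" idea can be sketched with a method-level check. This is an illustrative toy, not the library's decorator: `requires_backend` and `ToyPilProcessor` are hypothetical names, and `no_such_backend_xyz` stands in for torch being absent.

```python
import functools
import importlib.util


def requires_backend(module_name: str):
    """Method decorator (hypothetical): check for a backend only when the method is called."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if importlib.util.find_spec(module_name) is None:
                raise ImportError(f"{func.__qualname__} requires `{module_name}`, which is not installed.")
            return func(*args, **kwargs)

        return wrapper

    return decorator


class ToyPilProcessor:
    """Stand-in for a PIL backend processor whose preprocessing needs no optional backend."""

    def preprocess(self, pixels: list[int]) -> list[float]:
        # Rescale 8-bit values to [0, 1]; works without any optional dependency.
        return [p / 255 for p in pixels]

    # Only post-processing is gated, so the class itself stays importable and usable.
    @requires_backend("no_such_backend_xyz")
    def post_process(self, outputs):
        return outputs
```

With this shape, the class remains available for preprocessing, and the missing dependency only surfaces when the gated method is actually called.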

Reviewed changes

Copilot reviewed 187 out of 187 changed files in this pull request and generated 6 comments.

File Description
utils/modular_model_converter.py Prevents cross-imports between image_processing and image_processing_pil during modular conversion; prefers modular-defined nodes when inlining.
src/transformers/utils/auto_docstring.py Makes pipeline-example lookup resilient to missing/optional auto-model attributes.
src/transformers/image_processing_backends.py Narrows PIL backend resize resample typing (but see review comment re: mismatch with implementation).
src/transformers/models/zoedepth/image_processing_zoedepth.py Adjusts torch/torchvision import/usage patterns for ZoeDepth torchvision backend.
src/transformers/models/zoedepth/image_processing_pil_zoedepth.py Removes torchvision requirement; inlines kwargs + resize helpers; adds torch-only requirement where needed.
src/transformers/models/yolos/image_processing_pil_yolos.py Removes torchvision requirement; inlines kwargs; narrows torch requirements to post-processing.
src/transformers/models/vitpose/image_processing_vitpose.py Updates import strategy for torch/torchvision backend implementation.
src/transformers/models/vitmatte/image_processing_vitmatte.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/vitmatte/image_processing_pil_vitmatte.py Removes torchvision requirement; inlines kwargs; simplifies torch typing.
src/transformers/models/vilt/image_processing_vilt.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/vilt/image_processing_pil_vilt.py Removes torchvision requirement; inlines kwargs; narrows PIL resample types.
src/transformers/models/videomae/image_processing_videomae.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/videomae/image_processing_pil_videomae.py Narrows PIL resample types.
src/transformers/models/video_llama_3/video_processing_video_llama_3.py Removes torchvision-conditional import usage from video processor surface.
src/transformers/models/video_llama_3/modular_video_llama_3.py Stops importing kwargs from qwen2_vl; inlines kwargs to avoid PIL↔torchvision coupling.
src/transformers/models/video_llama_3/modeling_video_llama_3.py Import formatting cleanup.
src/transformers/models/video_llama_3/image_processing_video_llama_3.py Removes torchvision-availability guard; adjusts typing.
src/transformers/models/video_llama_3/image_processing_pil_video_llama_3.py Removes torchvision requirement; inlines kwargs + smart_resize into PIL file.
src/transformers/models/tvp/image_processing_tvp.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/tvp/image_processing_pil_tvp.py Removes torchvision requirement; inlines kwargs; adjusts optional torchvision/torch handling.
src/transformers/models/textnet/image_processing_textnet.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/textnet/image_processing_pil_textnet.py Removes torchvision requirement; inlines kwargs; narrows PIL resample types.
src/transformers/models/swin2sr/image_processing_swin2sr.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/swin2sr/image_processing_pil_swin2sr.py Removes torchvision requirement; inlines kwargs; narrows PIL resample types.
src/transformers/models/superpoint/image_processing_superpoint.py Removes torchvision-availability guard and imports torchvision functional directly.
src/transformers/models/superpoint/image_processing_pil_superpoint.py Removes torchvision requirement; inlines kwargs; narrows torch requirement to post-processing.
src/transformers/models/superglue/image_processing_superglue.py Removes torchvision-availability guard and imports torchvision functional directly.
src/transformers/models/superglue/image_processing_pil_superglue.py Removes torchvision requirement; inlines validation helper; narrows torch requirement to post-processing/visualization.
src/transformers/models/smolvlm/processing_smolvlm.py Inlines copied constants to avoid vision gating for text-only processor import.
src/transformers/models/smolvlm/modular_smolvlm.py Introduces SmolVLMImageProcessorKwargs in modular file.
src/transformers/models/smolvlm/modeling_smolvlm.py Import formatting cleanup.
src/transformers/models/smolvlm/image_processing_smolvlm.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/siglip2/image_processing_siglip2.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/siglip2/image_processing_pil_siglip2.py Removes torchvision requirement; inlines kwargs + helper sizing function.
src/transformers/models/seggpt/image_processing_seggpt.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/seggpt/image_processing_pil_seggpt.py Removes torchvision requirement; inlines kwargs + palette builder; narrows torch requirement to post-processing.
src/transformers/models/segformer/modular_segformer.py Adds kwargs type + narrows backend requirements for PIL class.
src/transformers/models/segformer/image_processing_segformer.py Adjusts torch functional import usage.
src/transformers/models/segformer/image_processing_pil_segformer.py Removes torchvision requirement; inlines kwargs; narrows requires to torch+torchvision for PIL variant.
src/transformers/models/sam3_video/processing_sam3_video.py Guards torch import behind is_torch_available().
src/transformers/models/sam2/image_processing_sam2.py Import formatting cleanup.
src/transformers/models/sam/image_processing_sam.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/sam/image_processing_pil_sam.py Removes torchvision requirement; inlines kwargs; narrows requires to torch where used.
src/transformers/models/rt_detr/modular_rt_detr.py Removes torchvision functional import; inlines kwargs; narrows PIL requires to torch.
src/transformers/models/rt_detr/image_processing_rt_detr.py Changes torchvision import style and resample typing.
src/transformers/models/rt_detr/image_processing_pil_rt_detr.py Removes torchvision requirement; inlines kwargs; narrows requires to torch for post-processing.
src/transformers/models/qwen2_vl/image_processing_qwen2_vl.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/qwen2_vl/image_processing_pil_qwen2_vl.py Removes torchvision requirement; inlines kwargs + smart_resize.
src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py Removes torchvision availability check; imports torch+tvF unconditionally.
src/transformers/models/prompt_depth_anything/image_processing_pil_prompt_depth_anything.py Removes torchvision requirement; inlines kwargs; narrows torch requirement to post-processing.
src/transformers/models/pp_doclayout_v3/modular_pp_doclayout_v3.py Removes torchvision availability check; imports tvF unconditionally.
src/transformers/models/pp_doclayout_v3/image_processing_pp_doclayout_v3.py Removes torchvision availability check; imports tvF unconditionally.
src/transformers/models/pp_doclayout_v2/image_processing_pp_doclayout_v2.py Removes torchvision availability check; imports tvF unconditionally.
src/transformers/models/poolformer/image_processing_poolformer.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/poolformer/image_processing_pil_poolformer.py Removes torchvision requirement; inlines kwargs; narrows PIL resample types.
src/transformers/models/pixtral/image_processing_pixtral.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/pixtral/image_processing_pil_pixtral.py Removes torchvision requirement; inlines kwargs + resize math helpers.
src/transformers/models/phi4_multimodal/image_processing_phi4_multimodal.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/perception_lm/image_processing_perception_lm.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/perceiver/image_processing_pil_perceiver.py Narrows PIL resample typing.
src/transformers/models/perceiver/image_processing_perceiver.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/paddleocr_vl/modular_paddleocr_vl.py Removes kwargs import from qwen2_vl; inlines kwargs to avoid coupling.
src/transformers/models/paddleocr_vl/modeling_paddleocr_vl.py Import formatting cleanup.
src/transformers/models/paddleocr_vl/image_processing_pil_paddleocr_vl.py Removes torchvision requirement; inlines kwargs + smart_resize.
src/transformers/models/paddleocr_vl/image_processing_paddleocr_vl.py Removes torchvision availability check; adjusts typing.
src/transformers/models/owlv2/modular_owlv2.py Removes torchvision requirement from PIL class; adjusts typing for PIL preprocess path.
src/transformers/models/owlv2/image_processing_owlv2.py Narrows resample typing for torch backend.
src/transformers/models/ovis2/image_processing_ovis2.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/oneformer/image_processing_oneformer.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/nougat/image_processing_pil_nougat.py Removes torchvision requirement; inlines kwargs; narrows PIL resample typing.
src/transformers/models/nougat/image_processing_nougat.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/mobilevit/image_processing_pil_mobilevit.py Removes torchvision requirement; inlines kwargs; narrows torch requirement to post-processing.
src/transformers/models/mobilevit/image_processing_mobilevit.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/mobilenet_v2/image_processing_pil_mobilenet_v2.py Removes torchvision requirement; inlines kwargs; narrows torch requirement to post-processing.
src/transformers/models/mobilenet_v2/image_processing_mobilenet_v2.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/mllama/image_processing_mllama.py Removes torchvision availability check; imports tvF unconditionally.
src/transformers/models/maskformer/image_processing_maskformer.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/mask2former/modular_mask2former.py Inlines kwargs and removes torchvision requirement for PIL subclass.
src/transformers/models/mask2former/image_processing_mask2former.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/llava_onevision/modular_llava_onevision.py Inlines kwargs for PIL variant; adjusts resample typing.
src/transformers/models/llava_onevision/modeling_llava_onevision.py Import formatting cleanup.
src/transformers/models/llava_onevision/image_processing_pil_llava_onevision.py Removes torchvision requirement; inlines kwargs.
src/transformers/models/llava_onevision/image_processing_llava_onevision.py Narrows resample typing.
src/transformers/models/llava_next/image_processing_pil_llava_next.py Removes torchvision requirement; inlines kwargs.
src/transformers/models/llava_next/image_processing_llava_next.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/llava/image_processing_pil_llava.py Narrows PIL resample typing.
src/transformers/models/llava/image_processing_llava.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/llama4/image_processing_llama4.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/lighton_ocr/modeling_lighton_ocr.py Import formatting cleanup.
src/transformers/models/lightglue/modular_lightglue.py Removes torchvision requirement for PIL subclass; inlines kwargs.
src/transformers/models/lightglue/modeling_lightglue.py Import formatting cleanup.
src/transformers/models/lightglue/image_processing_lightglue.py Removes vision-availability gating for PIL imports; imports PIL objects directly.
src/transformers/models/lfm2_vl/image_processing_lfm2_vl.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/levit/image_processing_pil_levit.py Narrows PIL resample typing.
src/transformers/models/levit/image_processing_levit.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/layoutlmv3/image_processing_pil_layoutlmv3.py Removes torchvision requirement; inlines OCR helpers + kwargs; handles pytesseract availability locally.
src/transformers/models/layoutlmv3/image_processing_layoutlmv3.py Removes torchvision availability check; imports tvF unconditionally.
src/transformers/models/layoutlmv2/image_processing_pil_layoutlmv2.py Removes torchvision requirement; inlines OCR helpers + kwargs; handles pytesseract availability locally.
src/transformers/models/layoutlmv2/image_processing_layoutlmv2.py Removes torchvision availability check; imports tvF unconditionally.
src/transformers/models/kosmos2_5/image_processing_pil_kosmos2_5.py Removes torchvision requirement; inlines kwargs + torch patch extraction helper; adds torch-only requires.
src/transformers/models/janus/image_processing_pil_janus.py Removes torchvision requirement; inlines kwargs.
src/transformers/models/imagegpt/image_processing_pil_imagegpt.py Removes torchvision requirement; inlines kwargs; narrows PIL resample typing.
src/transformers/models/imagegpt/image_processing_imagegpt.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/idefics3/image_processing_idefics3.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/idefics2/image_processing_pil_idefics2.py Removes torchvision requirement; inlines kwargs + helpers; avoids vision gating for PIL import.
src/transformers/models/idefics2/image_processing_idefics2.py Removes torchvision availability check; imports tvF unconditionally.
src/transformers/models/idefics/image_processing_pil_idefics.py Removes torchvision requirement; inlines constants + kwargs.
src/transformers/models/grounding_dino/modular_grounding_dino.py Inlines kwargs + narrows requires for PIL post-processing to torch only.
src/transformers/models/grounding_dino/image_processing_pil_grounding_dino.py Removes torchvision requirement; inlines kwargs; narrows torch requirements to post-processing.
src/transformers/models/grounding_dino/image_processing_grounding_dino.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/got_ocr2/image_processing_got_ocr2.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/glpn/image_processing_pil_glpn.py Removes torchvision requirement; inlines kwargs; narrows torch requirements to post-processing.
src/transformers/models/glpn/image_processing_glpn.py Removes torchvision availability check; imports torch+tvF unconditionally.
src/transformers/models/glm_image/processing_glm_image.py Adds explicit @requires(backends=("torch",)) on processor class.
src/transformers/models/glm_image/modular_glm_image.py Adds explicit @requires(backends=("torch",)) on processor class; imports torch unconditionally.
src/transformers/models/glm_image/modeling_glm_image.py Imports torch unconditionally; removes is_torch_available() guard.
src/transformers/models/glm_image/image_processing_pil_glm_image.py Removes torchvision requirement; inlines kwargs + smart_resize.
src/transformers/models/glm_image/image_processing_glm_image.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/glm4v/image_processing_pil_glm4v.py Removes torchvision requirement; inlines smart_resize helper.
src/transformers/models/glm4v/image_processing_glm4v.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/glm46v/video_processing_glm46v.py Removes torchvision availability check; imports tvF unconditionally.
src/transformers/models/glm46v/image_processing_pil_glm46v.py Removes torchvision requirement; inlines kwargs + smart_resize.
src/transformers/models/glm46v/image_processing_glm46v.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/gemma3/image_processing_pil_gemma3.py Removes torchvision requirement; inlines kwargs; narrows PIL resample typing.
src/transformers/models/gemma3/image_processing_gemma3.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/efficientloftr/modular_efficientloftr.py Adds kwargs type; makes torch optional at import time; narrows requires to torch on post-processing.
src/transformers/models/efficientloftr/image_processing_efficientloftr.py Adjusts PIL/torch import strategy for torchvision backend.
src/transformers/models/dpt/modular_dpt.py Removes torchvision availability check; imports tvF unconditionally.
src/transformers/models/dpt/image_processing_dpt.py Imports torch.nn.functional and tvF unconditionally; removes availability guards.
src/transformers/models/donut/image_processing_pil_donut.py Removes torchvision requirement; inlines kwargs; narrows PIL resample typing.
src/transformers/models/dinov3_vit/image_processing_dinov3_vit.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/detr/image_processing_detr.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/depth_pro/image_processing_depth_pro.py Removes torchvision availability check; imports tvF unconditionally.
src/transformers/models/deformable_detr/modular_deformable_detr.py Inlines kwargs + narrows requires for PIL post-processing to torch only.
src/transformers/models/deformable_detr/modeling_deformable_detr.py Import formatting cleanup.
src/transformers/models/deformable_detr/image_processing_pil_deformable_detr.py Removes torchvision requirement; inlines kwargs; narrows torch requirements to post-processing.
src/transformers/models/deformable_detr/image_processing_deformable_detr.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/deepseek_vl_hybrid/modular_deepseek_vl_hybrid.py Adds numpy typing and narrows PIL resample typing in preprocess signatures.
src/transformers/models/deepseek_vl_hybrid/image_processing_pil_deepseek_vl_hybrid.py Removes torchvision requirement; inlines kwargs; narrows PIL resample typing.
src/transformers/models/deepseek_vl_hybrid/image_processing_deepseek_vl_hybrid.py Narrows PIL resample typing.
src/transformers/models/deepseek_vl/modular_deepseek_vl.py Introduces DeepseekVLImageProcessorKwargs and integrates ImagesKwargs.
src/transformers/models/deepseek_vl/image_processing_pil_deepseek_vl.py Removes torchvision requirement; inlines kwargs.
src/transformers/models/convnext/image_processing_pil_convnext.py Removes torchvision requirement; inlines kwargs; narrows PIL resample typing.
src/transformers/models/convnext/image_processing_convnext.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/conditional_detr/modular_conditional_detr.py Inlines kwargs + narrows requires for PIL post-processing to torch only.
src/transformers/models/conditional_detr/image_processing_conditional_detr.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/cohere2_vision/modeling_cohere2_vision.py Import formatting cleanup.
src/transformers/models/cohere2_vision/image_processing_cohere2_vision.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/chmv2/image_processing_chmv2.py Imports torch.nn.functional and tvF unconditionally; removes availability guards.
src/transformers/models/chameleon/image_processing_chameleon.py Imports torch and tvF unconditionally; removes availability guards.
src/transformers/models/bridgetower/image_processing_pil_bridgetower.py Removes torchvision requirement; inlines kwargs + resize helper; narrows PIL resample typing.
src/transformers/models/bridgetower/image_processing_bridgetower.py Moves to unconditional torchvision functional import for torchvision backend.
src/transformers/models/beit/image_processing_pil_beit.py Removes torchvision requirement; inlines kwargs; narrows torch requirements to post-processing.
src/transformers/models/beit/image_processing_beit.py Imports torch/tvF unconditionally; removes availability guards.
src/transformers/models/aria/modular_aria.py Imports tvF unconditionally; removes availability guards.
src/transformers/models/aria/modeling_aria.py Import formatting cleanup.
src/transformers/models/aria/image_processing_pil_aria.py Removes torchvision requirement; inlines kwargs; narrows PIL resample typing.
src/transformers/models/aria/image_processing_aria.py Moves to unconditional torchvision functional import for torchvision backend.
Comments suppressed due to low confidence (1)

src/transformers/image_processing_backends.py:543

  • The resample type annotation was narrowed to PILImageResampling | None, but the implementation still explicitly accepts int and also maps from torch/torchvision interpolation modes (via torch_pil_interpolation_mapping). Please keep the signature type hint aligned with actual accepted inputs (e.g., include int and tvF.InterpolationMode / Union[...]) to avoid misleading API typing.
        self,
        image: np.ndarray,
        size: SizeDict,
        resample: "PILImageResampling | None" = None,
        reducing_gap: int | None = None,
        **kwargs,
    ) -> np.ndarray:
        """Resize an image using PIL/NumPy."""
        # PIL backend only supports PILImageResampling
        if resample is not None and not isinstance(resample, (PILImageResampling, int)):
            if torch_pil_interpolation_mapping is not None and resample in torch_pil_interpolation_mapping:
                resample = torch_pil_interpolation_mapping[resample]
            else:
                resample = PILImageResampling.BILINEAR
        resample = resample if resample is not None else PILImageResampling.BILINEAR
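
A minimal sketch of the typing point raised above, not the library's code: it assumes a hypothetical `Resampling` enum in place of `PILImageResampling` and a hypothetical string-keyed `INTERPOLATION_TO_RESAMPLING` table in place of `torch_pil_interpolation_mapping`. It only illustrates an annotation that lists every input the body actually accepts.

```python
from __future__ import annotations

from enum import IntEnum


class Resampling(IntEnum):
    """Hypothetical stand-in for PILImageResampling."""

    NEAREST = 0
    BILINEAR = 2
    BICUBIC = 3


# Hypothetical stand-in for torch_pil_interpolation_mapping, keyed by mode name.
INTERPOLATION_TO_RESAMPLING: dict[str, Resampling] = {
    "nearest": Resampling.NEAREST,
    "bilinear": Resampling.BILINEAR,
    "bicubic": Resampling.BICUBIC,
}


def normalize_resample(resample: Resampling | int | str | None = None) -> Resampling:
    """Coerce every accepted input to a Resampling member, defaulting to BILINEAR."""
    if resample is None:
        return Resampling.BILINEAR
    if isinstance(resample, int):  # also matches Resampling, which subclasses int
        # Assumption of this sketch: integers outside the enum raise ValueError.
        return Resampling(resample)
    # Foreign interpolation modes are mapped when known, else fall back to BILINEAR.
    return INTERPOLATION_TO_RESAMPLING.get(resample, Resampling.BILINEAR)
```

With a signature like this, the type hint and the `isinstance` branches describe the same set of inputs, so a caller reading only the annotation is not misled.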

Comment on lines +21 to +22
import torch
from torchvision.transforms.v2 import functional as tvF

Copilot AI Mar 28, 2026

torch/torchvision are imported unconditionally at module import time. Since these are optional dependencies and TorchvisionBackend's @requires(backends=(...)) only takes effect after the module imports, this can raise a raw ModuleNotFoundError (and break lazy imports) when torchvision isn't installed. Please revert to guarding these imports with is_torch_available() / is_torchvision_available() (or import inside methods) so missing deps surface via the standard requires_backends error path instead of crashing the import.
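
The guarded-import pattern this comment asks for can be sketched with the standard library alone. The helper name `optional_import` is hypothetical and is not the library's `is_torchvision_available()`; the sketch only shows how a module can stay importable when an optional dependency is absent.

```python
from __future__ import annotations

import importlib
import importlib.util
from types import ModuleType


def optional_import(name: str) -> ModuleType | None:
    """Return the imported top-level module, or None when it is not installed."""
    if importlib.util.find_spec(name) is None:
        return None
    return importlib.import_module(name)


# Importing this file never fails because of a missing optional package:
# the name is simply bound to None and callers can check for that.
json_module = optional_import("json")  # stdlib, always present
missing_module = optional_import("surely_not_installed_pkg_xyz")  # hypothetical name
```

A real module would pass `"torchvision"` here and raise a descriptive error only when a code path that needs it is reached.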

Comment on lines 16 to 18
import torch
from torchvision.transforms.v2 import functional as tvF

Copilot AI Mar 28, 2026

This module imports torchvision.transforms.v2.functional at import time. Because torchvision is optional, this can raise ModuleNotFoundError before @requires(backends=(...)) guards on classes are applied, breaking lazy imports and producing an unfriendly error. Please guard the torchvision import with is_torchvision_available() (and/or move it inside functions) and rely on requires_backends for the user-facing error.

Comment on lines +43 to 44
from torchvision.transforms.v2 import functional as tvF

Copilot AI Mar 28, 2026

from torchvision.transforms.v2 import functional as tvF is executed unconditionally (outside any is_torchvision_available() guard). Since torchvision is an optional dependency, this can raise at module import time and bypass the requires_backends mechanism. Please restore a guarded import (or defer the import until the first actual use) so missing torchvision results in a controlled, informative error rather than an import crash.
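
Deferring the import until first use, the alternative mentioned in this comment, might look like the following sketch. `MissingBackendError`, `load_backend`, and `resize_with_torchvision` are hypothetical names introduced for illustration; the library's own mechanism is `requires_backends`, which is not reproduced here.

```python
import importlib


class MissingBackendError(ImportError):
    """Hypothetical error type standing in for the library's backend error."""


def load_backend(module_name: str, feature: str):
    """Import `module_name` on first use; raise an informative error if absent."""
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        raise MissingBackendError(
            f"{feature} requires the optional package '{module_name}'. "
            "Install it to use this feature."
        ) from exc


def resize_with_torchvision(image, size):
    # Deferred import: nothing is imported until this function is called,
    # so importing the enclosing module works without torchvision installed.
    tvF = load_backend("torchvision.transforms.v2.functional", "resize_with_torchvision")
    return tvF.resize(image, size)
```

The trade-off is a small lookup cost on each call in exchange for an import that cannot crash at module load time.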

Comment on lines 30 to +34
from ...utils import TensorType, auto_docstring, is_torch_available
from ...utils.import_utils import requires
from .image_processing_vitmatte import VitMatteImageProcessorKwargs


if is_torch_available():
    import torch
    pass

Copilot AI Mar 28, 2026

is_torch_available is imported and then used only in an empty if is_torch_available(): pass block. This is dead code and can confuse readers about whether torch is actually required here. Please remove the empty guard and the unused import (or, if torch is needed for typing, use TYPE_CHECKING and quoted annotations instead).
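
The `TYPE_CHECKING` suggestion can be sketched as follows. The function `to_list` is a hypothetical example and does not come from the PR; it only shows that a quoted annotation lets a file mention `torch.Tensor` without importing torch at runtime.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by static type checkers; this branch never runs at runtime,
    # so torch does not have to be installed to import this module.
    import torch


def to_list(tensor: "torch.Tensor") -> list:
    """Convert a tensor-like object to a Python list without importing torch here."""
    # The quoted annotation is stored as a plain string and is not evaluated.
    return tensor.tolist()
```

Any object exposing `tolist()` works at runtime, which is why the annotation can stay purely informational.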

Comment on lines +31 to 32
from torchvision.transforms.v2 import functional as tvF

Copilot AI Mar 28, 2026

torchvision.transforms.v2.functional is imported unconditionally at module scope. Since torchvision is optional, this can raise ModuleNotFoundError during module import and bypass the library’s normal requires_backends/dummy-object behavior. Please guard this import with is_torchvision_available() (or defer it until use) so missing torchvision produces a controlled error only when the processor is actually used.

Comment on lines +47 to 48
from torchvision.transforms.v2 import functional as tvF

Copilot AI Mar 28, 2026

This file now imports torchvision.transforms.v2.functional at module import time. In environments without torchvision, that will crash the import (and can break _LazyModule access) instead of surfacing a standard requires_backends error when the processor is used. Please restore an is_torchvision_available() guard or move the import inside the code paths that need it.


return processed_images

@requires(backends=("torch",))
Collaborator

cc @LysandreJik updated to work as such

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: aria, beit, bridgetower, chameleon, chmv2, cohere2_vision, conditional_detr, convnext, deepseek_vl, deepseek_vl_hybrid, deformable_detr, depth_pro, detr

Member

@LysandreJik LysandreJik left a comment

ok awesome!


return processed_images

@requires(backends=("torch",))
Member

As seen with yu, I don't think this was initially designed for object-methods themselves, only for module-root objects

return encoded_inputs

@requires(backends=("vision", "torch"))
@requires(backends=("torch",))
Member

awesome that it adds to the list rather than replaces it
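
One way a decorator can add to previously recorded requirements instead of overwriting them is sketched below. `requires_sketch` and the `_required_backends` attribute are hypothetical and should not be read as the actual implementation of `requires` in transformers; the sketch only illustrates the "adds rather than replaces" behaviour described in this comment.

```python
def requires_sketch(*, backends=()):
    """Hypothetical decorator: record required backends on the decorated object."""

    def decorator(obj):
        existing = getattr(obj, "_required_backends", ())
        # Append new backend names, skipping ones that are already recorded,
        # so a second application extends the tuple instead of replacing it.
        merged = existing + tuple(b for b in backends if b not in existing)
        obj._required_backends = merged
        return obj

    return decorator


@requires_sketch(backends=("vision",))
@requires_sketch(backends=("torch",))
def post_process(outputs):
    """Placeholder function; it returns its input unchanged."""
    return outputs
```

Decorators apply bottom-up, so `"torch"` is recorded first and `"vision"` is appended afterwards.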

@ArthurZucker ArthurZucker merged commit 2da00a3 into huggingface:main Mar 30, 2026
28 checks passed
NielsRogge pushed a commit to NielsRogge/transformers that referenced this pull request Mar 30, 2026
…age processors (huggingface#45045)

* [Bugfix] Remove incorrect torchvision requirement from PIL backend image processors

PR huggingface#45029 added @requires(backends=("vision", "torch", "torchvision")) to 67
PIL backend image_processing_pil_*.py files. This causes PIL backend classes
to become dummy objects when torchvision is not installed, making
AutoImageProcessor unable to find any working processor.

Fix: set @requires to ("vision",) for files that only need PIL, and
("vision", "torch") for files that also use torch directly. Also fix
5 modular source files so make fix-repo preserves the correct backends.

Fixes huggingface#45042

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [Bugfix] Remove redundant @requires(backends=("vision",)) from PIL backends

Per reviewer feedback: the vision-only @requires decorator is redundant
for PIL backend classes since PilBackend base class already handles this.

- Remove @requires(backends=("vision",)) from 43 PIL backend files
- Remove unused `requires` import from 38 files (Category A)
- Keep @requires(backends=("vision", "torch")) on method-level decorators (Category B: 5 files)

* update

* remove torch when its not necessary

* remove if typechecking

* fix  import shinanigans

* marvellous that's how we protect torch :)

* beit is torchvisionbackend

* more import cleanup

* fiixup

* fix-repo

* update

* style

* fixes

* up

* more

* fix repo

* up

* update

* fix imports

* style

* fix check copies

* arf

* converter up

* fix?

* fix copies

* fix for func

* style

* ignore

* type

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Arthur <arthur.zucker@gmail.com>
SangbumChoi pushed a commit to SangbumChoi/transformers that referenced this pull request Apr 4, 2026
…age processors (huggingface#45045)


Labels

for patch Tag issues / labels that should be included in the next patch

Development

Successfully merging this pull request may close these issues.

PIL backend image processors incorrectly require torchvision in v5.4.0

6 participants