@msmiatac

What does this PR do?

  • Purpose: Add --submodel {vision,text,full} to the export CLI to allow exporting only the CLIP vision or text encoder. This change stems from the need to simplify CLIP model conversion for the DL Streamer project.
  • Motivation: DL Streamer benefits from separate CLIP submodules (vision/text) as standalone IRs. The flag removes downstream splitting complexity and leverages the existing TasksManager registry.
  • Implementation: Routes --submodel from the CLI to main_export. For Transformers CLIP under feature-extraction, main_export swaps to model.vision_model or model.text_model so TasksManager resolves the correct clip_vision_model/clip_text_model.
  • Documentation: Updates export.mdx with the new flag, usage, and examples for vision-only and text-only exports.
  • Tests: Adds CLIPSubmodelExportTest validating that the exported IRs expose the expected inputs (pixel_values for vision, input_ids for text).
  • Backward Compatibility: Default behavior remains unchanged (full export if the flag is omitted). No breaking changes.
  • Scope: Supports Transformers CLIP for the feature-extraction task only. OpenCLIP is already covered by the registry and could be wired up the same way in a follow-up to map its submodules; that work is not included in this PR.
  • Impact: Cleaner exports, reducing integration overhead for DL Streamer and other consumers requiring separate CLIP encoders.
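
The routing described under Implementation can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: `build_parser` and `select_submodel` are hypothetical helper names, and the `SimpleNamespace` object only stands in for a loaded Transformers CLIP model that exposes `vision_model` and `text_model`.

```python
import argparse
from types import SimpleNamespace

# Values accepted by the flag, as listed in the PR description.
SUBMODEL_CHOICES = ("vision", "text", "full")

# Hypothetical mapping from flag value to the attribute on the CLIP model.
SUBMODEL_ATTRS = {"vision": "vision_model", "text": "text_model"}


def build_parser():
    """Hypothetical stand-in for the export CLI; only the new flag is modelled."""
    parser = argparse.ArgumentParser(prog="export-sketch")
    parser.add_argument("--submodel", choices=SUBMODEL_CHOICES, default="full")
    return parser


def select_submodel(model, submodel="full"):
    """Return the part of `model` to export; "full" keeps the model unchanged."""
    if submodel == "full":
        return model
    if submodel not in SUBMODEL_ATTRS:
        raise ValueError(f"Unknown submodel: {submodel!r}")
    attr = SUBMODEL_ATTRS[submodel]
    if not hasattr(model, attr):
        raise ValueError(f"Model has no attribute {attr!r}")
    return getattr(model, attr)


# Stand-in object; a real CLIP model would hold encoder modules here.
clip = SimpleNamespace(vision_model="vision-encoder", text_model="text-encoder")

args = build_parser().parse_args(["--submodel", "vision"])
selected = select_submodel(clip, args.submodel)
print(selected)
```

Omitting the flag falls back to `full`, which matches the backward-compatibility note above: the model is passed through untouched.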

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@IlyasMoutawwakil
Member

Hi, I would suggest using the variant API to control which part of the model gets exported. See the MetaCLIP example in optimum-onnx: https://github.com/huggingface/optimum-onnx/blob/682988bc2fd972f4594fb66725fa7e7c6c2b628d/optimum/exporters/onnx/model_configs.py#L1280
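
As a rough illustration of the variant idea (not the optimum-onnx code at the link above), an export config can declare the variants it supports and derive its inputs from the selected one. The class names, variant names, and the input list for the full model below are assumptions made for this sketch; only `pixel_values` for vision and `input_ids` for text come from the PR description.

```python
class ExportConfigSketch:
    """Hypothetical base config: holds a validated `variant` setting."""

    VARIANTS = {"default": "Export the full model."}
    DEFAULT_VARIANT = "default"

    def __init__(self, variant=None):
        self._variant = self.DEFAULT_VARIANT
        if variant is not None:
            self.variant = variant

    @property
    def variant(self):
        return self._variant

    @variant.setter
    def variant(self, value):
        # "default" is treated as an alias for the class-level default.
        if value == "default":
            value = self.DEFAULT_VARIANT
        if value not in self.VARIANTS:
            raise ValueError(
                f"Unknown variant {value!r}; expected one of {sorted(self.VARIANTS)}"
            )
        self._variant = value


class ClipLikeConfigSketch(ExportConfigSketch):
    """Hypothetical CLIP-style config with one variant per exportable part."""

    VARIANTS = {
        "full": "Export both encoders together.",
        "vision": "Export only the vision encoder.",
        "text": "Export only the text encoder.",
    }
    DEFAULT_VARIANT = "full"

    @property
    def inputs(self):
        # Input names follow the PR description; the "full" list is illustrative.
        if self.variant == "vision":
            return ["pixel_values"]
        if self.variant == "text":
            return ["input_ids"]
        return ["input_ids", "pixel_values"]


print(ClipLikeConfigSketch("vision").inputs)
print(ClipLikeConfigSketch().variant)
```

With this shape, the choice of submodel lives in the config rather than in a dedicated CLI flag, so the export entry point only has to forward a single variant string.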
