
Sunhill666

Fixes #705

This pull request adds a new CLIPScore metric for evaluating the alignment between images and their corresponding text descriptions. The implementation includes the metric logic, a Gradio-based web interface, and documentation. The most important changes are summarized below:

New Metric Implementation:

  • Added clip_score.py implementing the CLIPScore metric using the CLIP model from HuggingFace Transformers, including logic for computing the similarity between image and text pairs.
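The core of such a metric is the max-clipped cosine similarity between CLIP image and text embeddings, as defined in the CLIPScore paper. A minimal sketch of what clip_score.py might look like follows; the model name, function names, and overall structure are assumptions for illustration, not the PR's actual code:

```python
import numpy as np


def clip_score_from_features(image_emb, text_emb):
    """CLIPScore = max(100 * cosine_similarity, 0), per Hessel et al. (2021)."""
    image_emb = np.asarray(image_emb, dtype=np.float64)
    text_emb = np.asarray(text_emb, dtype=np.float64)
    cos = float(
        image_emb @ text_emb
        / (np.linalg.norm(image_emb) * np.linalg.norm(text_emb))
    )
    return max(100.0 * cos, 0.0)


def compute_clip_score(image, text, model_name="openai/clip-vit-base-patch32"):
    """Embed one image/text pair with CLIP and score their alignment.

    Heavy imports are kept local so the pure scoring function above
    stays usable without torch/transformers installed.
    """
    import torch
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained(model_name)
    processor = CLIPProcessor.from_pretrained(model_name)
    inputs = processor(text=[text], images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        img_emb = model.get_image_features(pixel_values=inputs["pixel_values"])[0]
        txt_emb = model.get_text_features(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
        )[0]
    return clip_score_from_features(img_emb.numpy(), txt_emb.numpy())
```

Splitting the similarity math from the model loading makes the arithmetic unit-testable without downloading CLIP weights.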

Web Interface:

  • Introduced a Gradio app in app.py that lets users compute CLIPScore interactively by uploading an image and entering a text description, complete with example inputs and the metric wired in.
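A hypothetical sketch of such an app.py is below. The widget labels, the injectable `make_app`/`safe_score` helpers, and the `clip_score` module name are illustrative assumptions, not the PR's actual interface:

```python
def make_app(score_fn):
    """Build a Gradio interface around a scoring callable.

    score_fn(image, text) -> float is injected so the UI can be
    constructed and tested without loading the CLIP model.
    """
    import gradio as gr  # imported lazily; only needed to build the UI

    return gr.Interface(
        fn=score_fn,
        inputs=[
            gr.Image(type="pil", label="Image"),
            gr.Textbox(label="Text description"),
        ],
        outputs=gr.Number(label="CLIPScore"),
        title="CLIPScore",
        description="Upload an image and enter a caption to compute CLIPScore.",
    )


def safe_score(score_fn, image, text):
    """Guard the UI callback: missing inputs yield no score instead of an error."""
    if image is None or not text or not text.strip():
        return None
    return score_fn(image, text)


if __name__ == "__main__":
    # Assumed hookup to the metric module added by this PR.
    from clip_score import compute_clip_score

    make_app(lambda image, text: safe_score(compute_clip_score, image, text)).launch()
```

Keeping the scoring callable injectable also makes it easy to swap in a stub when exercising the input-validation path.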

Documentation:

  • Added a comprehensive README.md describing the metric, usage instructions, input/output formats, example code, and citation information.

Dependency Management:

  • Updated requirements.txt and setup.py to include the necessary transformers library for CLIPScore functionality.


Development

Successfully merging this pull request may close these issues.

[Metric request] Add CLIP Score to evaluate