Commit e2351ee (parent 908797d)
Showing 6 changed files with 32 additions and 52 deletions.
@@ -1,22 +1,32 @@
# Project Structure (to be updated)

Out-of-distribution detection with language supervision.

- Supported datasets: 'CIFAR-10', 'CIFAR-100', 'ImageNet', 'ImageNet10', 'ImageNet100', 'ImageNet-subset', 'ImageNet-dogs', 'bird200', 'car196', 'flower102', 'food101', 'pet37'
- `eval_ood_detection.py`: Performs OOD detection. Supported scores (a sketch of two of the logit-space scores follows this list):
  - 'Maha', 'knn', 'analyze' # img encoder only; feature space
  - 'energy', 'entropy', 'odin' # img->text encoder; feature space
  - 'MIP', 'MIPT', 'MIPT-wordnet', 'fingerprint', 'MIP_topk' # img->text encoder; feature space
  - 'MSP', 'energy_logits', 'odin_logits' # img encoder only; logit space
  - 'MIPCT', 'MIPCI', 'retrival', 'nouns' # text->img encoder; feature space

- `play_with_clip.py`: ID zero-shot classification and ID fine-tuning (with the image encoder). Currently there are three options:
  - evaluate zero-shot performance of CLIP: call `zero_shot_evaluation_CLIP(image_dataset_name, test_labels, ckpt)`
  - fine-tune the CLIP image encoder and test (linear probe): call `linear_probe_evaluation_CLIP(image_dataset_name)`
  - play with SkImages: call `play_with_skimage()`

- `play_with_clip.ipynb`: contains various visualization methods for the trained CLIP model.

- `captions.ipynb`: notebook used to generate captions with the Oscar model from Microsoft. This assumes you have cloned and installed [Oscar](https://github.com/microsoft/Oscar) and [scene_graph_benchmark](https://github.com/microsoft/scene_graph_benchmark) in the directory you run the notebook from (you can change these directories in the notebook).
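
As referenced in the `eval_ood_detection.py` item above, here is a minimal sketch of two of the listed logit-space scores. It follows the standard published formulas (MSP from Hendrycks & Gimpel, 2017; energy from Liu et al., 2020) and is not taken from this repository's implementation:

```python
import torch

def msp_score(logits: torch.Tensor) -> torch.Tensor:
    # Maximum softmax probability: higher means more likely in-distribution.
    return logits.softmax(dim=-1).max(dim=-1).values

def energy_score(logits: torch.Tensor, T: float = 1.0) -> torch.Tensor:
    # Negative free energy, T * logsumexp(logits / T): higher means more
    # likely in-distribution (the sign convention of Liu et al., 2020).
    return T * torch.logsumexp(logits / T, dim=-1)
```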
# Delving into OOD Detection with Vision-Language Representations

Recognizing out-of-distribution (OOD) samples is critical for machine learning systems deployed in the open world. The vast majority of OOD detection methods are driven by a single modality (e.g., either vision or language), leaving the rich information in multi-modal representations untapped. Inspired by the recent success of vision-language pre-training, this paper enriches the landscape of OOD detection from a single-modal to a multi-modal regime. Particularly, we propose Maximum Concept Matching (MCM), a simple yet effective zero-shot OOD detection method based on aligning visual features with textual concepts. We contribute in-depth analysis and theoretical insights to understand the effectiveness of MCM. Extensive experiments demonstrate that our proposed MCM achieves superior performance on a wide variety of real-world tasks. MCM with vision-language features outperforms a common baseline with pure visual features on a hard OOD task with semantically similar classes by 56.60% (FPR95).
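
To make the idea concrete, below is a minimal sketch of MCM as described above: a softmax over temperature-scaled cosine similarities between the image feature and the ID class (concept) text features, with the maximum taken as the score. It uses OpenAI's `clip` package; the model choice, prompt template, example class list, and temperature are illustrative assumptions, not values from this repository.

```python
# Minimal MCM sketch: score = max softmax of temperature-scaled cosine
# similarities between an image feature and the ID concept text features.
# Model, prompt template, class list, and temperature are illustrative.
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/16", device=device)

id_classes = ["airplane", "automobile", "bird"]  # example ID concept set
text = clip.tokenize([f"a photo of a {c}" for c in id_classes]).to(device)

@torch.no_grad()
def mcm_score(pil_image, temperature=1.0):
    image = preprocess(pil_image).unsqueeze(0).to(device)
    img_feat = model.encode_image(image).float()
    txt_feat = model.encode_text(text).float()
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
    sims = img_feat @ txt_feat.T           # cosine similarity per concept
    probs = (sims / temperature).softmax(dim=-1)
    return probs.max().item()              # low score suggests OOD
```

A sample is flagged OOD when `mcm_score` falls below a threshold chosen on ID data (e.g., the operating point used for FPR95).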
# Links

ArXiv

# Environment Setup

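This section is still empty in this commit. Given the CLIP-based scripts above, a reasonable assumption is a PyTorch environment with OpenAI's CLIP package installed (`pip install git+https://github.com/openai/CLIP.git`); treat this as a placeholder until the authors pin exact requirements.
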
# Data Preparation

For complete information, refer to Appendix B.3 of the paper.

## In-distribution Datasets

- [`CUB-200`](http://www.vision.caltech.edu/datasets/cub_200_2011/), [`Stanford-Cars`](http://ai.stanford.edu/~jkrause/cars/car_dataset.html), [`Food-101`](https://data.vision.ee.ethz.ch/cvl/datasets_extra/food-101/), [`Oxford-Pet`](https://www.robots.ox.ac.uk/~vgg/data/pets/)
- [`ImageNet`](https://image-net.org/challenges/LSVRC/2012/index.php#), `ImageNet-10`, `ImageNet-20`

Please download ImageNet from the link above; the other datasets are downloaded automatically when the experiments run. The default dataset location is `./datasets/`, which can be changed in `settings.yaml` (a hypothetical loading sketch follows the directory tree below). The overall file structure:

```
CLIP_OOD
|-- datasets
    |-- ImageNet
    |-- ImageNet-10
        |-- classlist.csv
    |-- ImageNet-20
        |-- classlist.csv
```

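As mentioned above, the dataset root can be resolved from `settings.yaml`. The sketch below is hypothetical: the key name `dataset_root` is an assumption, since the file's contents are not shown in this commit.

```python
# Hypothetical sketch: read the dataset root from settings.yaml with PyYAML.
# The key name "dataset_root" is an assumption; check the actual settings.yaml.
import yaml

with open("settings.yaml") as f:
    cfg = yaml.safe_load(f)

dataset_root = cfg.get("dataset_root", "./datasets/")  # README default
```
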
# Experiments
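
This section is empty in this commit; presumably experiments are launched through `eval_ood_detection.py` with one of the scores listed above, but the exact flags are not documented here.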
4 files renamed without changes.