RveRNet

Official implementation of RveRNet

The structure of the proposed RveRNet

We used the robust SAM foundation model to segment the ROI of input images. Then, we processed the images to produce complementary cut-out pairs that were used as inputs for both the ROI and extra-ROI modules. The ROI and extra-ROI modules can have different architectures that encode different inductive biases.

We trained and evaluated RveRNet on the preprocessed FoodSeg103 dataset. To quantify the advantage of the distinct inductive biases of the model's architectures, we deliberately avoided a dataset that was too large for fine-tuning. Preprocessing produced complementary cut-out images: the images with the ROIs masked out were fed to the extra-ROI module, while the ROI images were fed to the ROI module.
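The complementary cut-out idea described above can be sketched as follows. This is a minimal illustration, not the repository's preprocessing code: the function name `make_cutout_pair`, the boolean mask format, and filling removed pixels with a constant value (`fill`, black by default) are all assumptions made for the example.

```python
import numpy as np

def make_cutout_pair(image, mask, fill=0):
    """Split one image into complementary ROI / extra-ROI cut-outs.

    image: (H, W, C) array; mask: (H, W) array, truthy inside the ROI
    (for example, a segmentation mask). `fill` is a hypothetical value
    used for the pixels that are cut out of each image.
    """
    mask3 = np.asarray(mask, dtype=bool)[..., None]  # broadcast over channels
    # ROI image keeps pixels inside the mask and blanks the rest.
    roi = np.where(mask3, image, fill).astype(image.dtype)
    # Extra-ROI image keeps pixels outside the mask and blanks the ROI.
    extra_roi = np.where(mask3, fill, image).astype(image.dtype)
    return roi, extra_roi
```

With `fill=0`, every pixel of the original appears in exactly one of the two outputs, so adding the pair back together recovers the input image.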

In addition, to determine how much the extra-ROI module in RveRNet improved the classification of ambiguous foods, we added images of ketchup and chili paste, either photographed or collected from the internet, to the dataset. These comprised 69 ketchup and 72 chili paste training images and 38 ketchup and 34 chili paste test images. In total, 18,320 training images and 7,769 test images across 105 categories were used in this study.

Unless otherwise specified, the training images were $224\times 224$ pixels. For each RveRNet module, we used off-the-shelf MobileNetV2, ViT, DeiT, DeiT-distill, and MLP-Mixer models pre-trained on ImageNet-1K. We optimized the model's parameters using the Adam optimizer with a linear warm-up and a cosine learning rate decay scheduler. We applied a random horizontal flip to the training set with 50% probability. Training batches contained 50 images, and the initial learning rate was $4\times 10^{-3}$. Models were fine-tuned for 30 epochs on one NVIDIA A100 GPU in Google Colaboratory Pro, and top-1 accuracy and F1 scores were calculated on the test set.
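As a rough illustration of a linear warm-up followed by cosine decay, the learning rate at a given step could be computed as below. This is a sketch under assumptions, not the schedule implemented in the training scripts: the function name `lr_at_step`, the warm-up length, the floor `min_lr`, and treating $4\times 10^{-3}$ as the rate reached at the end of warm-up are all illustrative choices.

```python
import math

def lr_at_step(step, total_steps, warmup_steps, base_lr=4e-3, min_lr=0.0):
    """Learning rate with linear warm-up, then cosine decay.

    warmup_steps and min_lr are hypothetical parameters; base_lr is
    assumed to be the rate reached once warm-up finishes.
    """
    if step < warmup_steps:
        # Linear ramp from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine from base_lr down to min_lr.
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

A function like this can be evaluated once per optimizer step and written into the optimizer's learning rate, or wrapped in whatever scheduler interface the training framework provides.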

Here is the folder structure of the FoodSeg103 dataset:

FoodSeg103/
   |--Images/
   |    |--ann_dir/
   |    |__img_dir/
   |
   |--ImageSets/
   |    |--test.txt
   |    |__train.txt
   |
   |--category_id.txt
   |--Readme.txt
   |--test_recipe1m_id.txt
   |__train_test_recipe1m_id.txt

After preprocessing, your data directory for RveRNet should look like this:

dataset_root/
    |
    |--train/
    |    |--roi/
    |    |    |--category1/
    |    |    |    |--image_name1.jpg
    |    |    |    |--image_name2.jpg
    |    |    |    |--image_name3.jpg
    |    |    |    |__...
    |    |    |    
    |    |    |--category2/
    |    |    |--category3/
    |    |    |__...
    |    |
    |    |__extra-roi/
    |         |--category1/
    |         |    |--image_name1.jpg
    |         |    |--image_name2.jpg
    |         |    |--image_name3.jpg
    |         |    |__...
    |         |    
    |         |--category2/
    |         |--category3/
    |         |__...
    | 
    |
    |__test/
         |--roi/
         |    |--category1/          
         |    |--category2/
         |    |--category3/
         |    |__...
         |
         |__extra-roi/  
              |--category1/ 
              |--category2/
              |--category3/
              |__...
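To check that a preprocessed directory follows the layout above, the ROI and extra-ROI images can be paired up by category and file name. The helper below is a hypothetical sketch, not part of the repository: the name `list_paired_samples` is made up, and it assumes that each ROI image has an extra-ROI counterpart with the same file name and a `.jpg` extension, as the tree suggests.

```python
from pathlib import Path

def list_paired_samples(dataset_root, split="train"):
    """Return (roi_path, extra_roi_path, category) triples for one split.

    Assumes matching file names under <split>/roi/<category>/ and
    <split>/extra-roi/<category>/ (an assumption based on the tree above).
    """
    roi_dir = Path(dataset_root) / split / "roi"
    extra_dir = Path(dataset_root) / split / "extra-roi"
    samples = []
    for category_dir in sorted(p for p in roi_dir.iterdir() if p.is_dir()):
        for roi_path in sorted(category_dir.glob("*.jpg")):
            extra_path = extra_dir / category_dir.name / roi_path.name
            if not extra_path.is_file():
                # Fail loudly so an incomplete preprocessing run is noticed.
                raise FileNotFoundError(f"missing extra-ROI image for {roi_path}")
            samples.append((roi_path, extra_path, category_dir.name))
    return samples
```

Calling `list_paired_samples("dataset_root", "test")` would then list the test pairs, raising an error if any ROI image lacks its complement.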

To train an off-the-shelf model, run:

python3 train_Off-the-shelf.py --config=./Off_the_shelfs/train_cfgs/train_config.yaml

To train RveRNet, run:

python3 train_RveRNet.py --config=./RveRNets/train_cfgs/train_config_FoodSeg103.yaml

For inference of RveRNet, run:

python3 inference.py --config=./RveRNets/test_cfgs/test_config.yaml

In the configuration YAML file, you can choose the model for inference using ckpt_path. If you want batch inference for your test dataset, set batch_inference to True.

If you use this code in a paper, please cite:

@misc{jin2024knowledgedistillationeffectivelyattain,
      title={Knowledge distillation to effectively attain both region-of-interest and global semantics from an image where multiple objects appear}, 
      author={Seonwhee Jin},
      year={2024},
      eprint={2407.08257},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2407.08257}, 
}
