Generative Multisensory Network

PyTorch implementation of the Generative Multisensory Network (GMN) from our paper:

Jae Hyun Lim, Pedro O. Pinheiro, Negar Rostamzadeh, Christopher Pal, Sungjin Ahn, Neural Multisensory Scene Inference (2019)

Introduction

Please check out our project website!

Getting Started

Requirements

python>=3.6
pytorch==0.4.x
tensorflow (for tensorboardX)
tensorboardX
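
A minimal environment setup might look like the sketch below; the conda workflow and the exact torch 0.4.x build are assumptions, so adapt them to your platform and CUDA version:

    # assumed versions; pick the torch 0.4.x build that matches your CUDA setup
    conda create -n gmn python=3.6
    conda activate gmn
    pip install torch==0.4.1 tensorflow tensorboardX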

Dataset

Data from MESE (the Multisensory Embodied 3D-Scene Environment introduced in the paper).
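
Once downloaded, the files are expected to live under the data folder described in the structure below; the exact layout is an assumption, so check the dataloader definitions in datasets for the paths they actually read:

    # assumed layout; see datasets/ for the paths the dataloaders expect
    mkdir -p data
    mv <downloaded-mese-files> data/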

Structure

  • data: data folder
  • datasets: dataloader definitions
  • models: model definitions
  • utils: miscellaneous functions
  • cache: temporary files
  • eval: Python code for evaluation and visualization
  • scripts: scripts for experiments
    ├── eval: evaluation/visualization scripts
    ├── train: training scripts
    └── train_missing_modalities: scripts for training with missing modalities
        ├── m5
        ├── m8
        └── m14
  • main_multimodal.py: main entry point for training models

Experiments

Train

  • For example, you can train an APoE model on vision and haptic data (number of modalities = 2) as follows:
    python main_multimodal.py \
        --dataset haptix-shepard_metzler_5_parts \
        --model conv-apoe-multimodal-cgqn-v4 \
        --train-batch-size 12 --eval-batch-size 4 \
        --lr 0.0001 \
        --clip 0.25 \
        --add-opposite \
        --epochs 10 \
        --log-interval 100 \
        --exp-num 1 \
        --cache experiments/haptix-m2
    For more information, please see the example scripts in the scripts/train folder.
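
Because logging goes through tensorboardX, training curves can presumably be monitored with TensorBoard; that the event files are written under the --cache directory is an assumption:

    # assumes event files land under the experiment cache directory
    tensorboard --logdir experiments/haptix-m2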

Classification (using a learned model)

  • An example script to run classification with a learned model on held-out data. For the additional Shepard-Metzler objects with 4 or 6 parts, it performs 10-way classification:
    python eval/clsinf_multimodal_m2.py \
        --dataset haptix-shepard_metzler_46_parts \
        --model conv-apoe-multimodal-cgqn-v4 \
        --train-batch-size 10 --eval-batch-size 10 \
        --vis-interval 1 \
        --num-z-samples 50 \
        --mod-step 1 \
        --mask-step 1 \
        --cache clsinf.m2.s50/rgb/46_parts \
        --path <path-to-your-model>
    For more information, please see the example scripts in the scripts/eval folder.

Train with missing modalities

  • If you would like to train an APoE model with missing modalities (here, the 14-modality setting), run the following script:
    python main_multimodal.py \
        --dataset haptix-shepard_metzler_5_parts-48-ul-lr-rgb-half-intrapol1114 \
        --model conv-apoe-multimodal-cgqn-v4 \
        --train-batch-size 9 --eval-batch-size 4 \
        --lr 0.0001 \
        --clip 0.25 \
        --add-opposite \
        --epochs 10 \
        --log-interval 100 \
        --cache experiments/haptix-m14-intrapol1114
    For more information, please see the example scripts in the scripts/train_missing_modalities folder.
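
The m5, m8, and m14 subfolders hold ready-made scripts for each missing-modality setting; the command below uses a placeholder name, so list the folder to see the actual scripts:

    # <script>.sh is a placeholder; check the folder for the real script names
    ls scripts/train_missing_modalities/m14/
    bash scripts/train_missing_modalities/m14/<script>.sh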

Contact

For questions and comments, feel free to contact Jae Hyun Lim.

License

MIT License

Reference

@article{jaehyun2019gmn,
  title     = {Neural Multisensory Scene Inference},
  author    = {Jae Hyun Lim and
               Pedro O. Pinheiro and
               Negar Rostamzadeh and
               Christopher J. Pal and
               Sungjin Ahn},
  journal   = {arXiv preprint arXiv:1910.02344},
  year      = {2019},
}
