
MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers


arXiv | Video | Project Page

This repository contains the implementation for the paper:

MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers by Yawar Siddiqui, Antonio Alliegro, Alexey Artemov, Tatiana Tommasi, Daniele Sirigatti, Vladislav Rosov, Angela Dai, Matthias Nießner.

MeshGPT creates triangle meshes by autoregressively sampling from a transformer model that has been trained to produce tokens from a learned geometric vocabulary. These tokens can then be decoded into the faces of a triangle mesh. Our method generates clean, coherent, and compact meshes, characterized by sharp edges and high fidelity.
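
As a rough illustration of what this loop looks like, here is a minimal sketch in PyTorch-style code. It is not the repository's API: the transformer, vocab_decoder, start/stop token ids, and greedy sampling below are placeholders (the actual inference script samples with beam search, see the inference examples below).

import torch

def generate_mesh(transformer, vocab_decoder, start_id, stop_id, max_tokens=4608):
    # Autoregressively sample token ids from the learned geometric vocabulary.
    tokens = torch.tensor([[start_id]])
    for _ in range(max_tokens):
        logits = transformer(tokens)               # (1, seq_len, vocab_size)
        next_token = logits[:, -1].argmax(dim=-1)  # greedy pick here; the repo uses beam sampling
        tokens = torch.cat([tokens, next_token[:, None]], dim=1)
        if next_token.item() == stop_id:
            break
    return vocab_decoder(tokens)                   # decode the token sequence into triangle faces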

Dependencies

Install requirements from the project root directory:

pip install torch-scatter -f https://data.pyg.org/whl/torch-2.1.0+cu118.html
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118
pip install packaging
pip install -r requirements.txt

If errors about missing packages show up, install them manually.
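
If the installation succeeded, the CUDA builds of torch and torch-scatter should load together; a quick check:

python -c "import torch, torch_scatter; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"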

Structure

Overall code structure is as follows:

Folder       Description
config/      Hydra configs
data/        processed dataset
dataset/     PyTorch datasets and dataloaders
docs/        project webpage files
inference/   scripts for running inference with trained models
model/       PyTorch modules for the encoder, decoder, and the transformer
pretrained/  pretrained models for ShapeNet chairs and tables
runs/        model training logs and checkpoints go here, in addition to wandb
trainer/     PyTorch Lightning module for training
util/        misc utilities for positional encoding, visualization, logging, etc.

Pre-trained Models and Data

Download the pretrained models and the data from here. Place them in the project root so that the trained models end up in the pretrained/ directory and the data in the data/shapenet directory.
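
After extraction, the layout should look roughly like this (the checkpoint names are the ones used in the inference examples below; the contents of data/shapenet are not listed here):

MeshGPT/
├── data/
│   └── shapenet/
└── pretrained/
    ├── transformer_ft_03001627/checkpoints/2287-0.ckpt   # chairs
    └── transformer_ft_04379243/checkpoints/1607-0.ckpt   # tables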

Running Inference

To run inference, use the following command:

python inference/infer_meshgpt.py <ckpt_path> <sampling_mode> <num_samples>

Examples:

# for chairs
python inference/infer_meshgpt.py pretrained/transformer_ft_03001627/checkpoints/2287-0.ckpt beam 25

# for tables
python inference/infer_meshgpt.py pretrained/transformer_ft_04379243/checkpoints/1607-0.ckpt beam 25
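
The generated meshes can be inspected with any standard mesh library, for example trimesh (pip install trimesh); the output location below is only an example, adjust the pattern to wherever your run writes its samples:

import glob
import trimesh

# Load every generated mesh and print basic statistics (the path pattern is an example).
for path in sorted(glob.glob("runs/**/*.obj", recursive=True)):
    mesh = trimesh.load(path, force="mesh")
    print(path, len(mesh.vertices), "vertices,", len(mesh.faces), "faces")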

Training

To launch training, use the following commands from the project root:

# vocabulary
python trainer/train_vocabulary.py <options> vq_resume=<path_to_vocabulary_ckpt>

# transformer
python trainer/train_transformer.py <options> vq_resume=<path_to_vocabulary_ckpt> ft_category=<category_id> ft_resume=<path_to_base_transformer_ckpt>

Some example training runs:

Vocabulary training

python trainer/train_vocabulary.py batch_size=32 shift_augment=True scale_augment=True wandb_main=True experiment=vq128 val_check_percent=1.0 val_check_interval=5 overfit=False max_epoch=2000 only_chairs=False use_smoothed_loss=True graph_conv=sage use_point_feats=False num_workers=24 n_embed=16384 num_tokens=131 embed_levels=2 num_val_samples=16 use_multimodal_loss=True weight_decay=0.1 embed_dim=192 code_decay=0.99 embed_share=True distribute_features=True
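
To unpack the key vocabulary options: n_embed is the codebook size, embed_dim the code dimensionality, and embed_levels the number of residual quantization stages. The sketch below is a generic illustration of residual vector quantization with these values, not the module implemented in model/; the encoder features and codebook here are random placeholders.

import torch

n_embed, embed_dim, embed_levels = 16384, 192, 2
codebook = torch.randn(n_embed, embed_dim)            # a single shared codebook (cf. embed_share=True)

def quantize(features):
    # Map each feature vector to embed_levels codebook indices (the "tokens").
    residual, tokens = features, []
    for _ in range(embed_levels):
        idx = torch.cdist(residual, codebook).argmin(dim=-1)  # nearest code per feature
        tokens.append(idx)
        residual = residual - codebook[idx]                   # quantize the remainder at the next level
    return torch.stack(tokens, dim=-1)

tokens = quantize(torch.randn(10, embed_dim))          # 10 face features -> token ids of shape (10, 2)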

Base transformer training

# Run over multiple GPUs (>= 8 recommended); with a larger compute budget, you can use a higher gradient_accumulation_steps

python trainer/train_transformer.py wandb_main=True batch_size=8 gradient_accumulation_steps=8 max_val_tokens=5000 max_epoch=2000 sanity_steps=0 val_check_interval=1 val_check_percent=1 block_size=4608 model.n_layer=24 model.n_head=16 model.n_embd=768 model.dropout=0 scale_augment=True shift_augment=True num_workers=24 experiment=bl4608-GPT2_m24-16-768-0_b8x8x8_lr1e-4 use_smoothed_loss=True num_tokens=131 vq_resume=<path_to_vocabulary_ckpt> padding=0
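
For reference, the effective batch size is batch_size × gradient_accumulation_steps × number of GPUs; with the recommended 8 GPUs this run uses 8 × 8 × 8 = 512 sequences per optimizer step, which appears to be what the b8x8x8 tag in the experiment name encodes.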

Transformer finetuning

# Run over multiple GPUs (>= 8 recommended); with a larger compute budget, you can use a higher gradient_accumulation_steps

python trainer/train_transformer.py wandb_main=True batch_size=8 gradient_accumulation_steps=8 max_val_tokens=5000 max_epoch=2400 sanity_steps=0 val_check_interval=8 val_check_percent=1 block_size=4608 model.n_layer=24 model.n_head=16 model.n_embd=768 model.dropout=0 scale_augment=True shift_augment=True num_workers=24 experiment=bl4608-GPT2_m24-16-768-0_b8x8x8_FT04379243 use_smoothed_loss=True num_tokens=131 vq_resume=<path_to_vocabulary_ckpt> padding=0 num_val_samples=4 ft_category=04379243 ft_resume=<path_to_base_transformer_ckpt> warmup_steps=100
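
The ft_category value is a ShapeNet synset ID: 04379243 is the table category and 03001627 the chair category, matching the pretrained checkpoint directories above. A chair finetuning run would use the same options with the category swapped; the experiment name below is only a suggested label:

python trainer/train_transformer.py <same options as above> experiment=bl4608-GPT2_m24-16-768-0_b8x8x8_FT03001627 num_val_samples=4 ft_category=03001627 ft_resume=<path_to_base_transformer_ckpt> warmup_steps=100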

License

MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers by Mohd Yawar Nihal Siddiqui is licensed under the Automotive Development Public Non-Commercial License Version 1.0. However, portions of the project are available under separate license terms: for example, the NanoGPT code is under the MIT license.

Citation

If you wish to cite us, please use the following BibTeX entry:

@InProceedings{siddiqui_meshgpt_2024,
    title     = {MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers},
    author    = {Siddiqui, Yawar and Alliegro, Antonio and Artemov, Alexey and Tommasi, Tatiana and Sirigatti, Daniele and Rosov, Vladislav and Dai, Angela and Nie{\ss}ner, Matthias},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2024},
}
