arXiv | Video | Project Page
This repository contains the implementation for the paper:
MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers by Yawar Siddiqui, Antonio Alliegro, Alexey Artemov, Tatiana Tommasi, Daniele Sirigatti, Vladislav Rosov, Angela Dai, Matthias Nießner.
Install requirements from the project root directory:
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.1.0+cu118.html
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118
pip install packaging
pip install -r requirements.txt
If errors about missing packages show up, install them manually.
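As a quick sanity check of the environment, the snippet below verifies that PyTorch sees CUDA and that torch-scatter imports and runs. This is a minimal sketch; the exact version string depends on the wheels you installed:

```python
# environment sanity check; assumes the CUDA 11.8 wheels installed above
import torch
from torch_scatter import scatter_add

print(torch.__version__)          # expected: 2.1.0+cu118
print(torch.cuda.is_available())  # should be True on a CUDA machine

# tiny scatter reduction to confirm torch-scatter works end to end
src = torch.ones(6)
index = torch.tensor([0, 0, 1, 1, 2, 2])
print(scatter_add(src, index, dim=0))  # tensor([2., 2., 2.])
```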
The overall code structure is as follows:

Folder | Description
---|---
config/ | Hydra configs
data/ | processed dataset
dataset/ | PyTorch datasets and dataloaders
docs/ | project webpage files
inference/ | scripts for running inference with trained models
model/ | PyTorch modules for the encoder, decoder, and transformer
pretrained/ | pretrained models on ShapeNet chairs and tables
runs/ | model training logs and checkpoints (in addition to wandb)
trainer/ | PyTorch Lightning module for training
util/ | misc utilities for positional encoding, visualization, logging, etc.
Download the pretrained models and the data from here. Place them in the project root such that the trained models are in the pretrained/ directory and the data is in the data/shapenet directory.
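To confirm the layout before running anything, a small check (it only tests that the two directories described above exist):

```python
# verify the expected locations of pretrained models and data
from pathlib import Path

for directory in ("pretrained", "data/shapenet"):
    print(directory, "->", "found" if Path(directory).is_dir() else "MISSING")
```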
To run inference, use the following command:
python inference/infer_meshgpt.py <ckpt_path> <sampling_mode> <num_samples>
Examples:
# for chairs
python inference/infer_meshgpt.py pretrained/transformer_ft_03001627/checkpoints/2287-0.ckpt beam 25
# for tables
python inference/infer_meshgpt.py pretrained/transformer_ft_04379243/checkpoints/1607-0.ckpt beam 25
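To inspect the generated samples afterwards, a short script like the one below can help. It is a sketch under assumptions: the output directory (runs/) and the .obj format are guesses, so check where the inference script actually writes its samples; trimesh may also need to be installed separately:

```python
# hypothetical post-processing: the glob pattern assumes samples land
# under runs/ as .obj files; adjust to the script's actual output path
import glob
import trimesh

for path in sorted(glob.glob("runs/**/*.obj", recursive=True)):
    mesh = trimesh.load(path, force="mesh")
    print(f"{path}: {len(mesh.vertices)} vertices, {len(mesh.faces)} faces")
```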
To launch training, use the following commands from the project root:
# vocabulary
python trainer/train_vocabulary.py <options>
# transformer
python trainer/train_transformer.py <options> vq_resume=<path_to_vocabulary_ckpt> ft_category=<category_id> ft_resume=<path_to_base_transformer_ckpt>
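Here `<options>` stands for Hydra-style key=value overrides of the configs in config/. For instance, a minimal hypothetical vocabulary run (the keys are ones used in the full examples below; the experiment name is made up):

```bash
python trainer/train_vocabulary.py batch_size=32 max_epoch=2000 experiment=my_vq_run
```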
Some example training runs:
# vocabulary
python trainer/train_vocabulary.py batch_size=32 shift_augment=True scale_augment=True wandb_main=True experiment=vq128 val_check_percent=1.0 val_check_interval=5 overfit=False max_epoch=2000 only_chairs=False use_smoothed_loss=True graph_conv=sage use_point_feats=False num_workers=24 n_embed=16384 num_tokens=131 embed_levels=2 num_val_samples=16 use_multimodal_loss=True weight_decay=0.1 embed_dim=192 code_decay=0.99 embed_share=True distribute_features=True
# base transformer: run over multiple GPUs (>= 8 recommended); with a larger compute budget, you can increase gradient_accumulation_steps
python trainer/train_transformer.py wandb_main=True batch_size=8 gradient_accumulation_steps=8 max_val_tokens=5000 max_epoch=2000 sanity_steps=0 val_check_interval=1 val_check_percent=1 block_size=4608 model.n_layer=24 model.n_head=16 model.n_embd=768 model.dropout=0 scale_augment=True shift_augment=True num_workers=24 experiment=bl4608-GPT2_m24-16-768-0_b8x8x8_lr1e-4 use_smoothed_loss=True num_tokens=131 vq_resume=<path_to_vocabulary_ckpt> padding=0
# fine-tune on tables (category 04379243): run over multiple GPUs (>= 8 recommended); with a larger compute budget, you can increase gradient_accumulation_steps
python trainer/train_transformer.py wandb_main=True batch_size=8 gradient_accumulation_steps=8 max_val_tokens=5000 max_epoch=2400 sanity_steps=0 val_check_interval=8 val_check_percent=1 block_size=4608 model.n_layer=24 model.n_head=16 model.n_embd=768 model.dropout=0 scale_augment=True shift_augment=True num_workers=24 experiment=bl4608-GPT2_m24-16-768-0_b8x8x8_FT04379243 use_smoothed_loss=True num_tokens=131 vq_resume=<path_to_vocabulary_ckpt> padding=0 num_val_samples=4 ft_category=04379243 ft_resume=<path_to_base_transformer_ckpt> warmup_steps=100
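For concreteness, here is one way the placeholders might be filled in. The paths are hypothetical: they assume checkpoints land under runs/&lt;experiment&gt;/checkpoints/, mirroring the layout of the pretrained checkpoints above, so substitute the paths your own runs actually produce:

```bash
# hypothetical checkpoint paths; substitute your own
python trainer/train_transformer.py <options> \
    vq_resume=runs/vq128/checkpoints/1999-0.ckpt \
    ft_category=04379243 \
    ft_resume=runs/bl4608-GPT2_m24-16-768-0_b8x8x8_lr1e-4/checkpoints/1999-0.ckpt
```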
MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers by Mohd Yawar Nihal Siddiqui is licensed under the Automotive Development Public Non-Commercial License Version 1.0; however, portions of the project are available under separate license terms: e.g., the NanoGPT code is under the MIT license.
If you wish to cite us, please use the following BibTeX entry:
@InProceedings{siddiqui_meshgpt_2024,
    title     = {MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers},
    author    = {Siddiqui, Yawar and Alliegro, Antonio and Artemov, Alexey and Tommasi, Tatiana and Sirigatti, Daniele and Rosov, Vladislav and Dai, Angela and Nie{\ss}ner, Matthias},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2024},
}