Official Implementation of "Low-latency Space-time Supersampling for Real-time Rendering" (AAAI 2024).
We use Torch-TensorRT 1.1.0, PyTorch 1.11, CUDA 11.4, cuDNN 8.2 and TensorRT 8.2.5.1.
Please download the corresponding version of CUDA, cuDNN, and TensorRT. Then set the environment variables as follows:
export TRT_RELEASE=~/project/TensorRT-8.2.5.1
export PATH="/usr/local/cuda-11.4/bin:$PATH"
export CUDA_HOME="/usr/local/cuda-11.4"
export LD_LIBRARY_PATH="$TRT_RELEASE/lib:/usr/local/cuda-11.4/lib64:$LD_LIBRARY_PATH"
Next, create a conda environment and install PyTorch and Torch-TensorRT:
conda create -n tensorrt python=3.7 pytorch=1.11 torchvision torchaudio cudatoolkit=11.3 -c pytorch -y
conda activate tensorrt
pip3 install $TRT_RELEASE/python/tensorrt-8.2.5.1-cp37-none-linux_x86_64.whl
pip3 install torch-tensorrt==1.1.0 -f https://github.com/pytorch/TensorRT/releases/download/v1.1.0/torch_tensorrt-1.1.0-cp37-cp37m-linux_x86_64.whl
pip3 install opencv-python tqdm thop matplotlib scikit-image lpips visdom numpy pytorch_msssim
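After installation, a quick sanity check like the following (just a suggestion, not part of the repo) confirms that the versions line up and that CUDA is visible:
import tensorrt
import torch
import torch_tensorrt

print(torch.__version__, torch_tensorrt.__version__, tensorrt.__version__)
print(torch.cuda.is_available())   # should print True on a correctly configured GPU machine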
We release the dataset used in our paper on ModelScope. The dataset contains four scenes, Lewis, SunTemple, Subway, and Arena, each with around 6000 frames for training and 1000 for testing. Each frame is stored as a compressed NumPy array of 16-bit floats.
You can install ModelScope by running:
pip install modelscope
Then you can download the dataset by running the following code in Python:
from modelscope.msdatasets import MsDataset
ds = MsDataset.load('ryanhe312/STSSNet-AAAI2024', subset_name='Lewis', split='test')
# ds = MsDataset.load('ryanhe312/STSSNet-AAAI2024', subset_name='SunTemple', split='test')
# ds = MsDataset.load('ryanhe312/STSSNet-AAAI2024', subset_name='Subway', split='test')
# ds = MsDataset.load('ryanhe312/STSSNet-AAAI2024', subset_name='Arena', split='test')
Note that each test scene is around 40 GB, so the download may take a while.
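After the download finishes, you can inspect a frame with NumPy. The file path and archive layout below are placeholders; check the actual names of the extracted files:
import numpy as np

frame = np.load('path/to/Lewis/test/0001.npz')       # hypothetical file name
for key in frame.files:
    print(key, frame[key].shape, frame[key].dtype)   # arrays are stored as float16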
Please modify the dataset path in dataloaders.py to your own path before the next step.
You can modify the dataset and mode variables in eval.py to evaluate different scenes and modes. The all mode evaluates all pixels, the edge mode evaluates pixels on the Canny edges of the HR frame, and the hole mode evaluates pixels inside the warping holes of the LR frame.
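For intuition, a Canny-based edge mask can be built with OpenCV roughly as follows; the thresholds and dilation here are illustrative assumptions, not the exact settings used in eval.py:
import cv2
import numpy as np

def edge_mask(hr_frame):
    # hr_frame: HxWx3 float image in [0, 1]; returns a boolean HxW mask.
    gray = cv2.cvtColor((hr_frame * 255).astype(np.uint8), cv2.COLOR_RGB2GRAY)
    edges = cv2.Canny(gray, 100, 200)                     # illustrative thresholds
    edges = cv2.dilate(edges, np.ones((3, 3), np.uint8))  # widen the edge band slightly
    return edges > 0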
Run the following command to evaluate the model for PSNR, SSIM and LPIPS:
python eval.py
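These metrics come from the packages installed earlier. A rough sketch of how they can be computed on a pair of frames is shown below; eval.py's exact preprocessing and masking may differ:
import torch
import lpips
from pytorch_msssim import ssim

lpips_fn = lpips.LPIPS(net='alex')

def metrics(pred, gt):
    # pred, gt: 1x3xHxW float tensors in [0, 1].
    mse = torch.mean((pred - gt) ** 2)
    psnr = 10 * torch.log10(1.0 / mse)
    ssim_val = ssim(pred, gt, data_range=1.0)
    lpips_val = lpips_fn(pred * 2 - 1, gt * 2 - 1)   # LPIPS expects inputs in [-1, 1]
    return psnr.item(), ssim_val.item(), lpips_val.item()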
To evaluate VMAF, you need to:
- Set save_img to True in eval.py and run it.
- Run utils/video.py to generate gt.avi and pred.avi.
- Install ffmpeg and add its path to the "PATH" environment variable.
- Follow the VMAF instructions to compute the VMAF metric between gt.avi and pred.avi with ffmpeg (an example command is shown after this list).
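For reference, if your ffmpeg build includes libvmaf (a build-time option, so this is an assumption about your setup), a command of the following form computes VMAF with pred.avi as the distorted input and gt.avi as the reference:
ffmpeg -i pred.avi -i gt.avi -lavfi libvmaf -f null -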
You can measure the model size, FLOPs, and inference speed of our model by running:
python benchmark.py
You should get the following results:
Computational complexity: 31.502G
Number of parameters: 417.241K
Time: 4.350 ms
Inference speed is tested on a single RTX 3090 GPU and may vary on different machines.
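If you want to time the model on your own inputs, a generic CUDA-event timing sketch looks like the following (benchmark.py's exact warm-up and iteration counts may differ):
import torch

@torch.no_grad()
def measure_latency(model, example_inputs, iters=100, warmup=20):
    # Warm up so lazy CUDA initialization does not skew the measurement.
    for _ in range(warmup):
        model(*example_inputs)
    torch.cuda.synchronize()
    starter = torch.cuda.Event(enable_timing=True)
    ender = torch.cuda.Event(enable_timing=True)
    starter.record()
    for _ in range(iters):
        model(*example_inputs)
    ender.record()
    torch.cuda.synchronize()
    return starter.elapsed_time(ender) / iters   # average milliseconds per forward pass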
You can download the training dataset by running the following code in Python:
from modelscope.msdatasets import MsDataset
ds = MsDataset.load('ryanhe312/STSSNet-AAAI2024', subset_name='Lewis', split='train')
ds = MsDataset.load('ryanhe312/STSSNet-AAAI2024', subset_name='Lewis', split='validation')
It will download two sequences, train1 and train2. You can modify the subset_name for different scenes (one of 'Lewis', 'SunTemple', and 'Subway'). Each sequence is around 150 GB, so the download may take a while.
Please modify the dataset path in dataloaders.py to your own path, and run train.py to train on different scenes.
Visdom is used for visualization. You can run python -m visdom.server to start a Visdom server, and then open http://localhost:8097/ in your browser to see the training process.
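As a minimal example of the kind of logging Visdom supports (the window name and dummy loss below are arbitrary, not necessarily what train.py uses):
import numpy as np
import visdom

vis = visdom.Visdom(server='http://localhost', port=8097)
for step in range(100):
    loss = 1.0 / (step + 1)    # dummy value for illustration
    vis.line(Y=np.array([loss]), X=np.array([step]), win='train_loss',
             update='append' if step > 0 else None,
             opts=dict(title='training loss'))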
We thank the authors of ExtraNet for their great work and data generation pipeline.
If you find our work useful in your research, please consider citing:
@misc{he2023lowlatency,
title={Low-latency Space-time Supersampling for Real-time Rendering},
author={Ruian He and Shili Zhou and Yuqi Sun and Ri Cheng and Weimin Tan and Bo Yan},
year={2023},
eprint={2312.10890},
archivePrefix={arXiv},
primaryClass={cs.CV}
}