Yunxiang Zhang1*, Bingxuan Li1*, Alexandr Kuznetsov3†, Akshay Jindal2, Stavros Diolatzis2, Kenneth Chen1, Anton Sochenov2, Anton Kaplanyan2, Qi Sun1
* Equal contribution † Work done while at Intel
Neural image representations have emerged as a promising approach for encoding and rendering visual data. Combined with learning-based workflows, they demonstrate impressive trade-offs between visual fidelity and memory footprint. Existing methods in this domain, however, often rely on fixed data structures that suboptimally allocate memory or compute-intensive implicit models, hindering their practicality for real-time graphics applications.
Inspired by recent advancements in radiance field rendering, we introduce Image-GS, a content-adaptive image representation based on 2D Gaussians. Leveraging a custom differentiable renderer, Image-GS reconstructs images by adaptively allocating and progressively optimizing a group of anisotropic, colored 2D Gaussians. It achieves a favorable balance between visual fidelity and memory efficiency across a variety of stylized images frequently seen in graphics workflows, especially for those showing non-uniformly distributed features and in low-bitrate regimes. Moreover, it supports hardware-friendly rapid random access for real-time usage, requiring only 0.3K MACs to decode a pixel. Through error-guided progressive optimization, Image-GS naturally constructs a smooth level-of-detail hierarchy. We demonstrate its versatility with several applications, including texture compression, semantics-aware compression, and joint image compression and restoration.
Figure 1: Image-GS reconstructs an image by adaptively allocating and progressively optimizing a set of colored 2D Gaussians. It achieves favorable rate-distortion trade-offs, hardware-friendly random access, and flexible quality control through a smooth level-of-detail stack. (a) visualizes the optimized spatial distribution of Gaussians (20% randomly sampled for clarity). (b) Image-GS’s explicit content-adaptive design effectively captures non-uniformly distributed image features and better preserves fine details under constrained memory budgets. In the inset error maps, brighter colors indicate larger errors.
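The idea of reconstructing a pixel from anisotropic, colored 2D Gaussians can be illustrated with a small pure-Python sketch. It is not the project's differentiable renderer: the dictionary keys (`mean`, `scale`, `theta`, `color`), the `top_k=10` default, and the normalized blending rule are assumptions made for illustration only.

```python
import math


def gaussian_weight(px, py, mean, scale, theta):
    """Unnormalized weight of an anisotropic 2D Gaussian at pixel (px, py)."""
    dx, dy = px - mean[0], py - mean[1]
    # Rotate the offset into the Gaussian's local (axis-aligned) frame.
    c, s = math.cos(theta), math.sin(theta)
    u = c * dx + s * dy
    v = -s * dx + c * dy
    return math.exp(-0.5 * ((u / scale[0]) ** 2 + (v / scale[1]) ** 2))


def render_pixel(px, py, gaussians, top_k=10):
    """Blend the colors of the top_k highest-weight Gaussians at one pixel.

    Assumption: weights are normalized to sum to one over the selected
    Gaussians; the actual renderer may differ.
    """
    weighted = [
        (gaussian_weight(px, py, g["mean"], g["scale"], g["theta"]), g["color"])
        for g in gaussians
    ]
    weighted.sort(key=lambda wc: wc[0], reverse=True)
    top = weighted[:top_k]
    total = sum(w for w, _ in top)
    if total == 0.0:
        return (0.0, 0.0, 0.0)
    return tuple(sum(w * col[i] for w, col in top) / total for i in range(3))
```

Because each pixel depends only on a small, fixed number of Gaussians in this sketch, it can be decoded independently of its neighbors, which is the property that makes random access cheap.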
- Create a dedicated Python environment and install the dependencies

      git clone https://github.com/NYU-ICL/image-gs.git
      cd image-gs
      conda env create -f environment.yml
      conda activate image-gs
      pip install git+https://github.com/rahul-goel/fused-ssim/ --no-build-isolation
      cd gsplat
      pip install -e ".[dev]"
      cd ..

- Download the image and texture datasets from OneDrive and organize the folder structure as follows

      image-gs
      └── media
          ├── images
          └── textures

- (Optional) To run saliency-guided Gaussian position initialization, download the pre-trained EML-Net models (res_imagenet.pth, res_places.pth, res_decoder.pth) and place them under the `models/emlnet/` folder

      image-gs
      └── models
          └── emlnet
              ├── res_decoder.pth
              ├── res_imagenet.pth
              └── res_places.pth

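Before starting an optimization run, it can help to confirm that the folder layout described above is in place. The following is a minimal sketch, not part of the repository: the helper name `missing_paths` is hypothetical, and only the directory and file names come from the setup steps.

```python
from pathlib import Path

# Directories required by the dataset step.
EXPECTED_DIRS = ["media/images", "media/textures"]

# Files needed only for saliency-guided initialization (optional step).
OPTIONAL_FILES = [
    "models/emlnet/res_decoder.pth",
    "models/emlnet/res_imagenet.pth",
    "models/emlnet/res_places.pth",
]


def missing_paths(repo_root):
    """Return (missing_dirs, missing_optional_files) relative to repo_root."""
    root = Path(repo_root)
    missing_dirs = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing_files = [f for f in OPTIONAL_FILES if not (root / f).is_file()]
    return missing_dirs, missing_files
```

Calling `missing_paths(".")` from the repository root returns two empty lists when both the datasets and the EML-Net weights are present.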
- Optimize an Image-GS representation for an input image `anime-1_2k.png` using `10000` Gaussians with half-precision parameters

      python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize

- Render the corresponding optimized Image-GS representation at a new resolution with height `4000` (aspect ratio is maintained)

      python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --eval --render_height=4000

- Optimize an Image-GS representation for an input texture stack `alarm-clock_2k` using `30000` Gaussians with half-precision parameters

      python main.py --input_path="textures/alarm-clock_2k" --exp_name="test/alarm-clock_2k" --num_gaussians=30000 --quantize

- Render the corresponding optimized Image-GS representation at a new resolution with height `3000` (aspect ratio is maintained)

      python main.py --input_path="textures/alarm-clock_2k" --exp_name="test/alarm-clock_2k" --num_gaussians=30000 --quantize --eval --render_height=3000

- Optimize an Image-GS representation for an input image `anime-1_2k.png` using `10000` Gaussians with 12-bit-precision parameters

      python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --pos_bits=12 --scale_bits 12 --rot_bits 12 --feat_bits 12

- Optimize an Image-GS representation for an input image `anime-1_2k.png` using `10000` Gaussians with half-precision parameters and saliency-guided initialization

      python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --init_mode="saliency"

Please refer to `cfgs/default.yaml` for the full list of arguments and their default values.

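When running several of the commands above in sequence, the argument list can be assembled programmatically. The sketch below is an illustration only: `build_command` is a hypothetical helper, not part of the repository, and it uses only the flags that appear in the examples above.

```python
import shlex


def build_command(input_path, exp_name, num_gaussians, quantize=True,
                  bits=None, init_mode=None, render_height=None):
    """Assemble a main.py argument list from the flags shown in the examples."""
    args = [
        "python", "main.py",
        f"--input_path={input_path}",
        f"--exp_name={exp_name}",
        f"--num_gaussians={num_gaussians}",
    ]
    if quantize:
        args.append("--quantize")
    if bits is not None:
        # Apply the same bit precision to every parameter group.
        for name in ("pos_bits", "scale_bits", "rot_bits", "feat_bits"):
            args.append(f"--{name}={bits}")
    if init_mode is not None:
        args.append(f"--init_mode={init_mode}")
    if render_height is not None:
        # Rendering an optimized representation requires --eval.
        args += ["--eval", f"--render_height={render_height}"]
    return args


if __name__ == "__main__":
    cmd = build_command("images/anime-1_2k.png", "test/anime-1_2k", 10000,
                        render_height=4000)
    print(shlex.join(cmd))
```

Returning a list rather than a single string lets the result be passed directly to `subprocess.run` without shell quoting concerns.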
Post-optimization rendering
- `--eval`: render the optimized Image-GS representation.
- `--render_height`: image height for rendering (aspect ratio is maintained).
Bit precision control: 32 bits (float32) per dimension by default
- `--quantize`: enable bit precision control of Gaussian parameters.
- `--pos_bits`: bit precision of individual coordinate dimension.
- `--scale_bits`: bit precision of individual scale dimension.
- `--rot_bits`: bit precision of Gaussian orientation angle.
- `--feat_bits`: bit precision of individual feature dimension.
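The bit precision options, together with `--num_gaussians`, determine the raw size of a representation. The following back-of-the-envelope sketch assumes each Gaussian stores two position coordinates, two scale values, one orientation angle, and `feat_dim` feature values (3 for an RGB image). That parameter layout and the function name `estimated_size_bytes` are assumptions for illustration, and the estimate ignores any file header or container overhead.

```python
def estimated_size_bytes(num_gaussians, pos_bits=32, scale_bits=32,
                         rot_bits=32, feat_bits=32, feat_dim=3):
    """Estimate the raw parameter size in bytes.

    Assumed layout per Gaussian: 2 position coordinates, 2 scale values,
    1 orientation angle, and feat_dim feature values.
    """
    bits_per_gaussian = (2 * pos_bits + 2 * scale_bits
                         + rot_bits + feat_dim * feat_bits)
    total_bits = num_gaussians * bits_per_gaussian
    return (total_bits + 7) // 8  # round up to whole bytes


if __name__ == "__main__":
    # 10000 Gaussians at 16 bits (half precision) per dimension.
    print(estimated_size_bytes(10000, 16, 16, 16, 16))
    # The same count at 12 bits per dimension.
    print(estimated_size_bytes(10000, 12, 12, 12, 12))
```

Under these assumptions, 10000 Gaussians occupy 160000 bytes at 16 bits per dimension and 120000 bytes at 12 bits per dimension.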
Logging
- `--exp_name`: path to the logging directory.
- `--vis_gaussians`: visualize Gaussians during optimization.
- `--save_image_steps`: frequency of rendering intermediate results during optimization.
- `--save_ckpt_steps`: frequency of checkpointing during optimization.
Input image
- `--input_path`: path to an image file or a directory containing a texture stack.
- `--downsample`: load a downsampled version of the input image or texture stack as the optimization target to evaluate image upsampling performance.
- `--downsample_ratio`: downsampling ratio.
- `--gamma`: optimize in a gamma-corrected space, modify with caution.
Gaussian
- `--num_gaussians`: number of Gaussians (for compression rate control).
- `--init_scale`: initial Gaussian scale in number of pixels.
- `--disable_topk_norm`: disable top-K normalization.
- `--disable_inverse_scale`: disable inverse Gaussian scale optimization.
- `--init_mode`: Gaussian position initialization mode, valid values include "gradient", "saliency", and "random".
- `--init_random_ratio`: ratio of Gaussians with randomly initialized position.
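To make the interaction between a content-guided initialization mode and `--init_random_ratio` more concrete, here is one plausible reading sketched in pure Python: a fraction of the positions is drawn uniformly, and the rest is drawn with probability proportional to the local gradient magnitude. This is an assumption about the general idea, not the repository's implementation; the function names, the forward-difference gradient, and the `random_ratio=0.3` default are all hypothetical.

```python
import random


def gradient_magnitude(img):
    """Forward-difference gradient magnitude of a grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][x]
            gy = img[min(y + 1, h - 1)][x] - img[y][x]
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag


def init_positions(img, num_gaussians, random_ratio=0.3, seed=0):
    """Sample (x, y) pixel positions: part gradient-guided, part uniform."""
    rng = random.Random(seed)
    h, w = len(img), len(img[0])
    # Row-major pixel list, matching the flattened weights below.
    pixels = [(x, y) for y in range(h) for x in range(w)]
    num_random = int(round(num_gaussians * random_ratio))
    num_guided = num_gaussians - num_random
    weights = [m for row in gradient_magnitude(img) for m in row]
    if sum(weights) == 0.0:
        weights = None  # flat image: fall back to uniform sampling
    guided = rng.choices(pixels, weights=weights, k=num_guided)
    uniform = rng.choices(pixels, k=num_random)
    return guided + uniform
```

Mixing in uniformly sampled positions keeps some Gaussians in smooth regions, which a purely gradient-driven sampler would leave empty.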
Optimization
- `--disable_tiles`: disable tile-based rendering (warning: optimization and rendering without tiles will be way slower).
- `--max_steps`: maximum number of optimization steps.
- `--pos_lr`: Gaussian position learning rate.
- `--scale_lr`: Gaussian scale learning rate.
- `--rot_lr`: Gaussian orientation angle learning rate.
- `--feat_lr`: Gaussian feature learning rate.
- `--disable_lr_schedule`: disable learning rate decay and early stopping schedule.
- `--disable_prog_optim`: disable error-guided progressive optimization.
We would like to thank the gsplat team, as well as the authors of 3D Gaussian Splatting, Fused SSIM, and EML-Net for their outstanding work, which provided an important foundation for the development of Image-GS. We also thank Károly Zsolnai-Fehér for featuring our work on Two Minute Papers and introducing it to a broader audience, and Shubham Anand for creating a clear, thorough, and well-explained tutorial on our work on LearnOpenCV.
This project is licensed under the terms of the MIT license.
If you find this project helpful to your research, please consider citing it with the following BibTeX entry:
@inproceedings{zhang2025image,
title={Image-gs: Content-adaptive image representation via 2d gaussians},
author={Zhang, Yunxiang and Li, Bingxuan and Kuznetsov, Alexandr and Jindal, Akshay and Diolatzis, Stavros and Chen, Kenneth and Sochenov, Anton and Kaplanyan, Anton and Sun, Qi},
booktitle={Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers},
pages={1--11},
year={2025}
}

