
Gsplat SOG v2 file format proposal #38

@slimbuck

Introduction

PlayCanvas added support for SOG gaussian splat scenes thanks to the work of @vincentwoo and @w-m (see https://github.com/playcanvas/sogs). The SOG format is an image-based encoding of gaussian splat data.

We are now updating this format to better match our runtime requirements:

  • order data by Morton code rather than with PLAS
    • this results in roughly similar compression by webp
    • it means we don't have to calculate Morton codes and reorder data at load time
  • improve data robustness to cover a wider spectrum of scenes (see SOGS spherical harmonics #20)
    • introduce a codebook for scales, sh0 and shN
  • simplify meta.json layout
  • introduce a single-file, bundled binary format which is more convenient to use than separate files
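
To illustrate the Morton ordering mentioned in the list, here is a minimal sketch of a 3D Morton (Z-order) sort. The 10 bits per axis, the quantization against the scene bounding box, and the names `mortonEncode3` and `mortonOrder` are assumptions made for illustration, not details taken from the actual tooling.

```typescript
// Interleave the low 10 bits of x, y and z into a 30-bit Morton code.
// Assumption: 10 bits per axis; the real encoder may use a different depth.
function mortonEncode3(x: number, y: number, z: number): number {
    let code = 0;
    for (let i = 0; i < 10; i++) {
        code |= ((x >> i) & 1) << (3 * i);
        code |= ((y >> i) & 1) << (3 * i + 1);
        code |= ((z >> i) & 1) << (3 * i + 2);
    }
    return code >>> 0;
}

// Return splat indices sorted by the Morton code of their position.
// `positions` is a flat [x0, y0, z0, x1, y1, z1, ...] array (hypothetical layout).
function mortonOrder(positions: ArrayLike<number>): number[] {
    const count = Math.floor(positions.length / 3);
    const min = [Infinity, Infinity, Infinity];
    const max = [-Infinity, -Infinity, -Infinity];
    for (let i = 0; i < count; i++) {
        for (let a = 0; a < 3; a++) {
            const v = positions[i * 3 + a];
            if (v < min[a]) min[a] = v;
            if (v > max[a]) max[a] = v;
        }
    }

    // Quantize a coordinate to the integer range 0..1023 within the bounding box.
    const quantize = (v: number, a: number): number => {
        const range = max[a] - min[a];
        if (range <= 0) return 0;
        return Math.min(1023, Math.floor(((v - min[a]) / range) * 1024));
    };

    const codes = new Uint32Array(count);
    for (let i = 0; i < count; i++) {
        codes[i] = mortonEncode3(
            quantize(positions[i * 3], 0),
            quantize(positions[i * 3 + 1], 1),
            quantize(positions[i * 3 + 2], 2)
        );
    }

    const order = Array.from({ length: count }, (_, i) => i);
    order.sort((a, b) => codes[a] - codes[b]);
    return order;
}
```

Writing the splats out in the order returned by `mortonOrder` places spatially close splats near each other in the images, which is what lets the runtime skip the sort at load time.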

Since this is an evolution of the SOG format, we keep the name, but now take it to stand for Spatially Ordered Gaussians.

A SOG scene comprises multiple files:

  • meta.json: contains the metadata and the names of the linked webp images
  • images: images containing the packed gaussian data, by default in webp format
    • means_u, means_l: upper and lower bits of position
    • scales: sizes
    • quats: orientation/quaternions
    • sh0: color and opacity
    • shN_centroids, shN_labels: (optional) palette of spherical harmonics and their per-gaussian labels

All images have the same dimensions apart from shN_centroids.

Metadata v2 format

interface SogMeta {
    version: 2;            // file format version
    count: number;         // number of splats (pixels encoded)
    antialias: boolean;

    means: {
        // Per-axis affine ranges used to (de)normalize log-transformed positions.
        // Values are AFTER logTransform(value) = sign(v)*log(|v|+1).
        mins: [number, number, number];   // [xmin', ymin', zmin']
        maxs: [number, number, number];   // [xmax', ymax', zmax']
        files: ["means_l.webp", "means_u.webp"];
    };

    scales: {
        // Scale codebook for k=256 k-means over [scale_0, scale_1, scale_2]
        codebook: number[];  // length 256
        files: ["scales.webp"]; // per-splat byte labels in RGB
    };

    quats: {
        // Orientation texture only; ranges are implicit in encoding.
        files: ["quats.webp"];
    };

    sh0: {
        // DC color codebook for k=256 k-means over [f_dc_0, f_dc_1, f_dc_2].
        codebook: number[];  // length 256
        files: ["sh0.webp"]; // per-splat byte labels in RGB; A = opacity (sigmoid*255)
    };

    // Present only if higher-order SH (f_rest_*) exists.
    shN?: {
        // A 256-entry 1D codebook built over the SH centroids.
        codebook: number[];  // length 256
        files: ["shN_centroids.webp", "shN_labels.webp"];
        // - shN_centroids.webp packs centroid coefficient triplets per coeff in RGBA.
        // - shN_labels.webp stores per-splat palette indices (uint16 little-endian in RG).
    };
}
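
As a sketch of how the `means` entry might be used, the following reconstructs a position from the low and high bytes of one pixel. The inverse of the log transform follows directly from the formula in the comment above; the linear mapping of the 16-bit value from 0..65535 onto [min, max] is an assumption, and the names `invLogTransform`, `decodeAxis` and `decodePosition` are hypothetical.

```typescript
type Vec3 = [number, number, number];

// Inverse of logTransform(v) = sign(v) * log(|v| + 1).
function invLogTransform(n: number): number {
    return Math.sign(n) * (Math.exp(Math.abs(n)) - 1);
}

// Decode one axis from its low byte (means_l) and high byte (means_u).
function decodeAxis(lo: number, hi: number, min: number, max: number): number {
    const q = (hi << 8) | lo;                   // 16-bit quantized value
    const n = min + (max - min) * (q / 65535);  // assumption: linear 0..65535 -> min..max
    return invLogTransform(n);
}

// Decode a full position from the RGB bytes of the two means images.
function decodePosition(lower: Vec3, upper: Vec3, mins: Vec3, maxs: Vec3): Vec3 {
    return [
        decodeAxis(lower[0], upper[0], mins[0], maxs[0]),
        decodeAxis(lower[1], upper[1], mins[1], maxs[1]),
        decodeAxis(lower[2], upper[2], mins[2], maxs[2])
    ];
}
```

For example, with `mins = [-1, -1, -1]` and `maxs = [1, 1, 1]`, a pixel whose bytes are all zero decodes to `-(e - 1)` on every axis, the most negative position the range can represent.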

Data layout

A summary of the image data is as follows:

 ┌───────────────────┐
 │ means_l.webp      │   Position (log-space, normalized, 16-bit split)
 │───────────────────│
 │ R = x low byte    │
 │ G = y low byte    │
 │ B = z low byte    │
 │ A = 255           │
 └───────────────────┘

 ┌───────────────────┐
 │ means_u.webp      │
 │───────────────────│
 │ R = x high byte   │
 │ G = y high byte   │
 │ B = z high byte   │
 │ A = 255           │
 └───────────────────┘

 ┌───────────────────┐
 │ quats.webp        │   Orientation (compressed quaternion)
 │───────────────────│
 │ R = comp0         │
 │ G = comp1         │
 │ B = comp2         │
 │ A = 252 + index   │   (which component was dropped)
 └───────────────────┘
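
The quaternion layout reads like a "smallest three" encoding, where the largest component is dropped and rebuilt from unit length. The sketch below works under that reading. The byte-to-value mapping (0..255 onto [-1/√2, 1/√2]), the order of the remaining components, and the name `decodeQuat` are all assumptions; only the `A = 252 + index` rule comes from the layout above.

```typescript
type Quat = [number, number, number, number];

// Rebuild a unit quaternion from three stored components plus the dropped index.
function decodeQuat(r: number, g: number, b: number, a: number): Quat {
    const dropped = a - 252; // index (0..3) of the component that was dropped

    // Assumption: each byte maps linearly onto [-1/sqrt(2), 1/sqrt(2)],
    // the range a non-largest component of a unit quaternion can take.
    const toFloat = (c: number): number => ((c / 255) * 2 - 1) * Math.SQRT1_2;
    const stored = [toFloat(r), toFloat(g), toFloat(b)];

    // The dropped component follows from unit length; it is taken as non-negative.
    const sumSq = stored[0] * stored[0] + stored[1] * stored[1] + stored[2] * stored[2];
    const largest = Math.sqrt(Math.max(0, 1 - sumSq));

    // Assumption: the stored components keep their original relative order.
    const q: Quat = [0, 0, 0, 0];
    let k = 0;
    for (let i = 0; i < 4; i++) {
        q[i] = i === dropped ? largest : stored[k++];
    }
    return q;
}
```

Storing three 8-bit components plus a 2-bit index fits one RGBA pixel per splat, which is presumably why the alpha channel only takes the values 252 to 255.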

 ┌───────────────────┐
 │ scales.webp       │   Log-scales
 │───────────────────│
 │ R = scale_x       │   (cluster index, 0–255)
 │ G = scale_y       │
 │ B = scale_z       │
 │ A = 255           │
 └───────────────────┘

 ┌───────────────────┐
 │ sh0.webp          │   Color + opacity
 │───────────────────│
 │ R = f_dc_0 label  │   (cluster index, 0–255)
 │ G = f_dc_1 label  │
 │ B = f_dc_2 label  │
 │ A = opacity       │   (0–255, sigmoid mapped)
 └───────────────────┘

 ┌─────────────────────────┐
 │ shN_centroids.webp      │   (only if higher-order SH present)
 │─────────────────────────│
 │ R,G,B = centroid coeffs │
 │ A = 255                 │
 └─────────────────────────┘

 ┌───────────────────┐
 │ shN_labels.webp   │   Per-splat SH palette index
 │───────────────────│
 │ R = low byte      │
 │ G = high byte     │
 │ B = 0             │
 │ A = 255           │
 └───────────────────┘
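
The codebook-based layouts above could be decoded per pixel roughly as follows. This is a sketch under assumptions: that the scales codebook holds log-scale values (so the linear scale is `exp` of the entry), that the sh0 codebook holds raw `f_dc` coefficients, and that colour is recovered with the usual degree-0 SH constant. The `Pixel` type and the function names are hypothetical.

```typescript
interface Pixel { r: number; g: number; b: number; a: number; }
type Vec3 = [number, number, number];

// Degree-0 spherical harmonic basis constant, 1 / (2 * sqrt(pi)).
const SH_C0 = 0.28209479177387814;

// scales.webp: each channel is a label into the 256-entry codebook.
// Assumption: codebook entries are log-scales, so exponentiate to get sizes.
function decodeScale(px: Pixel, codebook: number[]): Vec3 {
    return [
        Math.exp(codebook[px.r]),
        Math.exp(codebook[px.g]),
        Math.exp(codebook[px.b])
    ];
}

// sh0.webp: RGB are labels into the DC codebook, A is opacity already
// passed through a sigmoid and scaled to 0..255.
function decodeColor(px: Pixel, codebook: number[]): { rgb: Vec3; opacity: number } {
    return {
        rgb: [
            0.5 + SH_C0 * codebook[px.r],
            0.5 + SH_C0 * codebook[px.g],
            0.5 + SH_C0 * codebook[px.b]
        ],
        opacity: px.a / 255
    };
}

// shN_labels.webp: 16-bit little-endian palette index stored in R (low) and G (high).
function decodeShLabel(px: Pixel): number {
    return px.r | (px.g << 8);
}
```

A label pixel with `R = 0x34` and `G = 0x12` selects palette entry `0x1234`, so the two channels together can address up to 65536 shN centroids.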

Bundled format

It can be inconvenient to work with multiple files, so we support a bundled version of SOG data.

A bundled scene has a .sog extension and is a plain zip archive of the constituent files, stored without compression.

The bundle loader uses the following logic when resolving filenames appearing in meta.json:

  • if the file is present in the bundle, use it
  • otherwise treat the filename as a URL and resolve it relative to the current href

This will allow bundles to reference resources external to the bundle.
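
The lookup rule above could be sketched like this. The `bundle` map (filename to bytes, however the zip was unpacked), the `baseHref` parameter and the name `resolveFile` are assumptions for illustration; only the bundle-first, URL-fallback order comes from the description.

```typescript
type Resolved =
    | { kind: "bundled"; data: Uint8Array }
    | { kind: "external"; url: string };

// Resolve a filename from meta.json: prefer the bundle, else treat it as a URL.
function resolveFile(
    filename: string,
    bundle: Map<string, Uint8Array>, // hypothetical: entries unpacked from the .sog archive
    baseHref: string                 // hypothetical: the href to resolve relative names against
): Resolved {
    const data = bundle.get(filename);
    if (data !== undefined) {
        return { kind: "bundled", data };
    }
    // new URL(relative, base) leaves absolute URLs untouched and resolves relative ones.
    return { kind: "external", url: new URL(filename, baseHref).href };
}
```

With a base of `https://example.com/scenes/room/scene.sog`, a missing `shN_labels.webp` would resolve to `https://example.com/scenes/room/shN_labels.webp`, while an absolute URL in meta.json is used as-is.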

Many thanks to @nobbis for suggesting we use a zip archive instead of custom binary format (see discussion below).
