Zip-NeRF

Zip-NeRF is a dataset of four large scenes: Berlin, Alameda, London, and NYC, each captured with 1000-2000 fisheye photos. This implementation uses the undistorted images provided with the dataset; the downsampled resolutions range from 1392 × 793 to 2000 × 1140 depending on the scene. It is recommended to use exposure modeling with this dataset if available.

Method PSNR SSIM LPIPS (VGG) Time GPU mem.
Octree-GS (average) 22.33 0.762 0.473 44m 56s 27.63 GB

Per-scene results for Octree-GS:

Scene PSNR SSIM LPIPS (VGG) Time GPU mem.
Alameda 22.79 0.730 0.448 44m 25s 28.68 GB
Berlin 13.64 0.669 0.640 1h 6m 40s 34.85 GB
London 25.76 0.807 0.433 33m 44s 27.35 GB
NYC 27.13 0.841 0.372 34m 55s 19.63 GB

PSNR

Peak Signal-to-Noise Ratio. The higher the better.

Method Alameda Berlin London NYC
Octree-GS 22.79 13.64 25.76 27.13
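As a sketch of how PSNR is typically computed (the function name and `data_range` parameter are illustrative, not taken from this codebase; images are assumed normalized to [0, 1]):

```python
import numpy as np

def psnr(pred, gt, data_range=1.0):
    """Peak Signal-to-Noise Ratio in dB; higher means closer to ground truth."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    mse = np.mean((pred - gt) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

# A uniform error of 0.1 on a [0, 1] image gives MSE = 0.01, i.e. 20 dB
gt = np.zeros((32, 32, 3))
pred = gt + 0.1
value = psnr(pred, gt)  # → 20.0
```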

SSIM

Structural Similarity Index. The higher the better. The implementation matches JAX's SSIM and torchmetrics's SSIM (with default parameters).

Method Alameda Berlin London NYC
Octree-GS 0.730 0.669 0.807 0.841
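A minimal single-channel NumPy sketch of SSIM with the default parameters mentioned above (11 × 11 Gaussian window, σ = 1.5, k1 = 0.01, k2 = 0.03, as used by JAX and torchmetrics by default); the function name and argument layout are illustrative, and multi-channel images would be averaged per channel:

```python
import numpy as np

def ssim(a, b, data_range=1.0, kernel_size=11, sigma=1.5, k1=0.01, k2=0.03):
    """Mean SSIM over a 2-D image pair; higher is better, 1.0 for identical images."""
    # 2-D Gaussian window as the outer product of a 1-D Gaussian
    half = kernel_size // 2
    x = np.arange(-half, half + 1)
    g = np.exp(-(x ** 2) / (2 * sigma ** 2))
    kernel = np.outer(g, g)
    kernel /= kernel.sum()

    def filt(img):
        # valid-mode Gaussian filtering via sliding windows
        w = np.lib.stride_tricks.sliding_window_view(img, (kernel_size, kernel_size))
        return (w * kernel).sum(axis=(-2, -1))

    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    mu_a, mu_b = filt(a), filt(b)
    var_a = filt(a * a) - mu_a ** 2
    var_b = filt(b * b) - mu_b ** 2
    cov = filt(a * b) - mu_a * mu_b
    ssim_map = ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / \
               ((mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))
    return float(ssim_map.mean())

rng = np.random.default_rng(0)
a = rng.random((16, 16))
same = ssim(a, a)              # identical images score 1.0
noisy = ssim(a, rng.random((16, 16)))  # unrelated noise scores well below 1.0
```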

LPIPS (VGG)

Learned Perceptual Image Patch Similarity. The lower the better. The implementation uses the VGG backbone and matches the lpips pip package with checkpoint version 0.1.

Method Alameda Berlin London NYC
Octree-GS 0.448 0.640 0.433 0.372