gsplat

gsplat is an open-source library for CUDA-accelerated rasterization of Gaussians with Python bindings. It is inspired by the 3DGS paper, but is faster and more memory-efficient, and has a growing list of new features. In NerfBaselines, the method was modified to enable appearance optimization, to support sampling masks, and to support setting the background color (which is required for the Blender dataset).

Web: https://docs.gsplat.studio/main/
Paper: gsplat: An Open-Source Library for Gaussian Splatting
Authors: Vickie Ye, Ruilong Li, Justin Kerr, Matias Turkulainen, Brent Yi, Zhuoyang Pan, Otto Seiskari, Jianbo Ye, Jeffrey Hu, Matthew Tancik, Angjoo Kanazawa
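
The background color mentioned in the description matters because Blender scenes are evaluated against a white background, so transparent pixels must be filled with that color. The following is a minimal sketch of standard alpha compositing, not the NerfBaselines or gsplat implementation. The function name `composite_over_background` is hypothetical, and the sketch assumes the rasterizer returns opacity-weighted (premultiplied) colors together with an accumulated alpha map.

```python
import numpy as np


def composite_over_background(rgb, alpha, background):
    """Blend a premultiplied RGB render over a constant background color.

    rgb:        (H, W, 3) accumulated color, already weighted by opacity (assumption)
    alpha:      (H, W, 1) accumulated opacity in [0, 1]
    background: length-3 color in [0, 1], e.g. (1, 1, 1) for white
    """
    background = np.asarray(background, dtype=rgb.dtype)
    # Pixels with alpha < 1 show the background in proportion to (1 - alpha).
    return rgb + (1.0 - alpha) * background


# Toy 1x2 render: one fully transparent pixel and one opaque red pixel.
rgb = np.array([[[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]], dtype=np.float32)
alpha = np.array([[[0.0], [1.0]]], dtype=np.float32)
image = composite_over_background(rgb, alpha, (1.0, 1.0, 1.0))
```

With a white background, the transparent pixel becomes white and the opaque pixel is left unchanged.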

Mip-NeRF 360

Mip-NeRF 360 is a collection of four indoor and five outdoor object-centric scenes. The camera trajectory is an orbit around the object with fixed elevation and radius. Every n-th frame of the trajectory is held out as a test view.
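
The hold-out rule described above can be sketched as follows. The helper name `split_every_nth` is hypothetical, and the value of n used in the example is purely illustrative, since the text does not state which n the benchmark uses. The sketch also assumes the first frame (index 0) is a test view.

```python
def split_every_nth(frames, n):
    """Return (train, test), where every n-th frame (starting at index 0) is a test view."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    test = [frame for i, frame in enumerate(frames) if i % n == 0]
    train = [frame for i, frame in enumerate(frames) if i % n != 0]
    return train, test


# Illustrative trajectory of 10 frames with n = 4 (not the benchmark's value).
train_frames, test_frames = split_every_nth(list(range(10)), n=4)
```

Here `test_frames` contains frames 0, 4, and 8, and the remaining seven frames are used for training.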

Scene PSNR SSIM LPIPS (VGG) Time GPU mem.
garden 27.49 (paper: 27.39) 0.868 (paper: 0.87) 0.120 39m 56s 12.52 GB
bicycle 25.26 (paper: 25.29) 0.766 (paper: 0.77) 0.237 35m 41s 11.75 GB
flowers 21.59 0.603 0.368 26m 2s 8.64 GB
treehill 22.57 0.635 0.377 24m 28s 7.79 GB
stump 26.57 (paper: 26.51) 0.772 (paper: 0.77) 0.248 26m 40s 9.63 GB
kitchen 30.86 (paper: 31.37) 0.923 (paper: 0.93) 0.158 30m 54s 6.79 GB
bonsai 32.13 (paper: 32.21) 0.941 (paper: 0.94) 0.254 23m 13s 5.69 GB
counter 28.94 (paper: 29.01) 0.906 (paper: 0.91) 0.257 28m 24s 5.79 GB
room 31.30 (paper: 31.23) 0.916 (paper: 0.92) 0.286 28m 30s 6.11 GB
Average 27.41 0.815 0.256 29m 19s 8.30 GB
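
The Average row can be checked against the per-scene rows. The sketch below is one way to do that for the PSNR and Time columns. The helpers `parse_duration` and `format_duration` are hypothetical names, not part of NerfBaselines, and the averages are assumed to be plain arithmetic means over the nine scenes.

```python
import re


def parse_duration(text):
    """Convert a string such as '1h 30m 15s' or '39m 56s' to seconds."""
    units = {"h": 3600, "m": 60, "s": 1}
    return sum(int(value) * units[unit]
               for value, unit in re.findall(r"(\d+)\s*([hms])", text))


def format_duration(seconds):
    """Format seconds as '[Hh ]Mm Ss'."""
    # Round to whole seconds first so the seconds field never reaches 60.
    total = int(round(seconds))
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    prefix = f"{hours}h " if hours else ""
    return f"{prefix}{minutes}m {secs}s"


# Per-scene values copied from the table above (garden ... room).
psnr_values = [27.49, 25.26, 21.59, 22.57, 26.57, 30.86, 32.13, 28.94, 31.30]
times = ["39m 56s", "35m 41s", "26m 2s", "24m 28s", "26m 40s",
         "30m 54s", "23m 13s", "28m 24s", "28m 30s"]

mean_psnr = round(sum(psnr_values) / len(psnr_values), 2)
mean_time = format_duration(sum(parse_duration(t) for t in times) / len(times))
```

Both results agree with the Average row: a mean PSNR of 27.41 and a mean training time of 29m 19s.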

Blender

Blender (nerf-synthetic) is a synthetic dataset used to benchmark NeRF methods. It consists of eight scenes, each showing an object placed on a white background. Cameras are placed on a hemisphere around the object. Scenes are licensed under various CC licenses.

Scene PSNR SSIM LPIPS (VGG) Time GPU mem.
lego 32.13 0.977 0.043 13m 31s 2.94 GB
drums 24.50 0.948 0.076 11m 15s 2.74 GB
ficus 34.44 0.986 0.014 11m 48s 2.75 GB
hotdog 35.39 0.983 0.041 20m 32s 2.67 GB
materials 29.93 0.959 0.044 11m 37s 2.76 GB
mic 33.73 0.991 0.023 11m 30s 2.84 GB
ship 28.77 0.899 0.155 17m 56s 2.86 GB
chair 32.88 0.986 0.034 19m 53s 2.82 GB
Average 31.47 0.966 0.054 14m 45s 2.80 GB

Photo Tourism

Photo Tourism is a dataset of images of famous landmarks, such as the Sacre Coeur, the Trevi Fountain, and the Brandenburg Gate. The images were captured by tourists at different times of the day and year, so they have varying lighting conditions and occlusions. The evaluation protocol is based on NeRF-W: the image appearance embeddings are optimized on the left side of each image, and the metrics are computed on the right side.
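
The metric half of this protocol can be sketched as below. The embedding optimization on the left half is method-specific and is not shown. The helper names `split_halves` and `psnr` are hypothetical, and the sketch assumes images are float arrays in [0, 1] that are split at the middle column.

```python
import numpy as np


def split_halves(image):
    """Split an (H, W, C) image into left and right halves along the width."""
    mid = image.shape[1] // 2
    return image[:, :mid], image[:, mid:]


def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB for images with values in [0, max_value]."""
    mse = np.mean((pred.astype(np.float64) - target.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)


# Toy example: large error on the left half, constant error of 0.1 on the right half.
rng = np.random.default_rng(0)
target = rng.random((4, 6, 3))
pred = target.copy()
pred[:, :3] += 0.5
pred[:, 3:] += 0.1

# Only the right halves enter the reported metric.
_, pred_right = split_halves(pred)
_, target_right = split_halves(target)
right_psnr = psnr(pred_right, target_right)
```

Because only the right halves are compared, the large error on the left half does not affect `right_psnr`, which here corresponds to a mean squared error of 0.01, or about 20 dB.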

Scene PSNR SSIM LPIPS Time GPU mem.
Sacre Coeur 22.05 0.876 0.154 1h 30m 15s 3.33 GB
Trevi Fountain 22.56 0.765 0.213 2h 12m 0s 7.05 GB
Brandenburg Gate 26.36 0.931 0.118 1h 30m 56s 3.68 GB
Average 23.66 0.857 0.162 1h 44m 24s 4.68 GB