Scaffold-GS

Scaffold-GS uses anchor points to distribute local 3D Gaussians and predicts their attributes on-the-fly based on viewing direction and distance within the view frustum. In NerfBaselines, we fixed a bug with the principal point (cx, cy), added appearance embedding optimization, and added support for sampling masks. Note that we also implement a demo for the method, but it does not evaluate the MLP; instead, the Gaussians are "baked" for a specific viewing direction and appearance embedding (if enabled).

Web: https://city-super.github.io/scaffold-gs/
Paper: Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering
Authors: Tao Lu, Mulin Yu, Linning Xu, Yuanbo Xiangli, Limin Wang, Dahua Lin, Bo Dai
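
To make the idea of view-dependent attribute prediction more concrete, the following is a minimal sketch, not the authors' implementation. It assumes each anchor stores a small feature vector and that a tiny MLP maps that feature, together with the camera-to-anchor distance and direction, to one opacity-like value per local Gaussian. The names `relative_view`, `make_params`, and `predict_opacities`, the layer sizes, the random weights, and the tanh output are all hypothetical choices made for illustration.

```python
import math
import random

# All sizes below are hypothetical and chosen only to keep the example small.
FEATURE_DIM = 4   # length of the per-anchor feature vector
HIDDEN_DIM = 8    # width of the single hidden layer
K = 3             # number of local Gaussians attached to one anchor


def relative_view(anchor, camera):
    """Return (distance, unit direction) from the camera to the anchor."""
    delta = [a - c for a, c in zip(anchor, camera)]
    dist = math.sqrt(sum(d * d for d in delta))
    return dist, [d / dist for d in delta]


def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def make_params(seed=0):
    """Random stand-in weights; a trained model would learn these."""
    rng = random.Random(seed)
    in_dim = FEATURE_DIM + 1 + 3  # feature + distance + 3D direction

    def matrix(rows, cols):
        return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

    return {
        "w1": matrix(HIDDEN_DIM, in_dim), "b1": [0.0] * HIDDEN_DIM,
        "w2": matrix(K, HIDDEN_DIM), "b2": [0.0] * K,
    }


def predict_opacities(feature, anchor, camera, params):
    """Predict one opacity-like value per Gaussian for the current camera."""
    dist, direction = relative_view(anchor, camera)
    x = list(feature) + [dist] + direction
    hidden = [max(0.0, h) for h in linear(x, params["w1"], params["b1"])]  # ReLU
    return [math.tanh(o) for o in linear(hidden, params["w2"], params["b2"])]


params = make_params()
feature = [0.1, -0.2, 0.3, 0.05]
anchor = (0.0, 0.0, 3.0)
# The same anchor is queried from two camera positions.
print(predict_opacities(feature, anchor, (0.0, 0.0, 0.0), params))
print(predict_opacities(feature, anchor, (2.0, 0.0, 0.0), params))
```

Because the camera position enters the network input, the predicted values generally change when the camera moves. This is also why a demo that skips the MLP has to fix ("bake") the Gaussians for one chosen viewing direction.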

Mip-NeRF 360

Mip-NeRF 360 is a collection of four indoor and five outdoor object-centric scenes. The camera trajectory is an orbit around the object with fixed elevation and radius. The test set consists of every n-th frame of the trajectory; the remaining frames are used for training.
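
The every-n-th-frame split can be sketched as follows. This is an illustrative helper, not the NerfBaselines loader: the function name `split_every_nth`, the default `n=8`, and the choice to start the test set at frame 0 are assumptions made for the example.

```python
def split_every_nth(num_frames, n=8):
    """Split frame indices into (train, test), holding out every n-th frame.

    Assumption: frame 0 is the first test frame, so test = 0, n, 2n, ...
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    test = [i for i in range(num_frames) if i % n == 0]
    train = [i for i in range(num_frames) if i % n != 0]
    return train, test


train_ids, test_ids = split_every_nth(num_frames=10, n=4)
print("train:", train_ids)
print("test: ", test_ids)
```

With 10 frames and `n=4`, frames 0, 4, and 8 are held out for testing and the other seven are used for training.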

| Scene | PSNR | Paper's PSNR | SSIM | Paper's SSIM | LPIPS (VGG) | Time | GPU mem. |
|---|---|---|---|---|---|---|---|
| garden | 27.50 | 27.17 | 0.863 | 0.842 | 0.136 | 21m 46s | 7.91 GB |
| bicycle | 25.19 | 24.5 | 0.759 | 0.705 | 0.259 | 20m 38s | 7.37 GB |
| flowers | 21.44 | - | 0.592 | - | 0.382 | 20m 23s | 6.90 GB |
| treehill | 23.15 | - | 0.640 | - | 0.373 | 20m 17s | 6.48 GB |
| stump | 26.59 | 26.27 | 0.766 | 0.784 | 0.277 | 17m 1s | 6.22 GB |
| kitchen | 31.59 | 31.3 | 0.927 | 0.928 | 0.156 | 33m 25s | 10.97 GB |
| bonsai | 32.58 | 32.7 | 0.943 | 0.946 | 0.249 | 24m 33s | 10.94 GB |
| counter | 29.48 | 29.34 | 0.910 | 0.914 | 0.256 | 28m 34s | 10.38 GB |
| room | 31.89 | 31.93 | 0.922 | 0.925 | 0.275 | 24m 37s | 11.28 GB |
| Average | 27.71 | - | 0.813 | - | 0.262 | 23m 28s | 8.72 GB |

Blender

Blender (nerf-synthetic) is a synthetic dataset used to benchmark NeRF methods. It consists of 8 scenes, each showing an object placed on a white background. Cameras are placed on a hemisphere around the object. Scenes are licensed under various CC licenses.

| Scene | PSNR | Paper's PSNR | SSIM | LPIPS (VGG) | Time | GPU mem. |
|---|---|---|---|---|---|---|
| lego | 34.96 | 35.69 | 0.980 | 0.024 | 6m 40s | 3.72 GB |
| drums | 26.28 | 26.44 | 0.949 | 0.054 | 6m 46s | 3.73 GB |
| ficus | 34.38 | 35.21 | 0.986 | 0.015 | 6m 9s | 3.69 GB |
| hotdog | 37.62 | 37.73 | 0.983 | 0.075 | 8m 26s | 3.71 GB |
| materials | 30.29 | 30.65 | 0.962 | 0.045 | 6m 49s | 3.73 GB |
| mic | 35.99 | 37.25 | 0.991 | 0.010 | 6m 31s | 3.71 GB |
| ship | 29.97 | 31.17 | 0.895 | 0.139 | 8m 30s | 3.74 GB |
| chair | 35.16 | 35.28 | 0.984 | 0.019 | 6m 38s | 3.71 GB |
| Average | 33.08 | 33.68 | 0.966 | 0.048 | 7m 4s | 3.72 GB |
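
As a quick way to read the Blender numbers, the short script below recomputes the average PSNR and the per-scene gap to the paper. The values are copied from the results listed above; the variable and function names are only illustrative.

```python
# PSNR values copied from the Blender results listed above.
reproduced = {
    "lego": 34.96, "drums": 26.28, "ficus": 34.38, "hotdog": 37.62,
    "materials": 30.29, "mic": 35.99, "ship": 29.97, "chair": 35.16,
}
paper = {
    "lego": 35.69, "drums": 26.44, "ficus": 35.21, "hotdog": 37.73,
    "materials": 30.65, "mic": 37.25, "ship": 31.17, "chair": 35.28,
}


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Negative gap: the reproduced PSNR is below the paper's value.
gaps = {scene: reproduced[scene] - paper[scene] for scene in reproduced}

for scene, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{scene:10s} {gap:+.2f} dB")

print(f"mean reproduced: {mean(reproduced.values()):.2f} dB")
print(f"mean paper:      {mean(paper.values()):.2f} dB")
```

The two means round to 33.08 and 33.68, matching the Average row. Every scene is below the paper's PSNR, with the largest gap on mic (about 1.26 dB).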

Photo Tourism

Photo Tourism is a dataset of images of famous landmarks, such as the Sacre Coeur, the Trevi Fountain, and the Brandenburg Gate. The images were captured by tourists at different times of the day and year, so they have varying lighting conditions and occlusions. The evaluation protocol is based on NeRF-W, where the image appearance embeddings are optimized on the left side of the image and the metrics are computed on the right side of the image.
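
The metric half of this protocol can be sketched as below: PSNR is computed only on the right half of each image, so the region used to fit the appearance embedding does not influence the score. This is a simplified illustration and not the NerfBaselines evaluation code. Images are represented as nested lists of grayscale values in [0, 1], the function names are hypothetical, and for odd widths the middle column is assigned to the right half by assumption.

```python
import math


def right_half(image):
    """Keep the right half of each row (the middle column goes right for odd widths)."""
    return [row[len(row) // 2:] for row in image]


def psnr(pred, gt, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    diffs = [(p - g) ** 2
             for pred_row, gt_row in zip(pred, gt)
             for p, g in zip(pred_row, gt_row)]
    mse = sum(diffs) / len(diffs)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


def right_half_psnr(pred, gt):
    """PSNR restricted to the right half, where the metrics are computed."""
    return psnr(right_half(pred), right_half(gt))


# Toy 2x4 images: a large error on the left half, a small one on the right.
pred = [[0.9, 0.9, 0.5, 0.5],
        [0.9, 0.9, 0.5, 0.5]]
gt = [[0.1, 0.1, 0.4, 0.4],
      [0.1, 0.1, 0.4, 0.4]]

print(f"full image: {psnr(pred, gt):.2f} dB")
print(f"right half: {right_half_psnr(pred, gt):.2f} dB")
```

In this toy example the right half differs by 0.1 everywhere, giving a mean squared error of 0.01 and therefore a PSNR of 20 dB, while the full-image PSNR is much lower because of the left-half error.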

| Scene | PSNR | SSIM | LPIPS | Time | GPU mem. |
|---|---|---|---|---|---|
| Sacre Coeur | 21.85 | 0.871 | 0.157 | 1h 38m 45s | 11.38 GB |
| Trevi Fountain | 23.21 | 0.768 | 0.228 | 1h 21m 34s | 21.84 GB |
| Brandenburg Gate | 25.45 | 0.923 | 0.127 | 1h 23m 7s | 21.80 GB |
| Average | 23.50 | 0.854 | 0.170 | 1h 27m 49s | 18.34 GB |