Scaffold-GS

Scaffold-GS uses anchor points to distribute local 3D Gaussians and predicts their attributes on the fly from the viewing direction and distance within the view frustum. In NerfBaselines, we fixed a bug with the cx, cy camera parameters, added appearance embedding optimization, and added support for masks. Note that we also implement a demo for the method, but the demo does not evaluate the MLP; instead, the Gaussians are "baked" for a specific viewing direction and appearance embedding (if enabled).

Web: https://city-super.github.io/scaffold-gs/
Paper: Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering
Authors: Tao Lu, Mulin Yu, Linning Xu, Yuanbo Xiangli, Limin Wang, Dahua Lin, Bo Dai
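
The view-dependent inputs mentioned in the description can be illustrated with a minimal sketch. It assumes that the quantities passed to the attribute MLPs are the unit direction and the Euclidean distance from the camera center to an anchor. The function name `view_inputs` and the sample coordinates are hypothetical and are not taken from the Scaffold-GS or NerfBaselines code.

```python
import math


def view_inputs(anchor, camera):
    """Return (unit direction, distance) from the camera center to an anchor.

    Hypothetical helper for illustration only; not the library's API.
    """
    delta = [a - c for a, c in zip(anchor, camera)]
    dist = math.sqrt(sum(d * d for d in delta))
    if dist == 0.0:
        raise ValueError("anchor coincides with the camera center")
    direction = [d / dist for d in delta]
    return direction, dist


# Example: an anchor 5 units away from a camera placed at the origin.
direction, dist = view_inputs((0.0, 3.0, 4.0), (0.0, 0.0, 0.0))
print(direction, dist)
```

Because both quantities change with the camera pose, attributes predicted from them must be recomputed per view, which is presumably why the demo has to "bake" the Gaussians for one fixed viewing direction.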

Mip-NeRF 360

Mip-NeRF 360 is a collection of four indoor and five outdoor object-centric scenes. The camera trajectory is an orbit around the object with fixed elevation and radius. Every n-th frame of the trajectory is held out as a test view.
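
The hold-out rule described above can be sketched as follows. The function `split_every_nth`, the choice of `n = 8`, and the assumption that counting starts at frame index 0 are all illustrative; the text does not state the value of n or the offset.

```python
def split_every_nth(frames, n):
    """Split an ordered trajectory into (train, test) lists.

    Frames whose index is a multiple of n become test views (assumed offset 0);
    all remaining frames are used for training.
    """
    if n < 2:
        raise ValueError("n must be at least 2 to keep some training frames")
    test = [f for i, f in enumerate(frames) if i % n == 0]
    train = [f for i, f in enumerate(frames) if i % n != 0]
    return train, test


# Example with 20 frames identified by their index along the orbit.
frames = list(range(20))
train, test = split_every_nth(frames, n=8)
print(len(train), test)
```

Since the test frames are spread evenly along the orbit, each test view has nearby training views on both sides.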

Values in parentheses are the numbers reported in the paper.
Scene PSNR SSIM LPIPS (VGG) Time GPU mem.
garden 27.50 (paper: 27.17) 0.863 (paper: 0.842) 0.136 21m 46s 7.91 GB
bicycle 25.19 (paper: 24.5) 0.759 (paper: 0.705) 0.259 20m 38s 7.37 GB
flowers 21.44 0.592 0.382 20m 23s 6.90 GB
treehill 23.15 0.640 0.373 20m 17s 6.48 GB
stump 26.59 (paper: 26.27) 0.766 (paper: 0.784) 0.277 17m 1s 6.22 GB
kitchen 31.59 (paper: 31.3) 0.927 (paper: 0.928) 0.156 33m 25s 10.97 GB
bonsai 32.58 (paper: 32.7) 0.943 (paper: 0.946) 0.249 24m 33s 10.94 GB
counter 29.48 (paper: 29.34) 0.910 (paper: 0.914) 0.256 28m 34s 10.38 GB
room 31.89 (paper: 31.93) 0.922 (paper: 0.925) 0.275 24m 37s 11.28 GB
Average 27.71 0.813 0.262 23m 28s 8.72 GB

Blender

Blender (nerf-synthetic) is a synthetic dataset used to benchmark NeRF methods. It consists of 8 scenes, each showing an object placed on a white background. Cameras are placed on a hemisphere around the object. Scenes are licensed under various CC licenses.

Values in parentheses are the numbers reported in the paper.
Scene PSNR SSIM LPIPS (VGG) Time GPU mem.
lego 34.96 (paper: 35.69) 0.980 0.024 6m 40s 3.72 GB
drums 26.28 (paper: 26.44) 0.949 0.054 6m 46s 3.73 GB
ficus 34.38 (paper: 35.21) 0.986 0.015 6m 9s 3.69 GB
hotdog 37.62 (paper: 37.73) 0.983 0.075 8m 26s 3.71 GB
materials 30.29 (paper: 30.65) 0.962 0.045 6m 49s 3.73 GB
mic 35.99 (paper: 37.25) 0.991 0.010 6m 31s 3.71 GB
ship 29.97 (paper: 31.17) 0.895 0.139 8m 30s 3.74 GB
chair 35.16 (paper: 35.28) 0.984 0.019 6m 38s 3.71 GB
Average 33.08 (paper: 33.68) 0.966 0.048 7m 4s 3.72 GB
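
As a sanity check on the Average row, the per-scene PSNR values from the Blender table can be averaged directly. The sketch below assumes the average is an unweighted arithmetic mean over the 8 scenes; the variable names are illustrative.

```python
# Per-scene PSNR copied from the Blender table, in table order:
# lego, drums, ficus, hotdog, materials, mic, ship, chair.
ours = [34.96, 26.28, 34.38, 37.62, 30.29, 35.99, 29.97, 35.16]
paper = [35.69, 26.44, 35.21, 37.73, 30.65, 37.25, 31.17, 35.28]


def mean(values):
    """Unweighted arithmetic mean."""
    return sum(values) / len(values)


ours_avg = mean(ours)
paper_avg = mean(paper)
gap = ours_avg - paper_avg  # negative means below the paper's number

print(f"ours {ours_avg:.2f}, paper {paper_avg:.2f}, gap {gap:+.2f} dB")
```

Both means round to the values in the Average row, so the reported averages are consistent with a plain per-scene mean.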

Photo Tourism

Photo Tourism is a dataset of images of famous landmarks, such as the Sacre Coeur, the Trevi Fountain, and the Brandenburg Gate. The images were captured by tourists at different times of the day and year, so they have varying lighting conditions and occlusions. The evaluation protocol follows NeRF-W: the appearance embedding of each test image is optimized on the left half of the image, and the metrics are computed on the right half.
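
The metric side of this protocol can be sketched as follows: the image is cropped to its right half before PSNR is computed. This is a simplified illustration, assuming images are nested lists of grayscale values in [0, 1]; the helper names `right_half` and `psnr` are hypothetical, and the embedding optimization on the left half is not shown.

```python
import math


def right_half(image):
    """Keep the right half of every row (columns width // 2 and up)."""
    width = len(image[0])
    return [row[width // 2:] for row in image]


def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared = [(p - t) ** 2
               for pred_row, target_row in zip(pred, target)
               for p, t in zip(pred_row, target_row)]
    mse = sum(squared) / len(squared)
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 2x4 images: the left half differs strongly, the right half only slightly.
pred = [[0.0, 0.0, 0.5, 0.5], [0.0, 0.0, 0.5, 0.5]]
target = [[1.0, 1.0, 0.6, 0.6], [1.0, 1.0, 0.6, 0.6]]

score_right = psnr(right_half(pred), right_half(target))
score_full = psnr(pred, target)
print(score_right, score_full)
```

Restricting the metric to the right half keeps the pixels used to fit the appearance embedding separate from the pixels used for scoring.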

Scene PSNR SSIM LPIPS Time GPU mem.
Sacre Coeur 21.85 0.871 0.157 1h 38m 45s 11.38 GB
Trevi Fountain 23.21 0.768 0.228 1h 21m 34s 21.84 GB
Brandenburg Gate 25.45 0.923 0.127 1h 23m 7s 21.80 GB
Average 23.50 0.854 0.170 1h 27m 49s 18.34 GB

Mip-NeRF 360 Sparse

A modified Mip-NeRF 360 dataset with a small training set (12 or 24 views). The dataset is used to evaluate sparse-view novel view synthesis (NVS) methods.

Scene PSNR SSIM LPIPS (VGG) Time GPU mem.
garden n12 19.68 0.576 0.326 17m 21s 4.65 GB
bicycle n12 18.02 0.363 0.493 17m 0s 4.80 GB
flowers n12 14.52 0.240 0.553 15m 33s 4.73 GB
treehill n12 15.84 0.324 0.552 18m 55s 5.05 GB
stump n12 18.96 0.346 0.538 14m 9s 4.61 GB
kitchen n12 21.91 0.764 0.317 29m 31s 6.12 GB
bonsai n12 19.76 0.741 0.413 21m 55s 6.10 GB
counter n12 18.98 0.661 0.456 23m 7s 5.98 GB
room n12 20.91 0.740 0.415 20m 4s 6.13 GB
garden n24 23.11 0.719 0.236 19m 23s 5.13 GB
bicycle n24 20.29 0.495 0.396 19m 40s 5.66 GB
flowers n24 16.69 0.347 0.493 17m 14s 4.98 GB
treehill n24 18.92 0.453 0.468 19m 26s 5.67 GB
stump n24 20.32 0.426 0.483 17m 35s 5.43 GB
kitchen n24 24.98 0.861 0.217 31m 36s 6.79 GB
bonsai n24 24.37 0.850 0.320 23m 56s 6.29 GB
counter n24 23.04 0.794 0.348 24m 17s 6.12 GB
room n24 24.76 0.821 0.352 21m 38s 6.30 GB
Average 20.28 0.585 0.410 20m 41s 5.59 GB
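
The Average row mixes the 12-view and 24-view settings. To compare the two settings separately, the PSNR column of the table can be averaged per group, as in the sketch below. It assumes unweighted means over the 9 scenes of each setting; the variable names are illustrative.

```python
# PSNR values copied from the table, in scene order:
# garden, bicycle, flowers, treehill, stump, kitchen, bonsai, counter, room.
psnr_n12 = [19.68, 18.02, 14.52, 15.84, 18.96, 21.91, 19.76, 18.98, 20.91]
psnr_n24 = [23.11, 20.29, 16.69, 18.92, 20.32, 24.98, 24.37, 23.04, 24.76]


def mean(values):
    """Unweighted arithmetic mean."""
    return sum(values) / len(values)


mean_n12 = mean(psnr_n12)
mean_n24 = mean(psnr_n24)
mean_all = mean(psnr_n12 + psnr_n24)

print(f"n12 {mean_n12:.2f}, n24 {mean_n24:.2f}, all {mean_all:.2f}")
```

The combined mean reproduces the PSNR in the Average row, and the per-group means show how much doubling the number of training views helps.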