NerfStudio

NerfStudio (Nerfacto) is a method based on Instant-NGP that combines several improvements from different papers to achieve good quality on real-world scenes captured under normal conditions. It is fast to train (about 12 min; the runs reported below took roughly 10 to 20 min per scene) and renders at ~1 FPS.

Web: https://docs.nerf.studio/
Paper: Nerfstudio: A Modular Framework for Neural Radiance Field Development
Authors: Matthew Tancik, Ethan Weber, Evonne Ng, Ruilong Li, Brent Yi, Justin Kerr, Terrance Wang, Alexander Kristoffersen, Jake Austin, Kamyar Salahi, Abhik Ahuja, David McAllister, Angjoo Kanazawa

Mip-NeRF 360

Mip-NeRF 360 is a collection of four indoor and five outdoor object-centric scenes. The camera trajectory is an orbit around the object with fixed elevation and radius. Every n-th frame of the trajectory is held out as a test view.
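The held-out split described above can be sketched in a few lines of Python. This is only an illustration: the helper name `split_every_nth`, the choice to start counting at index 0, and the value `n=8` in the usage line are assumptions, since the text does not state them.

```python
def split_every_nth(frames, n):
    """Hold out every n-th frame (indices 0, n, 2n, ...) as test views.

    The starting offset of 0 is an assumption made for this sketch.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    test = [f for i, f in enumerate(frames) if i % n == 0]
    train = [f for i, f in enumerate(frames) if i % n != 0]
    return train, test


# Hypothetical usage: 20 frames along the orbit, every 8th one held out.
frames = [f"frame_{i:03d}" for i in range(20)]
train_views, test_views = split_every_nth(frames, n=8)
print(test_views)
```

Because the trajectory is an orbit, a regular stride like this spreads the test views around the object rather than clustering them in one place.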

Scene PSNR SSIM LPIPS (VGG) Time GPU mem.
garden 25.89 (paper: 26.47) 0.752 (paper: 0.774) 0.254 (paper: 0.235) 19m 31s 5.30 GB
bicycle 23.58 (paper: 24.08) 0.567 (paper: 0.599) 0.456 (paper: 0.422) 19m 12s 5.15 GB
flowers 21.16 0.511 0.434 19m 6s 5.22 GB
treehill 22.85 0.549 0.488 19m 31s 5.24 GB
stump 25.81 (paper: 24.78) 0.697 (paper: 0.662) 0.353 (paper: 0.38) 18m 53s 5.15 GB
kitchen 29.92 (paper: 30.29) 0.883 (paper: 0.89) 0.200 (paper: 0.19) 20m 5s 6.66 GB
bonsai 30.59 (paper: 32.16) 0.907 (paper: 0.933) 0.249 (paper: 0.197) 19m 46s 6.66 GB
counter 27.09 (paper: 27.2) 0.830 (paper: 0.843) 0.336 (paper: 0.314) 19m 28s 6.66 GB
room 30.61 (paper: 30.89) 0.880 (paper: 0.896) 0.315 (paper: 0.296) 19m 53s 6.66 GB
Average 26.39 0.731 0.343 19m 30s 5.86 GB
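As a quick sanity check on the table above, the per-scene PSNR values can be averaged and compared with the paper's numbers where they are listed. The sketch below assumes the "Average" row is a plain arithmetic mean over the nine scenes (the page does not say so explicitly); the helper names are hypothetical.

```python
# (scene, measured PSNR, paper PSNR or None when the table lists no paper value)
PSNR_ROWS = [
    ("garden", 25.89, 26.47),
    ("bicycle", 23.58, 24.08),
    ("flowers", 21.16, None),
    ("treehill", 22.85, None),
    ("stump", 25.81, 24.78),
    ("kitchen", 29.92, 30.29),
    ("bonsai", 30.59, 32.16),
    ("counter", 27.09, 27.2),
    ("room", 30.61, 30.89),
]


def mean_psnr(rows):
    """Arithmetic mean of the measured PSNR over all scenes."""
    return sum(measured for _, measured, _ in rows) / len(rows)


def paper_deltas(rows):
    """Measured minus paper PSNR, only for scenes that have a paper value."""
    return {
        scene: round(measured - paper, 2)
        for scene, measured, paper in rows
        if paper is not None
    }


print(f"mean PSNR: {mean_psnr(PSNR_ROWS):.2f}")
for scene, delta in paper_deltas(PSNR_ROWS).items():
    print(f"{scene}: {delta:+.2f} dB vs. paper")
```

Under that assumption the mean matches the reported average, and stump is the only scene where the measured PSNR exceeds the paper's value.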

Blender

Blender (nerf-synthetic) is a synthetic dataset used to benchmark NeRF methods. It consists of 8 scenes, each showing an object on a white background. Cameras are placed on a hemisphere around the object. Scenes are licensed under various CC licenses.

Scene PSNR SSIM LPIPS (VGG) Time GPU mem.
lego 31.37 0.967 0.069 9m 44s 3.65 GB
drums 22.48 0.897 0.139 9m 12s 3.65 GB
ficus 27.82 0.957 0.087 9m 22s 3.65 GB
hotdog 31.09 0.963 0.104 9m 44s 3.65 GB
materials 25.38 0.903 0.121 9m 23s 3.65 GB
mic 33.74 0.984 0.029 9m 47s 3.65 GB
ship 28.71 0.881 0.166 10m 2s 3.64 GB
chair 32.94 0.977 0.044 9m 48s 3.65 GB
Average 29.19 0.941 0.095 9m 38s 3.65 GB

Tanks and Temples

Tanks and Temples is a benchmark for image-based 3D reconstruction. The benchmark sequences were acquired outside the lab, in realistic conditions. Ground-truth data was captured using an industrial laser scanner. The benchmark includes both outdoor scenes and indoor environments. The dataset is split into three subsets: training, intermediate, and advanced.

Scene PSNR SSIM LPIPS Time GPU mem.
auditorium 20.77 0.771 0.330 19m 46s 3.88 GB
ballroom 22.68 0.705 0.261 19m 44s 3.88 GB
courtroom 20.24 0.673 0.336 19m 17s 3.88 GB
museum 17.84 0.648 0.311 18m 49s 3.88 GB
palace 17.68 0.640 0.452 20m 9s 3.64 GB
temple 17.06 0.678 0.392 19m 37s 3.88 GB
family 24.32 0.822 0.158 18m 54s 3.63 GB
francis 24.60 0.851 0.190 18m 48s 3.63 GB
horse 24.31 0.847 0.139 18m 53s 3.88 GB
lighthouse 20.85 0.768 0.245 19m 33s 3.89 GB
m60 26.54 0.843 0.179 19m 47s 3.64 GB
panther 27.57 0.858 0.174 20m 10s 3.89 GB
playground 24.69 0.755 0.249 19m 33s 3.64 GB
train 20.43 0.693 0.261 19m 42s 3.88 GB
barn 26.40 0.794 0.215 19m 16s 3.63 GB
caterpillar 21.71 0.666 0.302 19m 52s 3.63 GB
church 20.06 0.671 0.338 19m 29s 3.63 GB
courthouse 18.11 0.632 0.465 19m 37s 3.63 GB
ignatius 20.44 0.689 0.251 19m 5s 3.63 GB
meetingroom 23.21 0.793 0.261 19m 3s 3.63 GB
truck 23.37 0.797 0.167 19m 28s 3.63 GB
Average 22.04 0.743 0.270 19m 27s 3.74 GB