PGSR

PGSR (Planar-based Gaussian Splatting Reconstruction) is a representation for efficient, high-fidelity surface reconstruction from multi-view RGB images, without any geometric priors (e.g., depth or normals from a pre-trained model).

Web: https://zju3dv.github.io/pgsr/
Paper: PGSR: Planar-based Gaussian Splatting for Efficient and High-Fidelity Surface Reconstruction
Authors: Danpeng Chen, Hai Li, Weicai Ye, Yifan Wang, Weijian Xie, Shangjin Zhai, Nan Wang, Haomin Liu, Hujun Bao, Guofeng Zhang
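
As context for the depth-based metrics below: PGSR flattens each 3D Gaussian into a plane and rasterizes per-pixel plane normals and camera-to-plane distances, then recovers depth by intersecting each pixel ray with that plane ("unbiased depth"). A minimal NumPy sketch of that ray-plane depth, with illustrative map names and shapes rather than the repository's actual API:

```python
import numpy as np

def unbiased_depth(distance_map, normal_map, K):
    """Ray-plane depth from rendered plane parameters (illustrative sketch).

    distance_map: (H, W) alpha-blended camera-to-plane distances
    normal_map:   (H, W, 3) alpha-blended plane normals, camera frame
    K:            (3, 3) camera intrinsics
    """
    H, W = distance_map.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))            # pixel grid
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    rays = pix @ np.linalg.inv(K).T                           # K^{-1} p~ per pixel
    denom = np.einsum("hwc,hwc->hw", normal_map, rays)        # n . (K^{-1} p~)
    denom = np.where(np.abs(denom) < 1e-8, 1e-8, denom)       # guard degenerate rays
    return distance_map / denom                               # depth along each ray
```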

Mip-NeRF 360

Mip-NeRF 360 is a collection of four indoor and five outdoor object-centric scenes. The camera trajectory orbits the object at a fixed elevation and radius, and every n-th frame of the trajectory is held out as a test view.
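
A sketch of that hold-out convention; n = 8 is the stride commonly used for Mip-NeRF 360 evaluations, though the exact value used here is an assumption:

```python
frames = [f"frame_{i:04d}.png" for i in range(185)]  # stand-in for the ordered captures
n = 8  # assumed stride; the benchmark's actual n may differ
test_views = frames[::n]
train_views = [f for i, f in enumerate(frames) if i % n != 0]
```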

| Scene    | PSNR  | SSIM  | LPIPS (VGG) | Time    | GPU mem. |
|----------|-------|-------|-------------|---------|----------|
| garden   | 27.44 | 0.870 | 0.116       | 42m 48s | 13.71 GB |
| bicycle  | 25.28 | 0.780 | 0.204       | 50m 8s  | 16.71 GB |
| flowers  | 21.27 | 0.618 | 0.304       | 42m 10s | 14.01 GB |
| treehill | 21.91 | 0.623 | 0.322       | 44m 41s | 14.46 GB |
| stump    | 26.85 | 0.787 | 0.225       | 40m 42s | 12.83 GB |
| kitchen  | 30.94 | 0.923 | 0.161       | 36m 2s  | 13.74 GB |
| bonsai   | 31.89 | 0.941 | 0.244       | 33m 23s | 14.36 GB |
| counter  | 28.64 | 0.910 | 0.246       | 35m 6s  | 13.87 GB |
| room     | 30.58 | 0.917 | 0.277       | 34m 38s | 14.85 GB |
| Average  | 27.20 | 0.819 | 0.233       | 39m 58s | 14.28 GB |

Blender

Blender (nerf-synthetic) is a synthetic dataset used to benchmark NeRF methods. It consists of eight scenes, each showing an object on a white background, with cameras placed on a hemisphere around the object. The scenes are licensed under various CC licenses.

| Scene     | PSNR  | SSIM  | LPIPS (VGG) | Time    | GPU mem. |
|-----------|-------|-------|-------------|---------|----------|
| lego      | 35.12 | 0.980 | 0.023       | 8m 8s   | 3.95 GB  |
| drums     | 26.08 | 0.953 | 0.045       | 8m 19s  | 4.01 GB  |
| ficus     | 34.75 | 0.987 | 0.013       | 7m 18s  | 3.86 GB  |
| hotdog    | 37.42 | 0.984 | 0.031       | 8m 43s  | 3.84 GB  |
| materials | 29.93 | 0.959 | 0.044       | 7m 57s  | 3.96 GB  |
| mic       | 35.66 | 0.992 | 0.008       | 7m 52s  | 3.93 GB  |
| ship      | 31.44 | 0.906 | 0.129       | 10m 27s | 4.03 GB  |
| chair     | 35.79 | 0.987 | 0.017       | 7m 54s  | 3.94 GB  |
| Average   | 33.27 | 0.968 | 0.039       | 8m 20s  | 3.94 GB  |

Tanks and Temples

Tanks and Temples is a benchmark for image-based 3D reconstruction. Its sequences were acquired outside the lab, under realistic conditions, and ground-truth data was captured with an industrial laser scanner. The benchmark covers both outdoor scenes and indoor environments, split into three subsets: training, intermediate, and advanced.

| Scene       | PSNR  | SSIM  | LPIPS | Time    | GPU mem. |
|-------------|-------|-------|-------|---------|----------|
| auditorium  | 23.65 | 0.868 | 0.198 | 16m 56s | 8.54 GB  |
| ballroom    | 23.69 | 0.817 | 0.099 | 22m 29s | 10.59 GB |
| courtroom   | 21.50 | 0.783 | 0.156 | 25m 0s  | 11.56 GB |
| museum      | 20.59 | 0.759 | 0.151 | 31m 10s | 14.12 GB |
| palace      | 19.45 | 0.736 | 0.291 | 14m 46s | 11.51 GB |
| temple      | 21.28 | 0.813 | 0.172 | 16m 24s | 8.49 GB  |
| family      | 24.68 | 0.861 | 0.090 | 20m 33s | 7.49 GB  |
| francis     | 24.84 | 0.897 | 0.158 | 14m 58s | 7.97 GB  |
| horse       | 24.05 | 0.874 | 0.097 | 16m 59s | 6.25 GB  |
| lighthouse  | 21.64 | 0.843 | 0.137 | 15m 11s | 8.53 GB  |
| m60         | 27.61 | 0.905 | 0.098 | 17m 38s | 9.14 GB  |
| panther     | 28.34 | 0.908 | 0.102 | 16m 59s | 9.24 GB  |
| playground  | 24.56 | 0.873 | 0.122 | 20m 28s | 10.22 GB |
| train       | 20.67 | 0.797 | 0.141 | 16m 53s | 9.16 GB  |
| barn        | 27.34 | 0.871 | 0.121 | 16m 7s  | 10.24 GB |
| caterpillar | 22.14 | 0.790 | 0.173 | 17m 16s | 10.48 GB |
| church      | 20.89 | 0.795 | 0.154 | 22m 38s | 13.84 GB |
| courthouse  | 21.49 | 0.785 | 0.228 | 15m 8s  | 21.75 GB |
| ignatius    | 21.03 | 0.772 | 0.148 | 22m 40s | 10.59 GB |
| meetingroom | 23.82 | 0.857 | 0.132 | 17m 54s | 10.39 GB |
| truck       | 24.13 | 0.857 | 0.096 | 20m 25s | 9.21 GB  |
| Average     | 23.21 | 0.832 | 0.146 | 18m 59s | 10.44 GB |
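
The Average row in each table is the unweighted mean over that table's scenes (runtimes and memory are averaged the same way). A minimal sketch of the aggregation, with hypothetical field names:

```python
def table_average(rows):
    """Unweighted per-scene mean; rows are dicts of numeric metrics."""
    keys = ("psnr", "ssim", "lpips")
    return {k: sum(r[k] for r in rows) / len(rows) for k in keys}

# e.g. table_average([{"psnr": 27.44, "ssim": 0.870, "lpips": 0.116}, ...])
```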