Datasets

Blender

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

Authors:
Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng
Paper:

https://arxiv.org/pdf/2003.08934.pdf

Web:

https://www.matthewtancik.com/nerf

ID:
Blender
Evaluation protocol:

nerf (source code)

Blender (nerf-synthetic) is a synthetic dataset used to benchmark NeRF methods. It consists of 8 scenes, each showing a single object on a white background, with cameras placed on the upper hemisphere around the object. Scenes are licensed under various CC licenses.
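
For reference, here is a minimal sketch of loading one split of a Blender scene, assuming the standard transforms_{split}.json layout shipped with the original NeRF release (the 800 × 800 default resolution is an assumption based on the released images):

```python
import json
import numpy as np

def load_blender_split(scene_dir, split="train", width=800, height=800):
    # Each split is described by a transforms_{split}.json file.
    with open(f"{scene_dir}/transforms_{split}.json") as f:
        meta = json.load(f)
    # Focal length is recovered from the horizontal field of view.
    focal = 0.5 * width / np.tan(0.5 * meta["camera_angle_x"])
    # (N, 4, 4) camera-to-world matrices, one per frame.
    poses = np.stack([
        np.array(frame["transform_matrix"], dtype=np.float32)
        for frame in meta["frames"]
    ])
    names = [frame["file_path"] for frame in meta["frames"]]
    return poses, focal, names
```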

Hierarchical 3DGS

A Hierarchical 3D Gaussian Representation for Real-Time Rendering of Very Large Datasets

Authors:
Bernhard Kerbl, Andreas Meuleman, Georgios Kopanas, Michael Wimmer, Alexandre Lanvin, George Drettakis
Paper:

https://repo-sam.inria.fr/fungraph/hierarchical-3d-gaussians/hierarchical-3d-gaussians_low.pdf

Web:

https://repo-sam.inria.fr/fungraph/hierarchical-3d-gaussians/

ID:
Hierarchical 3DGS
Evaluation protocol:

nerf (source code)

Hierarchical 3DGS is a dataset released with the H3DGS paper. We implement the two public single-chunk scenes (SmallCity, Campus) used for evaluation. To collect the dataset, the authors mounted 6 GoPro HERO6 Black cameras on a bicycle helmet (5 for the Campus scene). The SmallCity and BigCity captures were collected from a bicycle ridden at around 6–7 km/h, while Campus was captured on foot while wearing the helmet. Poses were estimated using COLMAP with custom parameters and the hierarchical mapper, followed by additional per-chunk bundle adjustment. It is recommended to use exposure modeling with this dataset.
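
Since exposure modeling is recommended here, the following is a minimal sketch of one common variant: a per-image affine color transform fitted between the rendered and ground-truth images before computing metrics. The function name is illustrative and the exact model used by H3DGS may differ.

```python
import numpy as np

def fit_affine_exposure(rendered, gt):
    """rendered, gt: (H, W, 3) float arrays in [0, 1].

    Fits a 4x3 affine map (3x3 color matrix plus offset) in the
    least-squares sense and returns the exposure-corrected render.
    """
    x = rendered.reshape(-1, 3)
    y = gt.reshape(-1, 3)
    # Homogeneous coordinates so the offset is part of the solve.
    x_h = np.concatenate([x, np.ones((x.shape[0], 1))], axis=1)
    A, *_ = np.linalg.lstsq(x_h, y, rcond=None)
    return np.clip(x_h @ A, 0.0, 1.0).reshape(rendered.shape)
```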

LLFF

Local Light Field Fusion: Practical View Synthesis with Prescriptive Sampling Guidelines

Authors:
Ben Mildenhall, Pratul P. Srinivasan, Rodrigo Ortiz-Cayon, Nima Khademi Kalantari, Ravi Ramamoorthi, Ren Ng, Abhishek Kar
Paper:

https://arxiv.org/pdf/1905.00889.pdf

Web:

https://bmild.github.io/llff/

ID:
LLFF
Evaluation protocol:

nerf (source code)

LLFF is a dataset of real-world forward-facing scenes with only small variations in camera pose. NeRF methods usually represent these scenes in normalized device coordinate (NDC) space.
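
The NDC warp maps forward-facing scene content into the [-1, 1]^3 cube. A sketch following the ndc_rays helper from the official NeRF code release:

```python
import numpy as np

def ndc_rays(H, W, focal, near, rays_o, rays_d):
    """Warp rays into NDC space for forward-facing scenes."""
    # Shift ray origins onto the near plane (z = -near).
    t = -(near + rays_o[..., 2]) / rays_d[..., 2]
    rays_o = rays_o + t[..., None] * rays_d

    # Projected origins.
    o0 = -focal / (0.5 * W) * rays_o[..., 0] / rays_o[..., 2]
    o1 = -focal / (0.5 * H) * rays_o[..., 1] / rays_o[..., 2]
    o2 = 1.0 + 2.0 * near / rays_o[..., 2]

    # Projected directions.
    d0 = -focal / (0.5 * W) * (rays_d[..., 0] / rays_d[..., 2]
                               - rays_o[..., 0] / rays_o[..., 2])
    d1 = -focal / (0.5 * H) * (rays_d[..., 1] / rays_d[..., 2]
                               - rays_o[..., 1] / rays_o[..., 2])
    d2 = -2.0 * near / rays_o[..., 2]

    return np.stack([o0, o1, o2], -1), np.stack([d0, d1, d2], -1)
```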

Mip-NeRF 360

Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields

Authors:
Jonathan T. Barron, Ben Mildenhall, Dor Verbin, Pratul P. Srinivasan, Peter Hedman
Paper:

https://arxiv.org/pdf/2111.12077.pdf

Web:

https://jonbarron.info/mipnerf360/

ID:
Mip-NeRF 360
Evaluation protocol:

nerf (source code)

Mip-NeRF 360 is a collection of four indoor and five outdoor object-centric scenes. The camera trajectory is an orbit around the object with fixed elevation and radius. The test set holds out every n-th frame of the trajectory as test views.
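
A sketch of this hold-out split (n = 8 is the value commonly used for Mip-NeRF 360 evaluation):

```python
def split_train_test(image_paths, n=8):
    # Every n-th frame (0, n, 2n, ...) becomes a test view;
    # the remaining frames are used for training.
    test = image_paths[::n]
    train = [p for i, p in enumerate(image_paths) if i % n != 0]
    return train, test
```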

Nerfstudio

Nerfstudio: A Modular Framework for Neural Radiance Field Development

Authors:
Matthew Tancik, Ethan Weber, Evonne Ng, Ruilong Li, Brent Yi, Justin Kerr, Terrance Wang, Alexander Kristoffersen, Jake Austin, Kamyar Salahi, Abhik Ahuja, David McAllister, Angjoo Kanazawa
Paper:

https://arxiv.org/pdf/2302.04264.pdf

Web:

https://nerf.studio

ID:
Nerfstudio
Evaluation protocol:

default (source code)

The Nerfstudio Dataset includes 10 in-the-wild captures obtained using either a mobile phone or a mirrorless camera with a fisheye lens. We processed the data using either COLMAP or the Polycam app to obtain camera poses and intrinsic parameters.

Photo Tourism

Photo Tourism: Exploring Photo Collections in 3D

Authors:
Noah Snavely, Steven M. Seitz, Richard Szeliski
Paper:

https://phototour.cs.washington.edu/Photo_Tourism.pdf

Web:

https://phototour.cs.washington.edu/

ID:
Photo Tourism
Evaluation protocol:

nerfw (source code)

Photo Tourism is a dataset of images of famous landmarks, such as the Sacre Coeur, the Trevi Fountain, and the Brandenburg Gate. The images were captured by tourists at different times of day and year, so they exhibit varying lighting conditions and occlusions. The evaluation protocol is based on NeRF-W: the per-image appearance embedding is optimized on the left half of each test image, and the metrics are computed on the right half.
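
A sketch of this half-image protocol. For illustration, the appearance optimization is reduced to a closed-form per-channel gain; real implementations instead optimize a learned embedding that is fed to the network.

```python
import numpy as np

def psnr(a, b):
    return -10.0 * np.log10(np.mean((a - b) ** 2))

def nerfw_eval(rendered, gt):
    """rendered, gt: (H, W, 3) float arrays in [0, 1]."""
    w = gt.shape[1] // 2
    left_r, left_g = rendered[:, :w], gt[:, :w]
    # Fit per-channel gains on the left half
    # (a stand-in for appearance-embedding optimization).
    gain = (left_r * left_g).sum((0, 1)) / np.maximum(
        (left_r ** 2).sum((0, 1)), 1e-8)
    corrected = np.clip(rendered * gain, 0.0, 1.0)
    # Report metrics on the right half only.
    return psnr(corrected[:, w:], gt[:, w:])
```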

SeaThru-NeRF

SeaThru-NeRF: Neural Radiance Fields in Scattering Media

Authors:
Deborah Levy, Amit Peleg, Naama Pearl, Dan Rosenbaum, Derya Akkaynak, Tali Treibitz, Simon Korman
Paper:

https://openaccess.thecvf.com/content/CVPR2023/papers/Levy_SeaThru-NeRF_Neural_Radiance_Fields_in_Scattering_Media_CVPR_2023_paper.pdf

Web:

https://sea-thru-nerf.github.io/

Licenses:

Apache 2.0

ID:
SeaThru-NeRF
Evaluation protocol:

default (source code)

The SeaThru-NeRF dataset contains four underwater forward-facing scenes.

Tanks and Temples

Tanks and Temples: Benchmarking Large-Scale Scene Reconstruction

Authors:
Arno Knapitsch, Jaesik Park, Qian-Yi Zhou, Vladlen Koltun
Paper:

https://storage.googleapis.com/t2-downloads/paper/tanks-and-temples.pdf

Web:

https://www.tanksandtemples.org/

ID:
Tanks and Temples
Evaluation protocol:

default (source code)

Tanks and Temples is a benchmark for image-based 3D reconstruction. The benchmark sequences were acquired outside the lab, in realistic conditions. Ground-truth data was captured using an industrial laser scanner. The benchmark includes both outdoor scenes and indoor environments. The dataset is split into three subsets: training, intermediate, and advanced.

Zip-NeRF

Zip-NeRF: Anti-Aliased Grid-Based Neural Radiance Fields

Authors:
Jonathan T. Barron, Ben Mildenhall, Dor Verbin, Pratul P. Srinivasan, Peter Hedman
Paper:

https://arxiv.org/pdf/2304.06706.pdf

Web:

https://jonbarron.info/zipnerf/

ID:
Zip-NeRF
Evaluation protocol:

nerf (source code)

Zip-NeRF is a dataset of four large scenes: Berlin, Alameda, London, and NYC (1,000–2,000 photos each), captured using fisheye cameras. This implementation uses the undistorted images provided with the dataset; the downsampled resolutions range from 1392 × 793 to 2000 × 1140 pixels depending on the scene. It is recommended to use exposure modeling with this dataset if available.