Photo Tourism

Photo Tourism is a dataset of images of famous landmarks, such as the Sacre Coeur, the Trevi Fountain, and the Brandenburg Gate. The images were captured by tourists at different times of the day and year, so they have varying lighting conditions and occlusions. The evaluation protocol follows NeRF-W: the image appearance embeddings are optimized on the left half of each test image, and the metrics are computed on the right half.
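The split used by this protocol can be sketched as follows. This is a minimal illustration, not the benchmark's actual code: all function names are hypothetical, images are plain nested lists of grayscale values, the mean squared error stands in for the real metrics, and giving the extra column of an odd-width image to the right half is an assumption.

```python
def split_halves(image):
    """Split an image (a list of pixel rows) into left and right halves.

    Assumption: for an odd width, the extra column goes to the right half.
    """
    mid = len(image[0]) // 2
    left = [row[:mid] for row in image]
    right = [row[mid:] for row in image]
    return left, right


def mean_squared_error(pred, target):
    """Mean squared error between two images of identical shape."""
    diffs = [
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    ]
    return sum(diffs) / len(diffs)


def evaluate_on_right_half(render_fn, target):
    """Score a prediction on the right half only.

    render_fn(left_half) is a hypothetical stand-in for "optimize the
    appearance embedding on the left half, then render the full image".
    """
    target_left, target_right = split_halves(target)
    prediction = render_fn(target_left)
    _, pred_right = split_halves(prediction)
    return mean_squared_error(pred_right, target_right)


# Toy usage: a "renderer" that fills every pixel with the mean of the left half.
target = [[0.0, 0.2, 0.4, 0.6], [0.0, 0.2, 0.4, 0.6]]


def toy_render(left_half):
    values = [v for row in left_half for v in row]
    mean = sum(values) / len(values)
    return [[mean] * len(target[0]) for _ in target]


error = evaluate_on_right_half(toy_render, target)
```

The renderer never sees the right half of the test image, which is what makes the reported metrics a held-out measurement.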

Method PSNR SSIM LPIPS Time GPU mem.
K-Planes 21.10 0.761 0.313 24m 37s 3.59 GB
Sacre Coeur 19.96 0.762 0.299 24m 26s 3.62 GB
Trevi Fountain 19.70 0.662 0.388 24m 44s 3.59 GB
Brandenburg Gate 23.65 0.859 0.253 24m 40s 3.56 GB
GS-W 21.38 - - 1h 13m 50s 21.93 GB
Sacre Coeur 19.73 0.824 0.210 1h 8m 56s 18.80 GB
Trevi Fountain 20.06 - - 1h 21m 34s 27.92 GB
Brandenburg Gate 24.35 - - 1h 11m 0s 19.06 GB
NeRF-W (reimplementation) 21.75 0.790 0.268 44h 23m 46s 98.80 GB
Sacre Coeur 19.56 0.795 0.260 41h 41m 36s 98.80 GB
Trevi Fountain 21.48 0.693 0.331 49h 43m 22s 98.81 GB
Brandenburg Gate 24.22 0.884 0.213 41h 46m 21s 98.80 GB
Scaffold-GS 23.50 0.854 0.170 1h 27m 49s 18.34 GB
Sacre Coeur 21.85 0.871 0.157 1h 38m 45s 11.38 GB
Trevi Fountain 23.21 0.768 0.228 1h 21m 34s 21.84 GB
Brandenburg Gate 25.45 0.923 0.127 1h 23m 7s 21.80 GB
gsplat 23.66 0.857 0.162 1h 44m 24s 4.68 GB
Sacre Coeur 22.05 0.876 0.154 1h 30m 15s 3.33 GB
Trevi Fountain 22.56 0.765 0.213 2h 12m 0s 7.05 GB
Brandenburg Gate 26.36 0.931 0.118 1h 30m 56s 3.68 GB
WildGaussians 24.65 0.851 0.179 10h 18m 16s 18.24 GB
Sacre Coeur 22.56 0.859 0.177 8h 41m 57s 8.66 GB
Trevi Fountain 23.63 0.766 0.228 13h 14m 51s 38.25 GB
Brandenburg Gate 27.76 0.927 0.133 8h 58m 0s 7.83 GB

Note on GS-W: The original paper reports metrics for test images where the appearance embedding is estimated from the full test image, not just the left half as in the official evaluation protocol. The numbers reported here are computed using the official evaluation protocol and are therefore worse than the numbers reported in the paper (lower PSNR and SSIM, higher LPIPS). The paper's values (PSNR / SSIM / LPIPS) are: average 24.70 / 0.865 / 0.124; Sacre Coeur 23.24 / 0.8632 / 0.13; Trevi Fountain 22.91 / 0.8014 / 0.1563; Brandenburg Gate 27.96 / 0.9319 / 0.0862. A dash marks a value that is not available here.
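Each method row in the table above appears to be the mean over the three scenes. The document does not state this, but the numbers are consistent with it. A minimal sketch that checks the K-Planes row, using hypothetical helper names:

```python
import re

UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_duration(text):
    """Convert a string such as '1h 8m 56s' or '24m 26s' to seconds."""
    return sum(int(n) * UNIT_SECONDS[u] for n, u in re.findall(r"(\d+)([hms])", text))


def format_duration(seconds):
    """Format seconds as '[Hh ]Mm Ss', rounding before splitting into units."""
    total = round(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    parts = [f"{hours}h"] if hours else []
    parts += [f"{minutes}m", f"{secs}s"]
    return " ".join(parts)


def mean(values):
    return sum(values) / len(values)


# Per-scene K-Planes values copied from the table
# (Sacre Coeur, Trevi Fountain, Brandenburg Gate).
psnr_per_scene = [19.96, 19.70, 23.65]
times_per_scene = ["24m 26s", "24m 44s", "24m 40s"]

mean_psnr = f"{mean(psnr_per_scene):.2f}"
mean_time = format_duration(mean([parse_duration(t) for t in times_per_scene]))
print(mean_psnr, mean_time)  # prints "21.10 24m 37s", matching the K-Planes row
```

Rounding the total number of seconds before splitting it into hours, minutes, and seconds keeps the seconds field below 60.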


Peak Signal-to-Noise Ratio (PSNR). Higher is better.

Method Sacre Coeur Trevi Fountain Brandenburg Gate
K-Planes 19.96 19.70 23.65
GS-W 19.73 20.06 24.35
NeRF-W (reimplementation) 19.56 21.48 24.22
Scaffold-GS 21.85 23.21 25.45
gsplat 22.05 22.56 26.36
WildGaussians 22.56 23.63 27.76

GS-W paper's PSNR (appearance embedding estimated from the full test image rather than the left half): 23.24 (Sacre Coeur), 22.91 (Trevi Fountain), 27.96 (Brandenburg Gate).
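For reference, PSNR is derived from the mean squared error between the rendered and ground-truth pixels. The sketch below assumes pixel values in [0, 1] and operates on flat sequences; the function name and signature are illustrative, not the benchmark's API.

```python
import math


def psnr(pred, target, max_value=1.0):
    """PSNR in dB between two equally sized sequences of pixel values."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Every pixel is off by 0.1, so the MSE is 0.01 and the PSNR is about 20 dB.
value = psnr([0.5, 0.5], [0.4, 0.6])
```

Because the scale is logarithmic, a gain of 1 dB corresponds to roughly a 21% reduction in mean squared error.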


Structural Similarity Index (SSIM). Higher is better. The implementation matches JAX's SSIM and torchmetrics' SSIM (with default parameters).

Method Sacre Coeur Trevi Fountain Brandenburg Gate
K-Planes 0.762 0.662 0.859
GS-W 0.824 - -
NeRF-W (reimplementation) 0.795 0.693 0.884
Scaffold-GS 0.871 0.768 0.923
gsplat 0.876 0.765 0.931
WildGaussians 0.859 0.766 0.927

GS-W paper's SSIM (appearance embedding estimated from the full test image rather than the left half): 0.8632 (Sacre Coeur), 0.8014 (Trevi Fountain), 0.9319 (Brandenburg Gate). A dash marks a value that is not available here.
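For reference, the per-window SSIM between two patches x and y is usually written as below, where the means, variances, and covariance are computed under a local window and L is the dynamic range of the pixel values. The constants k_1 = 0.01 and k_2 = 0.03 with an 11x11 Gaussian window (sigma = 1.5) are the commonly used defaults; treating them as what "default parameters" refers to here is an assumption.

```latex
\mathrm{SSIM}(x, y) =
  \frac{(2\mu_x \mu_y + c_1)\,(2\sigma_{xy} + c_2)}
       {(\mu_x^2 + \mu_y^2 + c_1)\,(\sigma_x^2 + \sigma_y^2 + c_2)},
\qquad c_1 = (k_1 L)^2, \quad c_2 = (k_2 L)^2
```

The image-level score is the mean of this quantity over all window positions.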


Learned Perceptual Image Patch Similarity (LPIPS). Lower is better. The implementation uses an AlexNet backbone and matches the lpips pip package with checkpoint version 0.1.

Method Sacre Coeur Trevi Fountain Brandenburg Gate
K-Planes 0.299 0.388 0.253
GS-W 0.210 - -
NeRF-W (reimplementation) 0.260 0.331 0.213
Scaffold-GS 0.157 0.228 0.127
gsplat 0.154 0.213 0.118
WildGaussians 0.177 0.228 0.133

GS-W paper's LPIPS (appearance embedding estimated from the full test image rather than the left half): 0.13 (Sacre Coeur), 0.1563 (Trevi Fountain), 0.0862 (Brandenburg Gate). A dash marks a value that is not available here.
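A comparable LPIPS value can likely be obtained from the lpips pip package with the AlexNet backbone and checkpoint version 0.1. The sketch below is not the benchmark's own code: the wrapper name and the constant are hypothetical, it assumes torch and lpips are installed, and it assumes inputs are float tensors of shape (N, 3, H, W) with values in [0, 1].

```python
# Settings named in the text: AlexNet backbone, checkpoint version 0.1.
LPIPS_KWARGS = {"net": "alex", "version": "0.1"}


def lpips_distance(pred, target):
    """Return one LPIPS distance per image pair (lower is better).

    pred, target: float tensors of shape (N, 3, H, W) with values in [0, 1].
    """
    # Imported lazily so the module loads without the heavy dependencies.
    import torch
    import lpips

    model = lpips.LPIPS(**LPIPS_KWARGS)
    with torch.no_grad():
        # normalize=True tells lpips to rescale [0, 1] inputs to [-1, 1].
        return model(pred, target, normalize=True).flatten()
```

For repeated evaluation, constructing the model once and reusing it avoids reloading the weights on every call.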