A pre-trained single-image reconstruction model produces noisy Gaussian parameters that serve as the optimization target.

Method	5f PSNR	5f SSIM	5f LPIPS	10f PSNR	10f SSIM	10f LPIPS	Wide PSNR	Wide SSIM	Wide LPIPS
MINE	28.45	0.897	0.111	25.89	0.850	0.150	24.75	0.820	0.179
Flash3D	28.46	0.899	0.100	25.94	0.857	0.133	24.93	0.833	0.160
UAR-Scenes	28.67	0.902	0.095	26.54	0.861	0.112	27.81	0.887	0.107

Method	Int. PSNR	Int. SSIM	Int. LPIPS	Ext. PSNR	Ext. SSIM	Ext. LPIPS	Ext. FID
PixelNeRF	24.00	0.589	0.550	20.05	0.575	0.567	160.77
Du et al.	24.78	0.820	0.410	21.23	0.760	0.480	14.34
pixelSplat	25.49	0.794	0.291	22.62	0.777	0.216	5.78
latentSplat	25.53	0.853	0.280	23.45	0.801	0.190	2.97
MVSplat	26.39	0.869	0.128	24.04	0.812	0.185	3.87
Flash3D	23.87	0.811	0.185	24.10	0.815	0.185	4.02
UAR-Scenes	26.37	0.871	0.125	24.37	0.819	0.144	2.55

Method	PSNR	SSIM	LPIPS
LDI	16.50	0.572	-
SV-MPI	19.50	0.733	-
BTS	20.10	0.761	0.144
MINE	21.90	0.828	0.112
Flash3D	21.96	0.826	0.132
UAR-Scenes	22.31	0.844	0.128

Model	LVDM	FST	Uncertainty	PSNR	SSIM	LPIPS
Baseline	×	×	×	24.93	0.833	0.160
Baseline + LVDM	✓	×	×	27.24	0.867	0.126
Baseline + LVDM-FST	✓	✓	×	27.33	0.869	0.119
UAR-Scenes	✓	✓	✓	27.81	0.887	0.107

Uncertainty-Aware Diffusion-Guided Refinement of 3D Scenes

Video Presentation