In this article, we propose a new approach to novel-view synthesis that models both diffuse and reflective (specular) appearance to create photorealistic images from arbitrary viewpoints. Unlike traditional methods that rely solely on one of these components, our approach jointly optimizes both to produce more accurate and visually appealing results.
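For intuition, one common way of composing such components, following the Ref-NeRF-style decomposition that we later compare against, is a minimal sketch of the form

$$\mathbf{c}(\mathbf{x}, \boldsymbol{\omega}_o) = \gamma\big(\mathbf{c}_d(\mathbf{x}) + \mathbf{s}(\mathbf{x}) \odot \mathbf{c}_r(\mathbf{x}, \boldsymbol{\omega}_r)\big),$$

where $\mathbf{c}_d$ denotes a view-independent diffuse color, $\mathbf{c}_r$ a view-dependent reflected color queried along the reflected direction $\boldsymbol{\omega}_r$, $\mathbf{s}$ a learned specular tint, and $\gamma$ a tone-mapping function; these symbols are used for illustration only and need not match the exact formulation of our method.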
To achieve this, we introduce a new loss function that combines an L1 reconstruction loss with several additional terms: a supervision loss for the density proposal networks (Lprop), a distortion loss that encourages density sparsity (Ldist), and a normal prediction loss that guides the predicted normals (Lnorm). Optimizing all of these components together gives us finer control over the diffuse and reflective components, producing more realistic images.
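Assembled as a weighted sum, the combined objective can be sketched as

$$\mathcal{L} = \mathcal{L}_{1} + \lambda_{\text{prop}}\,\mathcal{L}_{\text{prop}} + \lambda_{\text{dist}}\,\mathcal{L}_{\text{dist}} + \lambda_{\text{norm}}\,\mathcal{L}_{\text{norm}},$$

where $\mathcal{L}_{1}$ is the L1 reconstruction loss and the weighting coefficients $\lambda$ are illustrative placeholders, since their values are not specified here.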
To illustrate our approach’s effectiveness, we conduct comprehensive experiments on three datasets: Eyeful Tower, NISR, and Shiny. Our results show that our method outperforms existing methods in terms of both objective metrics (SSIM, PSNR) and visual quality. Particularly noteworthy is the improvement in normal map visualizations, as our approach produces more accurate decompositions and normals than Ref-NeRF [41].
By jointly optimizing the diffuse and reflective components, we can synthesize novel-view images that are more accurate and visually appealing than those produced by traditional methods. Our approach has important implications for a variety of applications, including virtual reality, augmented reality, and computer graphics.