Neural Light Transport for Relighting and View Synthesis

Xiuming Zhang 1   Sean Fanello 2   Yun-Ta Tsai 2   Tiancheng Sun 3   Tianfan Xue 2   Rohit Pandey 2   Sergio Orts-Escolano 2   Philip Davidson 2   Christoph Rhemann 2   Paul Debevec 2   Jonathan T. Barron 2   Ravi Ramamoorthi 3   William T. Freeman 1, 2


1 Massachusetts Institute of Technology (MIT) 2 Google 3 University of California, San Diego (UCSD)

Abstract

The light transport (LT) of a scene describes how it appears under different lighting and viewing directions, and complete knowledge of a scene's LT enables the synthesis of novel views under arbitrary lighting. In this paper, we focus on image-based LT acquisition, primarily for human bodies within a light stage setup. We propose a semi-parametric approach to learn a neural representation of LT that is embedded in the space of a texture atlas of known geometric properties, and model all non-diffuse and global LT as residuals added to a physically-accurate diffuse base rendering. In particular, we show how to fuse previously seen observations of illuminants and views to synthesize a new image of the same scene under a desired lighting condition from a chosen viewpoint. This strategy allows the network to learn complex material effects (such as subsurface scattering) and global illumination, while guaranteeing the physical correctness of the diffuse LT (such as hard shadows). With this learned LT, one can relight the scene photorealistically with a directional light or an HDRI map, synthesize novel views with view-dependent effects, or do both simultaneously, all in a unified framework using a set of sparse, previously seen observations. Qualitative and quantitative experiments demonstrate that our neural LT (NLT) outperforms state-of-the-art solutions for relighting and view synthesis, without the separate treatment of the two problems that prior work requires.
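
To make this composition concrete, here is a minimal NumPy-style sketch of the residual formulation (diffuse base plus learned residual); the function and buffer names are hypothetical, and the actual NLT model operates on texture-atlas-space buffers with a learned network rather than this toy interface.

import numpy as np

def render(diffuse_base, predict_residual, light_dir, view_dir):
    """Compose a physically accurate diffuse base with a learned residual.

    diffuse_base:     (H, W, 3) diffuse rendering for this light/view pair,
                      e.g., albedo times a shadowed cosine term computed
                      from the known geometry.
    predict_residual: callable returning an (H, W, 3) residual that models
                      non-diffuse and global light transport (specularities,
                      subsurface scattering, interreflections, etc.).
    """
    residual = predict_residual(diffuse_base, light_dir, view_dir)
    return np.clip(diffuse_base + residual, 0.0, None)  # keep radiance non-negative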

(A) Neural Light Transport (NLT) learns to interpolate the 6D light transport function of a surface as a function of its UV coordinate (2 DOFs), its incident light direction (2 DOFs), and its viewing direction (2 DOFs). (B) The subject is imaged from multiple viewpoints and multiple directional lights; a geometry proxy is also captured using active sensors. (C) Querying the learned function at different light or viewing directions enables simultaneous relighting and view synthesis of the subject. (D) The relit renderings that NLT produces can be combined with HDRI maps to perform image-based relighting.
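
For step (D), image-based relighting exploits the linearity of light transport: the renderings produced under the individual directional lights can be combined as a weighted sum, with weights given by the HDRI map's radiance toward each light direction. A rough sketch, where the array shapes and the solid-angle weighting are assumptions rather than the paper's exact pipeline:

import numpy as np

def relight_with_hdri(relit_images, light_radiance, solid_angles):
    """Weighted sum of per-light renderings for HDRI relighting.

    relit_images:   (N, H, W, 3) renderings, one per directional light.
    light_radiance: (N, 3) HDRI radiance sampled toward each light direction.
    solid_angles:   (N,) solid angle assigned to each light sample.
    """
    weights = light_radiance * solid_angles[:, None]  # (N, 3)
    return np.einsum('nc,nhwc->hwc', weights, relit_images)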

Paper

Neural Light Transport for Relighting and View Synthesis
Xiuming Zhang, Sean Fanello, Yun-Ta Tsai, Tiancheng Sun, Tianfan Xue, Rohit Pandey, Sergio Orts-Escolano, Philip Davidson, Christoph Rhemann, Paul Debevec, Jonathan T. Barron, Ravi Ramamoorthi, William T. Freeman
arXiv 2020
Original Resolution & Optimal Layout (60 MB)  /   arXiv  /   BibTeX

BibTeX
@article{zhang2020neural,
    title={Neural Light Transport for Relighting and View Synthesis},
    author={Zhang, Xiuming and Fanello, Sean and Tsai, Yun-Ta and Sun, Tiancheng and Xue, Tianfan and Pandey, Rohit and Orts-Escolano, Sergio and Davidson, Philip and Rhemann, Christoph and Debevec, Paul and Barron, Jonathan T. and Ramamoorthi, Ravi and Freeman, William T.},
    journal={arXiv preprint arXiv:2008.03806},
    year={2020},
}

v2 (Aug. 20, 2020), superseding v1 (Aug. 10, 2020)

Video

Original Resolution (1 GB)  /   720p (187 MB)  /   480p (105 MB)


v2 (Aug. 20, 2020), superseding v1 (Aug. 10, 2020)

Code

This GitHub repo includes the code for rendering the synthetic data as well as for training and testing NLT models.

Downloads

If you simply want to try our trained models, see "Pre-Trained Models." If you want to modify our model and retrain it with our data, see "Metadata" and "Rendered Data" as well as the code above. If you want to render your own data, see "Metadata" and the code above.

Metadata

The metadata .zip (304 KB) contains:

Here are visualizations of all the cameras and lights that we use for the synthetic data:

Rendered Data

We are continuing to add synthetic scenes; the ones currently available are:

Each .zip corresponds to one scene and contains the .blend scene file, the images rendered from 64 camera views (each under 331 lighting conditions), and a status JSON describing those rendered images.
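
As an illustration, one might check a downloaded scene for completeness as sketched below; the directory layout, the file name status.json, and its schema (a flag per camera-light pair) are assumptions here, so consult the repository's data code for the actual conventions.

import json
from pathlib import Path

N_CAMS, N_LIGHTS = 64, 331  # views and lighting conditions per scene

def check_scene(scene_dir):
    """Report how many camera-light renders a scene folder contains."""
    scene_dir = Path(scene_dir)
    # Hypothetical: a JSON mapping "cam_XXX/light_YYY" keys to done flags.
    status = json.loads((scene_dir / 'status.json').read_text())
    n_done = sum(1 for done in status.values() if done)
    expected = N_CAMS * N_LIGHTS
    print(f'{scene_dir.name}: {n_done}/{expected} renders present')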

Pre-Trained Models

With a trained model, one can perform simultaneous relighting and view synthesis by querying it at novel camera and light locations. Below are the NLT results for the dragon scenes, rendered along the "Test" paths above, together with the trained models that produced them.

Note: As the camera circles around the object, the light also moves, exactly following the camera path (i.e., the camera and light are co-located during that period). Therefore, although the models seem to be doing only view synthesis as the camera circles around the object, they are in fact performing simultaneous relighting and view synthesis all along.
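
For reference, a co-located camera/light orbit like the one described above could be generated as sketched below; the radius, height, and frame count are illustrative values, not those of the released test paths.

import numpy as np

def colocated_orbit(n_frames=120, radius=3.0, height=1.5):
    """Camera and light positions for a circular orbit around the origin."""
    angles = np.linspace(0.0, 2.0 * np.pi, n_frames, endpoint=False)
    cams = np.stack([radius * np.cos(angles),
                     radius * np.sin(angles),
                     np.full(n_frames, height)], axis=-1)  # (n_frames, 3)
    lights = cams.copy()  # co-located: the light follows the camera exactly
    return cams, lights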