
We present a method for scaling NeRFs to learn a large number of semantically similar scenes. We combine two techniques to reduce the per-scene training time and memory cost. First, we learn a 3D-aware latent space in which we train Tri-Planes scene representations, thereby lowering the resolution at which scenes are learned. Second, we share common information across scene representations, reducing the model complexity needed to learn any particular scene. Our method reduces the effective per-scene memory cost by 44% and the per-scene training time by 86% when training 1,000 scenes.

Method

Figure: Overview of our method.

We learn a 3D-aware latent space by regularizing its training with 3D constraints. To this end, we jointly train an encoder, a decoder, and N scenes lying in this latent space. For each scene s, we learn a Tri-Planes representation T_s, built by concatenating local Tri-Planes T_s^mic and global Tri-Planes T_s^mac. T_s^mic is retrieved via a one-hot vector e_s from a set of scene-specific planes stored in memory. T_s^mac is computed as a sum of M globally shared Tri-Planes, weighted by scene-specific weights W_s.
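
To make this construction concrete, here is a minimal sketch of how T_s could be assembled, assuming PyTorch; the number of shared planes M, the plane resolution R, and the channel count C below are illustrative placeholders, not the paper's values.

# Hedged sketch (not the authors' code): assembling the per-scene Tri-Planes T_s
# from scene-specific "micro" planes and M globally shared "macro" planes.
import torch
import torch.nn as nn

class SharedTriPlanes(nn.Module):
    def __init__(self, num_scenes: int, M: int = 16, C: int = 16, R: int = 64):
        super().__init__()
        # Scene-specific micro planes: one set of 3 planes per scene, retrieved by
        # indexing with the scene id (equivalent to multiplying by a one-hot vector e_s).
        self.micro = nn.Parameter(torch.randn(num_scenes, 3, C, R, R) * 0.01)
        # M globally shared macro planes, common to all scenes.
        self.macro = nn.Parameter(torch.randn(M, 3, C, R, R) * 0.01)
        # Per-scene mixing weights W_s over the M shared planes.
        self.weights = nn.Parameter(torch.zeros(num_scenes, M))

    def forward(self, scene_ids: torch.Tensor) -> torch.Tensor:
        # T_s^mic: lookup of the scene-specific planes.
        t_mic = self.micro[scene_ids]                      # (B, 3, C, R, R)
        # T_s^mac: weighted sum of the shared planes with weights W_s.
        w = self.weights[scene_ids]                        # (B, M)
        t_mac = torch.einsum("bm,mpcrq->bpcrq", w, self.macro)
        # T_s: concatenation of micro and macro planes along the channel axis.
        return torch.cat([t_mic, t_mac], dim=2)            # (B, 3, 2C, R, R)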

Learning scenes with 3Da-AE

After training a 3D-aware autoencoder, we utilize it to train Tri-Plane scene representations in its latent space.
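
As a rough illustration of this stage, the sketch below optimizes a single latent Tri-Planes scene against targets produced by the frozen encoder; encoder, render_triplanes, and the view/pose arrays are assumed placeholders standing in for the paper's components, not the released code.

# Hedged sketch: training a latent Tri-Planes scene inside the frozen 3Da-AE latent space.
import torch
import torch.nn.functional as F

def train_latent_scene(triplanes, encoder, render_triplanes, views, poses,
                       steps: int = 5000, lr: float = 1e-2):
    opt = torch.optim.Adam([triplanes], lr=lr)
    encoder.eval()  # the 3D-aware encoder is kept frozen at this stage
    for step in range(steps):
        i = step % len(views)
        with torch.no_grad():
            # Target latent image: the ground-truth view encoded into the 3D-aware latent space.
            z_target = encoder(views[i:i+1])
        # Render the scene's Tri-Planes from the same camera pose, directly in latent space
        # (i.e. at the reduced latent resolution, which is what makes training cheap).
        z_pred = render_triplanes(triplanes, poses[i])
        loss = F.mse_loss(z_pred, z_target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return triplanes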

Comparison of Tri-Planes trained in our 3D-aware latent space (3Da-AE, ours) and in the baseline latent space (Baseline AE).

As illustrated above, our latent space is well suited for learning 3D scenes. To maximize rendering quality in image space, we then briefly fine-tune the decoder; below, we compare the result to RGB Tri-Planes.
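
The sketch below outlines what such a decoder fine-tuning step could look like under assumed components (decoder, render_triplanes, a dataset.sample() helper); it is an illustration, not the authors' implementation.

# Hedged sketch: brief decoder fine-tuning to maximize rendering quality in image space.
import torch
import torch.nn.functional as F

def finetune_decoder(decoder, render_triplanes, scenes, dataset,
                     steps: int = 1000, lr: float = 1e-4):
    opt = torch.optim.Adam(decoder.parameters(), lr=lr)
    for step in range(steps):
        scene_id, pose, rgb_gt = dataset.sample()   # hypothetical sampler of (scene, pose, image)
        with torch.no_grad():
            # Latent rendering of the already-trained (now frozen) latent Tri-Planes.
            z = render_triplanes(scenes[scene_id], pose)
        rgb_pred = decoder(z)                       # decode the latent render to image space
        loss = F.mse_loss(rgb_pred, rgb_gt)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return decoder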

Comparison of Latent Tri-Planes (our method) and RGB Tri-Planes.

Citation

@article{schnepf2024exploring,
      title={Exploring 3D-aware Latent Spaces for Efficiently Learning Numerous Scenes}, 
      author={Antoine Schnepf and Karim Kassab and Jean-Yves Franceschi and Laurent Caraffa and Flavian Vasile and Jeremie Mary and Andrew Comport and Valérie Gouet-Brunet},
      journal={arXiv preprint arXiv:2403.11678},
      year={2024}
}