A curated list of resources on implicit neural representations, inspired by awesome-computer-vision.
The list has been pared down to make it easier to survey for a graduation thesis project.
Implicit Neural Representations (sometimes also referred to as coordinate-based representations) are a novel way to parameterize signals of all kinds. Conventional signal representations are usually discrete - for instance, images are discrete grids of pixels, audio signals are discrete samples of amplitudes, and 3D shapes are usually parameterized as grids of voxels, point clouds, or meshes. In contrast, Implicit Neural Representations parameterize a signal as a continuous function that maps the domain of the signal (i.e., a coordinate, such as a pixel coordinate for an image) to whatever is at that coordinate (for an image, an R,G,B color). Of course, these functions are usually not analytically tractable - it is impossible to "write down" the function that parameterizes a natural image as a mathematical formula. Implicit Neural Representations thus approximate that function via a neural network.
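In code, such a representation is just a small network over coordinates. A minimal sketch in PyTorch (layer sizes and the `CoordinateMLP` name are illustrative, not taken from any particular paper):

```python
import torch
import torch.nn as nn

# Hypothetical minimal coordinate-based network: maps a 2D pixel
# coordinate (x, y) to an RGB color. Widths/depths are illustrative.
class CoordinateMLP(nn.Module):
    def __init__(self, in_dim=2, hidden=256, out_dim=3, depth=4):
        super().__init__()
        layers, dim = [], in_dim
        for _ in range(depth):
            layers += [nn.Linear(dim, hidden), nn.ReLU()]
            dim = hidden
        layers.append(nn.Linear(dim, out_dim))
        self.net = nn.Sequential(*layers)

    def forward(self, coords):
        # coords: (N, 2) in [-1, 1]^2 -> colors: (N, 3)
        return self.net(coords)

model = CoordinateMLP()
coords = torch.rand(1024, 2) * 2 - 1   # random query coordinates
colors = model(coords)                 # (1024, 3)
```

Fitting the network to one image (regressing colors at known pixel coordinates) then makes the network itself the representation of that image.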
Implicit Neural Representations have several benefits: First, they are no longer coupled to spatial resolution, the way, for instance, an image is coupled to the number of pixels. This is because they are continuous functions! Thus, the memory required to parameterize the signal is independent of spatial resolution, and only scales with the complexity of the underlying signal. Another corollary of this is that implicit representations have "infinite resolution" - they can be sampled at arbitrary spatial resolutions.
This is immediately useful for a number of applications, such as super-resolution, or in parameterizing signals in 3D and higher dimensions, where memory requirements grow intractably fast with spatial resolution. Further, generalizing across neural implicit representations amounts to learning a prior over a space of functions, implemented via learning a prior over the weights of neural networks - this is commonly referred to as meta-learning and is an extremely exciting intersection of two very active research areas! Another exciting overlap is between neural implicit representations and the study of symmetries in neural network architectures - for instance, creating a neural network architecture that is 3D rotation-equivariant immediately yields a viable path to rotation-equivariant generative models via neural implicit representations.
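The "sample at arbitrary resolutions" property is concrete: once fitted, the same network can be queried on a coordinate grid of any size. A sketch, where `model` stands in for any module mapping (N, 2) coordinates to (N, 3) colors (the untrained `nn.Linear` below is only a placeholder for a trained representation):

```python
import torch

def sample_image(model, height, width):
    # Build a (H, W, 2) grid of coordinates in [-1, 1]^2 and query the
    # network at every grid point.
    ys = torch.linspace(-1, 1, height)
    xs = torch.linspace(-1, 1, width)
    grid = torch.stack(torch.meshgrid(ys, xs, indexing="ij"), dim=-1)
    with torch.no_grad():
        colors = model(grid.reshape(-1, 2))
    return colors.reshape(height, width, 3)

model = torch.nn.Linear(2, 3)          # placeholder for a fitted network
img_lo = sample_image(model, 32, 32)   # (32, 32, 3)
img_hi = sample_image(model, 512, 512) # (512, 512, 3), same weights
```

The memory cost of the representation is the network's weights, regardless of which resolution is rendered.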
Another key promise of implicit neural representations lies in algorithms that directly operate in the space of these representations. In other words: What's the "convolutional neural network" equivalent of a neural network operating on images represented by implicit representations?
This is a list of Google Colabs that immediately allow you to jump in and toy around with implicit neural representations!
- Implicit Neural Representations with Periodic Activation Functions shows how to fit images, audio signals, and even solve simple Partial Differential Equations with the SIREN architecture.
- Neural Radiance Fields (NeRF) shows how to fit a neural radiance field, allowing novel view synthesis of a single 3D scene.
- MetaSDF & MetaSiren shows how you can leverage gradient-based meta-learning to generalize across neural implicit representations.
- Neural Descriptor Fields shows how you can use globally conditioned neural implicit representations as self-supervised correspondence learners, enabling robotic imitation tasks.
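The SIREN architecture from the first Colab above replaces ReLU with a sine nonlinearity. A sketch of one such layer, using the frequency factor omega_0 = 30 and the uniform initialization described in the paper (exact module names are illustrative):

```python
import math
import torch
import torch.nn as nn

# Sketch of a SIREN layer (Sitzmann et al. 2020): a linear map followed
# by sin(omega_0 * x), with the paper's layer-dependent initialization.
class SineLayer(nn.Module):
    def __init__(self, in_dim, out_dim, omega_0=30.0, is_first=False):
        super().__init__()
        self.omega_0 = omega_0
        self.linear = nn.Linear(in_dim, out_dim)
        with torch.no_grad():
            if is_first:
                bound = 1.0 / in_dim
            else:
                bound = math.sqrt(6.0 / in_dim) / omega_0
            self.linear.weight.uniform_(-bound, bound)

    def forward(self, x):
        return torch.sin(self.omega_0 * self.linear(x))

siren = nn.Sequential(
    SineLayer(2, 256, is_first=True),
    SineLayer(256, 256),
    nn.Linear(256, 3),
)
out = siren(torch.rand(16, 2) * 2 - 1)   # (16, 3)
```

The periodic activation is what lets the network represent fine detail and well-behaved higher-order derivatives, which the PDE experiments in the Colab rely on.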
The following three papers first (and concurrently) demonstrated that implicit neural representations outperform grid-, point-, and mesh-based representations in parameterizing geometry and seamlessly allow for learning priors over shapes.
- DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation (Park et al. 2019)
- Occupancy Networks: Learning 3D Reconstruction in Function Space (Mescheder et al. 2019)
- IM-Net: Learning Implicit Fields for Generative Shape Modeling (Chen et al. 2018)
Since then, implicit neural representations have achieved state-of-the-art results in 3D computer vision:
- Sal: Sign agnostic learning of shapes from raw data (Atzmon et al. 2019) shows how we may learn SDFs from raw data (i.e., without ground-truth signed distance values)
- Implicit Geometric Regularization for Learning Shapes (Gropp et al. 2020) shows how we may learn SDFs from raw data (i.e., without ground-truth signed distance values)
- Local Implicit Grid Representations for 3D Scenes, Convolutional Occupancy Networks, Deep Local Shapes: Learning Local SDF Priors for Detailed 3D Reconstruction concurrently proposed hybrid voxelgrid/implicit representations to fit large-scale 3D scenes.
- Implicit Neural Representations with Periodic Activation Functions (Sitzmann et al. 2020) demonstrates how we may parameterize room-scale 3D scenes via a single implicit neural representation by leveraging sinusoidal activation functions.
- Neural Unsigned Distance Fields for Implicit Function Learning (Chibane et al. 2020) proposes to learn unsigned distance fields from raw point clouds, doing away with the requirement of water-tight surfaces.
- Pifu: Pixel-aligned implicit function for high-resolution clothed human digitization (Saito et al. 2019) first introduced the concept of conditioning an implicit representation on local features extracted from context images. Follow-up work achieves photo-realistic, real-time re-rendering.
- Texture Fields: Learning Texture Representations in Function Space (Oechsle et al.)
- Occupancy flow: 4d reconstruction by learning particle dynamics (Niemeyer et al. 2019) first proposed to learn a space-time neural implicit representation by representing a 4D warp field with an implicit neural representation.
The following papers concurrently proposed to leverage a similar approach for the reconstruction of dynamic scenes from 2D observations only via Neural Radiance Fields.
- D-NeRF: Neural Radiance Fields for Dynamic Scenes
- Deformable Neural Radiance Fields
- Neural Radiance Flow for 4D View Synthesis and Video Processing
- Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes
- Space-time Neural Irradiance Fields for Free-Viewpoint Video
- Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Deforming Scene from Monocular Video
- Vector Neurons: A General Framework for SO(3)-Equivariant Networks (Deng et al. 2021) makes conditional implicit neural representations equivariant to SO(3), enabling the learning of a rotation-equivariant shape space and subsequent reconstruction of 3D geometry of single objects in unseen poses.
The following four papers concurrently proposed to condition an implicit neural representation on local features stored in a voxelgrid:
- Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion
- Local Implicit Grid Representations for 3D Scenes
- Convolutional Occupancy Networks
- Deep Local Shapes: Learning Local SDF Priors for Detailed 3D Reconstruction
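The shared idea in these four papers can be sketched in a few lines: interpolate a feature from a voxelgrid at each query point, then decode [coordinate, feature] with a small MLP. All shapes and sizes below are illustrative, not from any one of the papers:

```python
import torch
import torch.nn.functional as F

# Feature voxelgrid (B, C, D, H, W) and query points in [-1, 1]^3.
feature_grid = torch.randn(1, 32, 16, 16, 16)
coords = torch.rand(1, 1000, 3) * 2 - 1

# Trilinearly interpolate a feature at each query point.
# grid_sample expects the grid shaped (B, D_out, H_out, W_out, 3).
sampled = F.grid_sample(
    feature_grid,
    coords.view(1, 1000, 1, 1, 3),
    align_corners=True,
)                                                      # (1, 32, 1000, 1, 1)
features = sampled.view(1, 32, 1000).permute(0, 2, 1)  # (1, 1000, 32)

decoder = torch.nn.Sequential(
    torch.nn.Linear(3 + 32, 128), torch.nn.ReLU(), torch.nn.Linear(128, 1)
)
occupancy = decoder(torch.cat([coords, features], dim=-1))  # (1, 1000, 1)
```

Because the decoder only sees local features, it can be small and shared across the whole scene, which is what makes these hybrids scale to large scenes.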
This has since been leveraged for inverse graphics as well:
- Neural Sparse Voxel Fields applies a similar concept to neural radiance fields.
- Pixel-NERF (Yu et al. 2020) proposes to condition a NeRF on local features lying on camera rays, extracted from context images, as proposed in PiFU (see "from 3D supervision").
The following papers condition a deep signed distance function on local patches:
- Local Deep Implicit Functions for 3D Shape
- PatchNets: Patch-Based Generalizable Deep Implicit 3D Shape Representations
- Inferring Semantic Information with 3D Neural Scene Representations leverages features learned by Scene Representation Networks for weakly supervised semantic segmentation of 3D objects.
- Neural Descriptor Fields: SE(3)-Equivariant Object Representations for Manipulation leverages features learned by occupancy networks to establish correspondence, used for robotics imitation learning.
- DeepSDF, Occupancy Networks, IM-Net concurrently proposed conditioning via concatenation.
- Pifu: Pixel-aligned implicit function for high-resolution clothed human digitization (Saito et al. 2019) proposed to locally condition implicit representations on ray features extracted from context images.
- Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations (Sitzmann et al. 2019) proposed meta-learning via hypernetworks.
- MetaSDF: Meta-Learning Signed Distance Functions (Sitzmann et al. 2020) proposed gradient-based meta-learning for implicit neural representations.
- SDF-SRN: Learning Signed Distance 3D Object Reconstruction from Static Images (Lin et al. 2020) shows how to learn 3D implicit representations from single-image supervision only.
- Learned Initializations for Optimizing Coordinate-Based Neural Representations (Tancik et al. 2020) explored gradient-based meta-learning for NeRF.
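The simplest of the generalization strategies above, conditioning via concatenation (as in DeepSDF, Occupancy Networks, and IM-Net), appends a per-shape latent code to every query coordinate. A sketch with illustrative sizes:

```python
import torch
import torch.nn as nn

# One latent code per shape in the training set; in DeepSDF's
# auto-decoder setup these codes are optimized jointly with the network.
latent_dim, n_shapes = 64, 10
codes = nn.Embedding(n_shapes, latent_dim)

sdf_net = nn.Sequential(
    nn.Linear(3 + latent_dim, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 1),
)

shape_idx = torch.tensor([3])
points = torch.rand(500, 3) * 2 - 1
z = codes(shape_idx).expand(points.shape[0], -1)   # (500, 64)
sdf = sdf_net(torch.cat([points, z], dim=-1))      # (500, 1)
```

Hypernetworks (Scene Representation Networks) and gradient-based meta-learning (MetaSDF, learned initializations) instead specialize the network's weights themselves per signal.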
- Neural Radiance Fields (NeRF) (Mildenhall et al. 2020) proposed positional encodings.
- Implicit Neural Representations with Periodic Activation Functions (Sitzmann et al. 2020) proposed implicit representations with periodic nonlinearities.
- Fourier features let networks learn high frequency functions in low dimensional domains (Tancik et al. 2020) explores positional encodings in an NTK framework.
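Positional encoding as used in NeRF and analyzed in the Fourier-features paper maps each coordinate through sines and cosines at geometrically increasing frequencies before the MLP sees it. A sketch (the frequency schedule pi * 2^k follows NeRF; `num_freqs` is illustrative):

```python
import math
import torch

def positional_encoding(coords, num_freqs=6):
    # coords: (N, D) -> (N, D * 2 * num_freqs)
    freqs = (2.0 ** torch.arange(num_freqs)) * math.pi   # pi * 2^k
    angles = coords[..., None] * freqs                   # (N, D, num_freqs)
    enc = torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)
    return enc.flatten(start_dim=-2)

x = torch.rand(8, 3)
encoded = positional_encoding(x)   # (8, 36)
```

Without such an encoding (or a periodic nonlinearity as in SIREN), a plain ReLU MLP is biased toward low-frequency functions and blurs out fine detail.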
- Compositional pattern producing networks: A novel abstraction of development (Stanley et al. 2007) first proposed to parameterize images implicitly via neural networks.
- Implicit Neural Representations with Periodic Activation Functions (Sitzmann et al. 2020) proposed to generalize across implicit representations of images via hypernetworks.
- X-Fields: Implicit Neural View-, Light- and Time-Image Interpolation (Bemana et al. 2020) parameterizes the Jacobian of pixel position with respect to view, time, illumination, etc. to naturally interpolate images.
- Learning Continuous Image Representation with Local Implicit Image Function (Chen et al. 2020) proposes a locally conditioned implicit function over image feature maps, enabling arbitrary-scale super-resolution.
- Alias-Free Generative Adversarial Networks (StyleGAN3) uses a FiLM-conditioned MLP as an image GAN.
The following papers propose to assemble scenes from per-object 3D implicit neural representations.
- GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields (Niemeyer et al. 2021)
- Object-centric Neural Rendering (Guo et al. 2020)
- Unsupervised Discovery of Object Radiance Fields (Yu et al. 2021)
- Implicit Geometric Regularization for Learning Shapes (Gropp et al. 2020) learns SDFs by enforcing constraints of the Eikonal equation via the loss.
- Implicit Neural Representations with Periodic Activation Functions (Sitzmann et al. 2020) proposes to leverage the periodic sine as an activation function, enabling the parameterization of functions with non-trivial higher-order derivatives and the solution of complicated PDEs.
- AutoInt: Automatic Integration for Fast Neural Volume Rendering (Lindell et al. 2020)
- MeshfreeFlowNet: Physics-Constrained Deep Continuous Space-Time Super-Resolution Framework (Jiang et al. 2020) performs super-resolution for spatio-temporal flow functions using local implicit representations, with auxiliary PDE losses.
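The Eikonal regularization from Implicit Geometric Regularization is compact enough to sketch: a signed distance function satisfies |grad f(x)| = 1 almost everywhere, so one penalizes deviations from unit gradient norm at sampled points. The small `net` below is only a stand-in for any coordinate-to-scalar network:

```python
import torch

def eikonal_loss(net, points):
    # Penalize (|grad_x f(x)| - 1)^2 at the sampled points; create_graph
    # keeps the gradient differentiable so the loss can be backpropagated.
    points = points.clone().requires_grad_(True)
    sdf = net(points)
    grad = torch.autograd.grad(
        sdf, points, grad_outputs=torch.ones_like(sdf), create_graph=True
    )[0]
    return ((grad.norm(dim=-1) - 1.0) ** 2).mean()

net = torch.nn.Sequential(
    torch.nn.Linear(3, 64), torch.nn.Softplus(beta=100), torch.nn.Linear(64, 1)
)
loss = eikonal_loss(net, torch.rand(256, 3) * 2 - 1)
```

Note the smooth Softplus nonlinearity: losses on derivatives require activations whose gradients are themselves informative, which also motivates SIREN's sines.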
- Generative Radiance Fields for 3D-Aware Image Synthesis (Schwarz et al. 2020)
- pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis (Chan et al. 2020)
- Unconstrained Scene Generation with Locally Conditioned Radiance Fields (DeVries et al. 2021) leverages a hybrid implicit-explicit representation: a classic convolutional GAN generates a 2D floorplan of features, and a 3D neural implicit representation is conditioned on these features. This enables the generation of room-scale 3D scenes.
For 2D image synthesis, neural implicit representations enable the generation of high-resolution images, while also allowing the principled treatment of symmetries such as rotation and translation equivariance.
- Adversarial Generation of Continuous Images (Skorokhodov et al. 2020)
- Learning Continuous Image Representation with Local Implicit Image Function (Chen et al. 2020)
- Image Generators with Conditionally-Independent Pixel Synthesis (Anokhin et al. 2020)
- Alias-Free GAN (Karras et al. 2021)
- Spatially-Adaptive Pixelwise Networks for Fast Image Translation (Shaham et al. 2020) leverages a hybrid implicit-explicit representation for fast high-resolution image2image translation.
- NASA: Neural Articulated Shape Approximation (Deng et al. 2020) represents an articulated object as a composition of local, deformable implicit elements.
- Vincent Sitzmann: Implicit Neural Scene Representations (Scene Representation Networks, MetaSDF, Semantic Segmentation with Implicit Neural Representations, SIREN)
- Andreas Geiger: Neural Implicit Representations for 3D Vision (Occupancy Networks, Texture Fields, Occupancy Flow, Differentiable Volumetric Rendering, GRAF)
- Gerard Pons-Moll: Shape Representations: Parametric Meshes vs Implicit Functions
- Yaron Lipman: Implicit Neural Representations
- awesome-NeRF - List of implicit representations specifically on neural radiance fields (NeRF)
License: MIT