Interesting that Tesla is considering that NeRFs may provide "foundation models for computer vision because they are grounded in geometry and geometry provides a nice way to supervise networks, and frees us of the requirement to define an ontology"
I dream of developing a system to 3D scan industrial equipment like these switchboards, and being able to zoom in close enough to read the ferruled wire numbers. I'm not sure NeRF would be the best way, but it may be fun to explore.
I find NeRFs fascinating as well. instant-ngp by Nvidia is extremely impressive in how fast it trains (30-60 seconds) and in the massive speedups it achieves over older NeRF papers. If you've worked with photogrammetry before, it's easy to recognize the breakthrough in quality and speed this will provide. It's still pretty rough around the edges and I think it needs a few more iterations to be truly useful, but it seems extremely promising. instant-ngp is a little tricky to get set up, but it's pretty fun to play with. I hope the next iteration incorporates SLAM, the pre-processing step that calculates the camera location for each image, as that is by far the slowest part currently and is not included in the 30-60 s training time.
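For anyone curious what makes instant-ngp so fast: its key trick is a multiresolution hash encoding of 3D positions. Here is a drastically simplified pure-Python sketch of the lookup, under my own assumptions (one scalar feature per table entry, nearest-vertex lookup instead of trilinear interpolation, fixed rather than trainable tables; the real implementation is a fused CUDA kernel):

```python
# Toy sketch of a multiresolution hash encoding in the spirit of
# instant-ngp. All function names are mine, not from the repo.
def hash_index(x: int, y: int, z: int, table_size: int) -> int:
    # Spatial hash from the instant-ngp paper: XOR of coordinates
    # multiplied by large primes, modulo the table size.
    primes = (1, 2654435761, 805459861)
    return (x * primes[0] ^ y * primes[1] ^ z * primes[2]) % table_size

def encode(point, tables, base_res=16, growth=1.5):
    """Look up one hashed feature per resolution level and concatenate."""
    features = []
    for level, table in enumerate(tables):
        res = int(base_res * growth ** level)
        # Snap the point to a grid vertex at this level's resolution.
        gx, gy, gz = (int(c * res) for c in point)
        features.append(table[hash_index(gx, gy, gz, len(table))])
    return features

# Example: 4 levels, each a table of 2**14 feature scalars (zeros here;
# in instant-ngp these are trainable parameters).
tables = [[0.0] * (2 ** 14) for _ in range(4)]
print(len(encode((0.3, 0.7, 0.2), tables)))  # one feature per level -> prints 4
```

The trainable tables let a tiny MLP do the rest of the work instead of the big network in the original NeRF, which is a large part of where the speedup comes from.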
Here is a nerf I did of a shelf in my basement. The results aren’t super clean, especially compared to some of the examples I’ve seen, but I think it gives you a pretty realistic expectation of what you currently get from the network.
One thing to watch out for, to temper your expectations: many NeRF video renderings follow the same path as the original camera, which makes the results look better than they would from random viewpoints. The 'floaters' are much less visible when the rendering camera track matches the track of the camera that captured the input images, and matching the tracks is very easy to do in the instant-ngp rendering application. This makes sense: 'floaters' are incorrect positional predictions, and those incorrect predictions must still look correct from the original positions and perspectives of the input images used to train the model.
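To make the "same path" point concrete, here is a toy sketch of how a render path that hugs the capture trajectory can be built by interpolating the recorded camera positions (positions only; a real path would also interpolate rotations, and the function names are my own):

```python
# Build a render path by linearly interpolating the input camera
# positions. Rendering from points on this path tends to hide floaters,
# while viewpoints far off the path expose them.
def interpolate_path(cam_positions, samples_per_segment=4):
    path = []
    for (x0, y0, z0), (x1, y1, z1) in zip(cam_positions, cam_positions[1:]):
        for i in range(samples_per_segment):
            t = i / samples_per_segment
            path.append((x0 + t * (x1 - x0),
                         y0 + t * (y1 - y0),
                         z0 + t * (z1 - z0)))
    path.append(cam_positions[-1])
    return path

# Three capture positions -> 2 segments * 4 samples + endpoint = 9 poses.
captures = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (1.0, 1.0, 2.0)]
print(len(interpolate_path(captures)))  # -> 9
```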
It's a good problem statement. As this paper shows ([2203.04424] SLAM-Supported Self-Training for 6D Object Pose Estimation), self-supervised training from SLAM can help maintain consistency among the pose estimates. The camera pose can be derived from the object pose using the camera extrinsics, and this pseudo camera pose can serve as a decent geometric prior for NeRFs.
iMAP - Implicit Mapping and Positioning in Real-Time - live operation without prior data, building a dense, scene-specific implicit 3D model of occupancy and colour which is also immediately used for tracking
Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes - vanilla NeRF learns a single scene with a neural network from scratch, which takes 2-3 days. SRF learns structure instead of over-fitting on one scene, so it generalizes to new scenes with 10-15 minutes of fine-tuning, achieving significantly sharper, more detailed results than scene-specific models. [Video]
BARF: Bundle-Adjusting Neural Radiance Fields [arXiv] - one limitation of NeRF is its requirement of accurate camera poses to learn the scene representation. BARF enables training NeRF from imperfect (or even unknown) camera poses by jointly learning the neural 3D representation and registering the camera frames [Video].
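A toy 1D analogue of BARF's joint-optimization idea (my own construction, heavily simplified, not from the paper): treat a shift as the unknown "pose" and an amplitude as the unknown "scene", and descend on both at once so the renders match the observations:

```python
# Jointly recover a scene parameter (amplitude) and a "camera pose"
# (a shift) by gradient descent on reconstruction error, instead of
# requiring the pose to be known in advance.
import math

def render(amplitude, pose, x):
    # One "pixel": a Gaussian bump of the given amplitude, shifted by pose.
    return amplitude * math.exp(-(x - pose) ** 2)

xs = [i / 10 for i in range(-20, 21)]
target = [render(2.0, 0.5, x) for x in xs]  # ground truth: amplitude 2, pose 0.5

def loss(a, p):
    return sum((render(a, p, x) - t) ** 2 for x, t in zip(xs, target))

a, p = 1.0, 0.0  # deliberately wrong initial scene and pose
lr, eps = 0.005, 1e-5
for _ in range(2000):
    # Finite-difference gradients for both unknowns, updated jointly.
    ga = (loss(a + eps, p) - loss(a - eps, p)) / (2 * eps)
    gp = (loss(a, p + eps) - loss(a, p - eps)) / (2 * eps)
    a, p = a - lr * ga, p - lr * gp

print(round(a, 2), round(p, 2))  # -> 2.0 0.5
```

BARF's actual contribution is doing this registration stably with a coarse-to-fine positional encoding; this sketch only shows the joint-descent shape of the problem.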
Great to see folks here who are also interested in NeRF!
It has been a pretty active research field recently. One of the impressive use cases is scaling NeRF models up to city street views. (I wrote a blog post summarising the paper, in case you are interested in learning more.)
instant-ngp is promising, though its core is written in CUDA, which is quite unfamiliar to me. (I wonder if it's feasible to port that part to JAX to make the code more accessible without compromising too much of the speed gain.)
Maybe relevant to the discussion here: Self-Calibrating Neural Radiance Fields is a NeRF variant that jointly learns the 3D scene and the camera parameters.
Thanks! I just took pictures with my phone and used the colmap script included in the repo to calculate the camera poses. Calculating the poses is pretty slow with the script, which is why I mentioned SLAM in my post. I've started reading some of the suggestions in the responses to look into faster SLAM.
FYI
Recently found this project - Using custom data - nerfstudio
This is pretty solid and integrated
Worth taking a look if you are interested in the field!
Thanks for sharing… I'm very interested in NeRFs (especially single-shot or few-sample variants)… also just about anything that infers 3D structure from pixels (occupancy networks, etc.).
Great stuff, so you take a bunch of pics with the phone from different angles and use that script to calculate the poses. When you say very slow, how slow do you mean? What hardware and speed are we talking about here? Thank you again for sharing; I will try to experiment with this latest version.
Calculating the camera poses takes roughly 10-100x more time than training the NeRF itself when using the provided colmap script. It's been a while since I did the shelf NeRF, but it was between 50-100 images I believe, and the colmap script took somewhere between 10 minutes and an hour; I got bored waiting, left, and came back, so I don't know exactly how long it took. Training the NeRF took around 1 minute on a 3090.
Great stuff, I have been experimenting with Instant NeRF and it works really well, and pretty fast, with an RTX 3090. Next steps: a few things to get your perspective on.
3D human models from NeRF
I am very interested as well in the creation of full, posable 3D human models from NeRF; see this: EVA3D - Project Page
and many other references. Anything you have seen that is worth exploring in this area, feel free to share it here as well.
I found this paper, Baking Neural Radiance Fields for Real-Time View Synthesis, which post-processes a trained NeRF into a voxel-based representation that can render interactively (once loaded). Quality is pretty good. I have yet to pull the code and do anything with it…
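The reason a baked representation renders fast is that each query becomes a direct grid lookup instead of a network evaluation. A minimal sketch of the trilinear lookup, assuming a toy dense grid of scalars (the paper's actual structure is a sparse grid storing features for view-dependent shading):

```python
# Sample a dense voxel grid (nested lists, grid[i][j][k]) at a continuous
# point in [0, 1]^3 by trilinear interpolation of the 8 surrounding corners.
def trilinear(grid, x, y, z):
    n = len(grid) - 1  # number of cells per axis
    i = min(int(x * n), n - 1)
    j = min(int(y * n), n - 1)
    k = min(int(z * n), n - 1)
    fx, fy, fz = x * n - i, y * n - j, z * n - k
    value = 0.0
    for di in (0, 1):
        for dj in (0, 1):
            for dk in (0, 1):
                # Each corner's weight is the product of 1D blend factors.
                w = ((fx if di else 1 - fx) *
                     (fy if dj else 1 - fy) *
                     (fz if dk else 1 - fz))
                value += w * grid[i + di][j + dj][k + dk]
    return value

# 2x2x2 grid: value 1.0 at one corner, 0.0 elsewhere.
grid = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
print(trilinear(grid, 0.5, 0.5, 0.5))  # centre of the cell -> 0.125
```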
The page includes a YouTube video giving a high-level overview of the paper.
At the bottom of that link there are links to interactive (WebGL-based) viewers of the classic NeRF examples (they take a bit to load, but once loaded they're pretty responsive).
Interesting @johnrobinsn, a voxel representation. I wonder if that could then be imported into Blender, Houdini, etc., because this is one of the issues we have: the marching cubes algorithm we use to export NeRFs to Blender or Houdini produces 3D models that are just not good enough, pretty rough.
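Part of that roughness seems inherent to meshing a NeRF: the density field is fuzzy rather than a hard surface, so the geometry you extract shifts with the iso-threshold you feed to marching cubes. A toy 1D illustration of that sensitivity (my own, not from any of these papers):

```python
# A NeRF "surface" is a soft density ramp, not a step; the extracted
# surface location depends on where you set the iso-threshold.
def surface_crossing(density, threshold):
    """Return the first sample index where density crosses the threshold."""
    for i, d in enumerate(density):
        if d >= threshold:
            return i
    return None

# Density samples along a ray near a NeRF surface (made-up values).
density = [0.0, 0.1, 0.3, 0.6, 1.0, 2.0, 5.0]

print(surface_crossing(density, 0.5))  # -> 3
print(surface_crossing(density, 2.0))  # -> 5 (same ray, different surface)
```

Two reasonable thresholds put the surface two samples apart, which in 3D shows up as the lumpy, inflated meshes we get out of marching cubes.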
Yes @johnrobinsn, they use Google Imagen I think; there is a PyTorch implementation that uses Stable Diffusion:
however, they indicate:
" This project is a work-in-progress , and contains lots of differences from the paper. Also, many features are still not implemented now. The current generation quality cannot match the results from the original paper, and many prompts still fail badly!"
So this is a great direction; we gotta keep an eye on any combo of NeRFs + guided diffusion.
I'm also very interested in the latest research on using NeRF to create 3D models of human figures that you can then pose and animate. This is already being done, but it's recent research and I haven't found code or systems we can test yet.