Neural Radiance Fields

Interesting that Tesla is suggesting that NeRFs may provide “foundation models for computer vision because they are grounded in geometry and geometry provides a nice way to supervise networks, and frees us of the requirement to define an ontology”.

I dream of developing a system to 3D-scan industrial equipment like these switchboards, and being able to zoom in close enough to read the ferruled wire numbers. I’m not sure NeRF would be the best way, but it may be fun to explore.


I find NeRFs fascinating as well. instant-ngp by Nvidia is extremely impressive in how fast it trains (30–60 seconds) and in the massive speedups it achieves over older NeRF papers. If you’ve worked with photogrammetry before, it’s easy to recognize the breakthrough in quality and speed this will provide. It’s still pretty rough around the edges and I think it needs a few more iterations to be truly useful, but it seems extremely promising. instant-ngp is a little tricky to get set up, but it’s pretty fun to play with. I hope the next iteration incorporates SLAM, the pre-processing step that calculates the camera location for each image, as that is by far the slowest part currently and is not included in the 30–60 s training time.
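For anyone curious where the speedup comes from: the core trick in the instant-ngp paper is a multiresolution hash encoding, where each query point is hashed into small tables of trainable features instead of being pushed through one big MLP. A rough numpy sketch of the spatial hash (the primes are the ones given in the paper; everything else here is my own illustrative scaffolding, not Nvidia’s actual code):

```python
import numpy as np

# Spatial hash from the instant-ngp paper: XOR the integer voxel
# coordinates scaled by large primes, then modulo the table size T.
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)

def hash_coords(ijk: np.ndarray, table_size: int) -> np.ndarray:
    """Map integer 3D grid coordinates of shape (N, 3) to hash-table slots."""
    ijk = ijk.astype(np.uint64)
    h = ijk[:, 0] * PRIMES[0]
    h ^= ijk[:, 1] * PRIMES[1]
    h ^= ijk[:, 2] * PRIMES[2]
    return h % np.uint64(table_size)

# Each resolution level has its own table of trainable feature vectors;
# a query point is hashed at every level and the looked-up features are
# concatenated and fed to a tiny MLP.
T = 2 ** 14
corners = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [127, 64, 3]])
slots = hash_coords(corners, T)
```

Because collisions are resolved implicitly by gradient descent (colliding points just share a feature vector), the tables can stay small, which is a big part of why training fits in under a minute.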

Here is a NeRF I did of a shelf in my basement. The results aren’t super clean, especially compared to some of the examples I’ve seen, but I think it gives you a pretty realistic expectation of what you currently get from the network.

[animated GIF: NeRF render of the shelf]

Full Quality Video Rendering:

Jonathan Stephens posts quite a few impressive NeRFs on Twitter.

One thing to watch out for, to temper your expectations, is that many NeRF video renderings follow the same path as the original camera, which makes the results look better than they would if random viewpoints were shown. The ‘floaters’ are much less visible when the rendering camera track matches the track of the camera that captured the input images, and that is very easy to set up in the instant-ngp rendering application. This makes sense: ‘floaters’ are incorrect positional predictions, and those incorrect predictions must still make sense visually from the original positions and perspectives of the input images used to train the model.


It’s a good problem statement. As this paper shows ([2203.04424] SLAM-Supported Self-Training for 6D Object Pose Estimation), self-supervised training from SLAM can help achieve consistency among the pose estimates. The camera pose can be derived from the object pose using the camera extrinsics, and this pseudo camera pose can be used as a decent geometric prior for NeRFs.
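If I understand the idea, deriving the camera pose is just composing rigid transforms: the detector gives the object’s pose in the camera frame, and if the object’s pose in the world is known, inverting one side yields the camera-to-world pose. A minimal sketch with 4x4 homogeneous transforms (the frame names and numbers are hypothetical, not from the paper):

```python
import numpy as np

def make_pose(R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Build a 4x4 homogeneous transform from rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Hypothetical example: the detector sees the object 2 m straight ahead
# of the camera (T_cam_obj), and we know the object sits at the world
# origin (T_world_obj = identity).
T_cam_obj = make_pose(np.eye(3), np.array([0.0, 0.0, 2.0]))
T_world_obj = np.eye(4)

# Chain the transforms: T_world_cam = T_world_obj @ inv(T_cam_obj)
T_world_cam = T_world_obj @ np.linalg.inv(T_cam_obj)

camera_position = T_world_cam[:3, 3]  # camera sits 2 m behind the object
```

Each such pseudo pose could then seed a NeRF’s camera parameters instead of (or before) running COLMAP.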


Good idea. Looking around I found…

within which were a few I liked…


Great to see folks here who are also interested in NeRF!
It has been a pretty active research field recently. One of the impressive use cases is scaling NeRF models up to city street views. (I wrote a blog post summarising the paper, in case you are interested in learning more.)

instant-ngp is promising, though its core is written in CUDA, which is quite unfamiliar to me. (I wonder if it’s feasible to port that part to JAX to make the code more accessible without compromising too much of the speed gain.)
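The hash grid and the fused MLP are the CUDA-heavy parts, but the volume-rendering compositing step itself is simple enough to sketch. Here it is in numpy; jax.numpy mirrors this API closely, so a JAX port of this piece would look almost identical. (This is the generic NeRF rendering equation, not instant-ngp’s actual kernel.)

```python
import numpy as np

def composite(densities, colors, deltas):
    """Classic NeRF volume rendering along one ray.

    densities: (S,) non-negative sigma at each sample
    colors:    (S, 3) RGB at each sample
    deltas:    (S,) distance between consecutive samples
    """
    alpha = 1.0 - np.exp(-densities * deltas)        # per-sample opacity
    # Transmittance: probability the ray reaches sample i unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = alpha * trans
    rgb = (weights[:, None] * colors).sum(axis=0)    # expected color
    return rgb, weights

# One nearly opaque red sample: the ray should come out red.
rgb, w = composite(np.array([1e3]), np.array([[1.0, 0.0, 0.0]]), np.array([0.1]))
```

Since it is all elementwise ops plus a cumulative product, this part would JIT-compile well under `jax.jit`; the hard part of a port would be matching the performance of the hand-written hash-grid kernels.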

Maybe relevant to the discussion here: Self-Calibrating Neural Radiance Fields is a NeRF variant which jointly learns the 3D scene and the camera parameters.


This is a nice experiment! For your training dataset, how did you collect the images and their camera poses?


Thanks! I just took pictures with my phone and used the colmap script included in the repo to calculate the camera poses. Calculating the poses is pretty slow with the script, which is why I was talking about SLAM in my post. I’ve started to read some of the suggestions in the responses to look into faster SLAM.
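For anyone trying the same workflow: as I recall, the colmap script in the instant-ngp repo writes a transforms.json whose frames each hold a camera-to-world `transform_matrix` per image (treat the exact field names as an assumption from memory). A quick sketch to pull the camera positions back out and sanity-check that COLMAP produced a sensible orbit:

```python
import json
import numpy as np

def camera_positions(transforms_path: str) -> np.ndarray:
    """Extract camera centers from an instant-ngp-style transforms.json.

    Each frame stores a 4x4 camera-to-world matrix, so the camera
    position is the translation column of that matrix.
    """
    with open(transforms_path) as f:
        meta = json.load(f)
    mats = [np.array(frame["transform_matrix"]) for frame in meta["frames"]]
    return np.stack([m[:3, 3] for m in mats])
```

Plotting the returned positions is an easy way to spot individual images whose pose estimation failed badly before spending time training.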


Recently found this project: Using custom data - nerfstudio
It’s pretty solid and well integrated.
Worth taking a look if you are interested in the field!


Thanks for sharing! I’m very interested in NeRFs (especially single-shot or few-sample variants), and also in just about anything that infers 3D structure from pixels (occupancy networks, etc.).

This is fantastic, Mat, I love it. This would be very useful for a project I’m preparing where the imperfections would actually add a positive feeling to the result. I assume you trained that one with the code at GitHub - NVlabs/instant-ngp: Instant neural graphics primitives: lightning fast NeRF and more? Thank you for sharing :slight_smile:


Great stuff! So you take a bunch of pics with the phone from different angles and use that script to calculate the poses. When you say very slow, how slow do you mean? What hardware and speed are we talking about here? Thank you again for sharing; I will try to experiment with this latest version.




Calculating the camera poses takes roughly 10–100x longer than training the NeRF itself when using the provided colmap script. It’s been a while since I did the shelf NeRF, but it was between 50 and 100 images I believe, and the colmap script took somewhere between 10 minutes and an hour. I got bored waiting, left, and came back, so I don’t know exactly how long it took. Training the NeRF took around 1 minute on a 3090.


I found this a nice NeRF backgrounder; it starts from a more basic level than anything I had seen before.


Great stuff! I have been experimenting with Instant NeRF and it works really well, and pretty fast with an RTX 3090. Next steps: a few things to get your perspective on:



This is my experience too. Definitely something that needs to be improved before NeRF can be used to create 3D assets.


I found this paper, Baking Neural Radiance Fields for Real-Time View Synthesis, which post-processes a trained NeRF into a voxel-based representation that can render interactively (once loaded). Quality is pretty good. I have yet to pull the code and do anything with it.
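The reason a baked representation can render interactively is that each ray sample becomes a plain grid lookup instead of an MLP evaluation. A toy sketch of the trilinear interpolation such a lookup boils down to (my own illustrative code, not from the paper):

```python
import numpy as np

def trilinear(grid: np.ndarray, p: np.ndarray) -> float:
    """Trilinearly interpolate a scalar voxel grid at point p (grid coords)."""
    i0 = np.floor(p).astype(int)
    f = p - i0  # fractional position inside the cell
    val = 0.0
    # Blend the 8 corners of the cell, weighted by distance to each corner.
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((f[0] if dx else 1 - f[0]) *
                     (f[1] if dy else 1 - f[1]) *
                     (f[2] if dz else 1 - f[2]))
                val += w * grid[i0[0] + dx, i0[1] + dy, i0[2] + dz]
    return val
```

In the real system the grid holds density plus view-dependent features per voxel, and a GPU does this interpolation in hardware, which is why it runs in a browser once the grid is downloaded.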

The project page includes a YouTube video giving a high-level overview of the paper.

At the bottom of that page there are links to interactive (WebGL-based) viewers of the classic NeRF examples. They take a bit to load, but once loaded they are pretty responsive.


Another interesting paper: training NeRFs from diffusion models.

Project page with paper link…

Interactive WebGL viewers at the bottom…


Interesting, @johnrobinsn: a voxel representation. I wonder if that could then be imported into Blender, Houdini, etc., because this is one of the issues we have; the marching cubes algorithm we use to export NeRFs to Blender or Houdini produces 3D models that are just not good enough, pretty rough.

1 Like

Yes @johnrobinsn, they use Google Imagen I think; there is a PyTorch implementation that uses Stable Diffusion:

However, they note:
“This project is a work-in-progress, and contains lots of differences from the paper. Also, many features are still not implemented now. The current generation quality cannot match the results from the original paper, and many prompts still fail badly!”

So this is a great direction; we’ve got to keep an eye on any combo of NeRFs + guided diffusion.

I’m also very interested in the latest research on using NeRFs to create 3D models of human figures that you can then pose and animate. This is already being done, but it’s recent research and I haven’t found code or systems that we can test yet.
