What happens when you train a model purely on synthetic images?


I ran an interesting experiment: can you train a CNN purely on synthetic images and then use that model to classify real images? For this I created a synthetic data set of 100,000 dice images using Blender and Unity3D. Here’s the video: https://youtu.be/EE_DX_kPd3E

Enjoy, Christian

That is really cool, thanks for sharing. For someone who knows nothing about Blender and Unity, how much work is involved in creating these synthetic images?

Good question. So far, each stage of the pipeline has been handled by specialists, and democratizing it means making the tools easier for non-specialists to use. For ML experts, 3D modeling in Blender is probably the hardest part, but there are tons of free models available on the internet, so you might not need to get into 3D modeling at all. Here’s a playlist from Adam Kelly on the basics of Blender for AI experts: https://www.youtube.com/watch?v=xeprI8hJAH8&list=PLq7npTWbkgVAt4cnrsEzouM6kDKySnkRI
You can also do all the image generation in Blender without using Unity. The advantage is that you only need to learn one tool, and Blender uses Python for scripting.
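To make the scripting part more concrete, here is a minimal sketch of how such a generation pipeline typically randomizes each rendered scene. It is plain Python only (no Blender dependency), and all parameter names and value ranges are illustrative assumptions, not the ones used in the dice project; in a real pipeline you would apply each sampled dict to the scene via Blender’s bpy API before rendering.

```python
import math
import random

def sample_scene_params(n_faces=6, seed=None):
    """Sample one randomized scene configuration for a single die render.

    Returns a plain dict of parameters. In an actual Blender script these
    values would be written to object transforms, the camera, and the
    lights via bpy before calling the renderer. All ranges are assumptions.
    """
    rng = random.Random(seed)
    return {
        # ground-truth label: which face is up (1..n_faces)
        "label": rng.randint(1, n_faces),
        # random rotation of the die around the vertical axis (radians)
        "yaw": rng.uniform(0.0, 2.0 * math.pi),
        # small random tilt so the die is never perfectly axis-aligned
        "tilt": rng.uniform(-0.1, 0.1),
        # camera distance and height jitter
        "cam_distance": rng.uniform(4.0, 8.0),
        "cam_height": rng.uniform(1.0, 3.0),
        # lighting variation (e.g. lamp energy in Blender units)
        "light_energy": rng.uniform(500.0, 2000.0),
        # random background colour, RGB in [0, 1]
        "bg_color": tuple(rng.random() for _ in range(3)),
    }

# Generate a small batch of scene configs; a real data set would use
# tens of thousands of these, one render (plus label) per config.
configs = [sample_scene_params(seed=i) for i in range(10)]
print(len(configs), configs[0]["label"])
```

The point of randomizing pose, camera, lighting, and background like this (often called domain randomization) is to force the CNN to rely on the die itself rather than on rendering artifacts, which is what lets a model trained purely on synthetic images generalize to real photos.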
Unity is much more user-friendly than Blender, and creating scenes is much easier. There are tons of good tutorials and courses available, but the perception framework is still at an early stage, and there is not much material on it yet. This will change over time. For reinforcement learning, Unity has already become one of the standard environments, so they have managed to bridge that gap. You can therefore expect entry-level training material aimed at AI experts in the future to help with getting into Unity.
I’m working on a video that explains in more depth what I did for the dice example. That might give you a better impression of the work involved.