Tutorial: End-to-End Object Detection for Unity With IceVision and OpenVINO

cjmills · August 11, 2022, 10:35pm

This tutorial series covers training an object detector using the IceVision library and implementing the trained model in a Unity game engine project using OpenVINO.

The tutorial uses a downscaled subsample of HaGRID (HAnd Gesture Recognition Image Dataset). HaGRID contains 552,992 annotated sample images for 18 distinct hand gestures and an additional no_gesture class to account for idle hands.

The original dataset is a massive 716 GB, so I made smaller versions available on Kaggle for those following the tutorial. Both are small enough to fit in the free GPU tiers of Google Colab and Kaggle Notebooks, although you will probably need to lower the image and batch size to run the larger of the two versions on the free tiers.
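
As a rough illustration of why lowering the image and batch size helps on the free tiers, the sketch below compares the number of input pixels per batch against a baseline. The baseline values (384 for image size, 8 for batch size) and the function name are assumptions for illustration, not settings taken from the training notebooks, and real GPU memory use also depends on the model and optimizer.

```python
def relative_input_cost(image_size, batch_size,
                        base_image_size=384, base_batch_size=8):
    """Return input pixels per batch relative to a baseline configuration.

    The baseline defaults are hypothetical; substitute the values you
    actually train with.
    """
    pixels = batch_size * image_size * image_size
    base_pixels = base_batch_size * base_image_size * base_image_size
    return pixels / base_pixels


if __name__ == "__main__":
    # Compare a few candidate settings against the assumed baseline.
    for size, bs in [(384, 8), (384, 4), (256, 8), (256, 4)]:
        ratio = relative_input_cost(size, bs)
        print(f"image_size={size}, batch_size={bs}: {ratio:.2f}x baseline")
```

Halving the batch size halves the input per step, while shrinking the image side length reduces it quadratically, so a modest drop in resolution often frees more memory than a similar drop in batch size.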

Demo Video

Blog Posts

Part 1: covers training and exporting the model
Part 2: covers creating a dynamic link library (DLL) file in Visual Studio to perform inference with OpenVINO
Part 3: covers performing object detection in a Unity project with OpenVINO

Kaggle Datasets

HaGRID Sample 30k 384p: contains 31,833 images from HaGRID (HAnd Gesture Recognition Image Dataset) downscaled to 384p
HaGRID Sample 120k 384p: contains 127,331 images from HaGRID (HAnd Gesture Recognition Image Dataset) downscaled to 384p

Training Notebooks

GitHub Repository

Previous Tutorial

Fastai to Unity Beginner Tutorial: This tutorial covers how to train an image classifier using the fastai library and implement it in a Unity game engine project using the Barracuda inference library.

cjmills · August 19, 2022, 8:49pm

I had someone with an AMD Ryzen CPU test the OpenVINO project, and it seems like OpenVINO only works for Intel hardware.

Since not everyone has an Intel CPU, and discrete Intel GPUs are not yet widely available, I made a follow-up tutorial showing how to use ONNX Runtime and DirectML instead of OpenVINO.

If you don’t know, DirectML is a hardware-accelerated DirectX 12 library for machine learning on Windows. It should work with basically any modern GPU.

This tutorial uses the ONNX models already generated in the training code from the original tutorial.
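
The follow-up tutorial itself builds a DLL in Visual Studio, but the idea of preferring DirectML and falling back to the CPU can be sketched in a few lines of Python. This is a minimal sketch, not the tutorial's code: the helper name pick_providers is hypothetical, and it assumes the DirectML-enabled ONNX Runtime package is installed if you want the DirectML provider to show up as available.

```python
# Execution provider names as used by ONNX Runtime, in order of preference.
PREFERRED_PROVIDERS = ["DmlExecutionProvider", "CPUExecutionProvider"]


def pick_providers(available, preferred=PREFERRED_PROVIDERS):
    """Return the preferred providers that are available, in preference order.

    `pick_providers` is a hypothetical helper written for this example.
    """
    chosen = [p for p in preferred if p in available]
    if not chosen:
        raise RuntimeError(f"None of {preferred} found in {available}")
    return chosen


if __name__ == "__main__":
    try:
        import onnxruntime as ort
        available = ort.get_available_providers()
    except ImportError:
        # Assumption for the demo: pretend only the CPU provider exists.
        available = ["CPUExecutionProvider"]
    print(pick_providers(available))
```

The resulting list could then be passed as the providers argument when creating an onnxruntime.InferenceSession for one of the exported ONNX models.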

Blog Posts

Part 1: covers creating a dynamic link library (DLL) file in Visual Studio to perform inference with ONNX Runtime and DirectML
Part 2: covers performing object detection in a Unity project with ONNX Runtime and DirectML

GitHub Repository

Daniel · August 20, 2022, 3:52am

Fastai with Unity sounds very interesting!

I used Unity a little a long time ago. I wonder if you could make a video that recreates the hand gesture recognition demo from the very first step of downloading Unity, to show us that this is easily doable.

This would help more people get started with both Unity and fastai.

Thanks

cjmills · August 23, 2022, 4:42pm

Hi Daniel,
I have only done written content up to this point, so I would need to investigate the best approach for making a video tutorial. At the very least, the microphone integrated into my webcam is not ideal. I’ll look into it, though.

Daniel · August 24, 2022, 2:20am

Thanks a lot! I am very much looking forward to the future videos!

The first ones don’t have to be perfect. Simply recording every step without voice could still make great videos for beginners.

cjmills · August 24, 2022, 2:35am

That I can do.