In terms of models, I did a competition once with YOLO (darknet), but I did not train it with fastai. That being said, if you want to go that route, I have inference scripts in our repository
For training in v1, I have a repo and notebook here you can use:
Man you are everywhere! Thanks for the thoughtful reply and info.
How did you train the yolo model?
We’re looking at YOLO because it’s small and fast … and as this will be running on a Pi or Nano, I think that will be important. Currently we only have one object we’re attempting to detect … a yellow ball, of which there will be dozens on the game field.
Also, what yolo model would you recommend given our task and device constraints?
(Model is 4mb). I wound up setting it up on a p3 instance and training for what the recommended number of epochs was (I think 11,000), but for that simple of a problem a few thousand may be all that’s needed.
Do you have an example of what your training set looks like? We’re trying to understand what x and y refer to and how to structure the dataset for training.
@wgpubs (sorry this is taking me a moment, it’s been about a year or so since we did the competition; it’s all coming back). The exact version we trained with was:
The readme goes into a great deal of detail on building your custom datasets
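For context, here's a sketch of the standard darknet label format (this is the common convention, not anything specific to that repo's readme): one `.txt` file per image, one line per object, giving the class index followed by the box center and size, all normalized to [0, 1] by the image dimensions. A small converter from pixel coordinates:

```python
def to_yolo_line(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel-space bounding box to a darknet YOLO label line.

    Darknet expects: <class> <x_center> <y_center> <width> <height>,
    with all four coordinates normalized by the image width/height.
    """
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    box_w = (xmax - xmin) / img_w
    box_h = (ymax - ymin) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {box_w:.6f} {box_h:.6f}"

# Example: a yellow ball (class 0) at pixels (100,100)-(200,200) in a 640x480 frame
line = to_yolo_line(0, 100, 100, 200, 200, 640, 480)
```

For a single-class problem like your yellow ball, every line would start with class `0`.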
Full YOLO in real time at a non-trivial frame rate is difficult on a Pi and even a Nano. How many fps are you expecting? I haven’t done it myself, but I’ve seen 20 fps with YOLO-tiny.
Creating your own high quality training sample is hard work but ultimately pays off.
You need to be efficient, or possibly consider e.g. Amazon Mechanical Turk (I haven’t tried the latter, but came close).
Have an initial fast process for review, including crop or discard. Consider running images through a pretrained model or other heuristics to weed out inappropriate ones.
I use labelImg ( https://github.com/tzutalin/labelImg ) and save in VOC XML format. (I wasn’t initially using YOLO, but I stuck with VOC XML after I swapped over.) Learn the hotkeys, use autosave, and get fast at it. Quality does matter, but pixel-perfect bounding boxes aren’t super important IME.
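If it helps: labelImg's VOC XML output can be read with nothing but the standard library. A minimal sketch (assuming the usual Pascal VOC layout; the sample annotation below is made up for illustration):

```python
import xml.etree.ElementTree as ET

def parse_voc(xml_string):
    """Parse a Pascal VOC annotation (as written by labelImg) into
    ((width, height), [(name, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_string)
    size = root.find("size")
    img_w = int(size.find("width").text)
    img_h = int(size.find("height").text)
    boxes = []
    for obj in root.findall("object"):
        name = obj.find("name").text
        bb = obj.find("bndbox")
        boxes.append((
            name,
            int(bb.find("xmin").text),
            int(bb.find("ymin").text),
            int(bb.find("xmax").text),
            int(bb.find("ymax").text),
        ))
    return (img_w, img_h), boxes

# Hypothetical single-object annotation for demonstration
SAMPLE_XML = """
<annotation>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>ball</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>200</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>
"""
```

From there it's one small step to emit darknet-style `.txt` labels if you go the YOLO route.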
We’d love to get at least 60fps if not more. Do you think that would be possible if we capture images with a small resolution? Something like 640x480? We’re using a raspberry pi camera v2 module.
I haven’t done much real-time video testing, sorry, so I’m not sure of the best optimisations. However, I don’t think 60 fps is remotely feasible even at 640x480 on a Nano (which is, after all, quite a cheap device that costs about as much as a second-hand GTX 980). I’d guess 15 fps tops, perhaps more with YOLO-tiny.
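When you do get to benchmarking, it's worth measuring throughput directly on the target device rather than guessing. A minimal, camera-independent sketch of a rolling fps counter (the names here are hypothetical, not from any particular library):

```python
import time

class FpsCounter:
    """Rolling frames-per-second estimate over the last `window` frames."""

    def __init__(self, window=30):
        self.window = window
        self.timestamps = []

    def tick(self, now=None):
        """Record one processed frame; return the current fps estimate,
        or None until at least two frames have been seen.

        `now` can be injected for testing; by default the wall clock is used.
        """
        if now is None:
            now = time.perf_counter()
        self.timestamps.append(now)
        if len(self.timestamps) > self.window:
            self.timestamps.pop(0)
        if len(self.timestamps) < 2:
            return None
        elapsed = self.timestamps[-1] - self.timestamps[0]
        return (len(self.timestamps) - 1) / elapsed
```

In a capture loop you'd call `tick()` once per processed frame and log the estimate; that tells you quickly whether a given model/resolution combination is anywhere near your target.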
Is there a particular reason to use a Nano (e.g. size/power consumption)? Bang-for-buck wise, you might be better off with a second-hand gaming laptop or a mini-ITX build (I have an old mini-ITX Z97/4590/GTX 960/Corsair Vengeance setup that I hope to use for my project; despite its age, I think it would still blow a Nano out of the water just going by raw specs).
Yah this is going to be running on a FRC (First Robotics Competition) robot. There are both size and power constraints for everything the team is attempting to do. If you’re desperate for reading material, you can get an idea of what we’re working with here.
Haha cool, I don’t have time to read now but might take a look. I suggest making a list of the basic constraints (power, space, minimum tolerable fps for your application, desired accuracy), how much time you can devote to this, and what your coding ability is (which determines whether you can do some custom optimisations, etc.).