I’ve been through Part 1 of the fastai course in its entirety, and I’m re-watching it all again now.
To aid my learning (and hopefully help somebody out), I’m trying to solve a problem for a friend using fastai, but wanted to get a few pointers from the more experienced folk here as to how to go about it.
They work in academia and take photographs of a patch of ground. They need to log the presence and frequency of a particular single entity (say “Mushrooms in this patch”).
I wasn’t sure how best to approach this. To my inexperienced mind it’s not an Image Segmentation problem (à la CamVid), since they’re not concerned about the placement of the objects, and there’s no exhaustive list of potential labels. It also doesn’t seem like an Image Classification problem, because the answer isn’t a binary “Mushroom detected” or “No mushroom detected”; rather it’s “No mushroom detected” or “X mushrooms detected”.
Therefore I was wondering how to frame the problem, and which architecture might suit it best.
I did search on the fastai forums before asking this question, but this was the closest thing I came across, which doesn’t quite help.
Any degree of feedback or guidance would be greatly appreciated!
You could do image regression, where the target is the number of instances (x) rather than point coordinates.
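If the model is trained as a regressor with the count as a float target, its raw outputs still need turning back into whole, non-negative counts. Here is a minimal sketch of that step, assuming the predictions arrive as a plain array of floats. The names `predictions_to_counts` and `mean_absolute_error` are made up for illustration and are not part of fastai.

```python
import numpy as np

def predictions_to_counts(raw_preds):
    """Round raw regression outputs to whole counts and clamp negatives to zero."""
    preds = np.asarray(raw_preds, dtype=float)
    # A regressor can output values like -0.3 or 2.6; counts must be integers >= 0.
    return np.clip(np.rint(preds), 0, None).astype(int)

def mean_absolute_error(pred_counts, true_counts):
    """Average number of objects the predictions are off by, per image."""
    pred_counts = np.asarray(pred_counts, dtype=float)
    true_counts = np.asarray(true_counts, dtype=float)
    return float(np.mean(np.abs(pred_counts - true_counts)))

# Hypothetical raw model outputs for four images, with hand-counted labels.
raw = [-0.3, 0.2, 2.6, 7.4]
labels = [0, 1, 3, 6]
counts = predictions_to_counts(raw)
print(counts, mean_absolute_error(counts, labels))
```

Mean absolute error is a handy metric here because it reads directly as "off by this many mushrooms per image on average".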
@muellerzr Thank you! I’ll take a look into this immediately.
I would rather detect them and then count them. How many occurrences are we talking about for each image?
@tcapelle Interesting, okay! I will find that out and get back to you asap!
I’m also wondering whether it’s possible to use transfer learning here?
i.e., train a NN to identify “Mushroom or not”, then apply it to the new context of counting their appearances (if any) in an image?
You certainly could. However you may find training such a model difficult. One way to do so is to use a sigmoid function (like multi-label) which will tell you.
Sorry, one more question, and it’s a bit more meta than the specifics of my original question:
How would you go about searching for whether there already exists a pre-trained model that could work vs. choosing to train from scratch?
Do you simply use Google search operators, or is there a go-to place for this?
I usually assume I have to train my own unless it’s a well-ish known Kaggle dataset or something
Any reason I shouldn’t use a CNN, do you think? Josh Varty had a lot of success with one: https://github.com/JoshVarty/ImageClassification/blob/master/3_CountingAgain.ipynb
I don’t see any reason. That’s what I was recommending you use
In my opinion you want to create two models:
- One for classifying if mushrooms exist in a picture or not.
- Second to count the number of mushrooms.
For the latter the approach from the link you have posted seems good!
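Roughly, the two models could be chained like this. It is only a sketch: `presence_model` and `count_model` are hypothetical callables standing in for whatever trained models you end up with (the first returning a probability, the second a float count), and the 0.5 threshold is an assumption you would tune.

```python
from typing import Any, Callable

def two_stage_count(
    image: Any,
    presence_model: Callable[[Any], float],  # hypothetical: P(at least one mushroom)
    count_model: Callable[[Any], float],     # hypothetical: estimated count as a float
    threshold: float = 0.5,                  # assumed cut-off, tune on validation data
) -> int:
    """Return 0 if the classifier sees nothing, otherwise the rounded count."""
    if float(presence_model(image)) < threshold:
        return 0
    # The classifier said "present", so report at least one object.
    return max(1, int(round(float(count_model(image)))))

# Stand-in models so the sketch runs without any trained weights.
print(two_stage_count("photo.jpg", lambda img: 0.1, lambda img: 4.2))
print(two_stage_count("photo.jpg", lambda img: 0.9, lambda img: 4.2))
```

The upside of the gate is that empty patches never reach the counting model; the downside is that a missed detection in stage one always yields a count of zero.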
Sorry, I was confused. I forgot that a U-NET is a type of CNN. My bad.
@chatuur Thank you! Apologies if this question betrays a total lack of understanding, but:
Is it possible to do it all with one model?
- I segment all of the images, so that the objects (let’s say mushrooms) are highlighted
- I train a model to identify them – A U-NET (with an ImageNet-pretrained resnet34 encoder)
- I somehow(?) count the number of segments for a given image, giving me the total, i.e., as some kind of post-processing step in code rather than with a neural network.
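For the counting step, one common option is connected-component labelling on the predicted mask. Below is a minimal sketch using `scipy.ndimage.label`, assuming the segmentation output has already been turned into a binary mask (1 = mushroom pixel, 0 = background). The helper name `count_objects` and the `min_area` filter are illustrative assumptions, not something from the thread.

```python
import numpy as np
from scipy import ndimage

def count_objects(mask, min_area=1):
    """Count connected blobs of foreground pixels in a binary mask.

    Blobs smaller than `min_area` pixels are ignored as likely noise.
    """
    mask = np.asarray(mask).astype(bool)
    # label() gives every connected blob its own integer id (1..n);
    # by default in 2D, pixels connect through edges, not diagonals.
    labeled, n_blobs = ndimage.label(mask)
    if n_blobs == 0:
        return 0
    # Pixel count per blob; index 0 is background, so drop it.
    areas = np.bincount(labeled.ravel())[1:]
    return int(np.sum(areas >= min_area))

# Toy mask with three separate blobs of 4, 2 and 1 pixels.
toy_mask = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0],
])
print(count_objects(toy_mask))
print(count_objects(toy_mask, min_area=2))
```

One caveat: objects that touch or overlap in the mask merge into a single blob, so this will undercount in crowded images.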