New Training Dataset: Deadly Mushrooms

OlgaUW · October 17, 2019, 1:49pm

Hi All,

Many mushrooms in the genus Amanita, such as the Death Cap (Amanita phalloides), are extremely poisonous. The toxin works slowly, destroying the liver over 24-48 hours, and at that point usually only an urgent liver transplant can save the person. Unfortunately, mushroom picking is still a very popular activity in a lot of countries, and novices can easily confuse the Death Cap with a number of edible mushrooms.

I tried to train my resnet34 model on Google Images results for some of the most “confusable” mushrooms. My error rate is still pretty bad at >30%. The learning rate plot looks pretty flat, and past 1e-3 it shoots up. I am guessing there are a few reasons for it:

Too much garbage in images
Mushrooms REALLY look alike
Need to identify parts of mushrooms, groups of mushrooms, etc.

Is there hope of getting the error rate down to 1-5%?

bwarner · October 17, 2019, 5:14pm

Hi @OlgaUW, welcome to the fastai community.

Without seeing your code or images, there are several potential issues I can think of:

Not enough images per class
Images have a high label error rate
Images fed to the model are too small to distinguish defining features between species
ResNet34 probably isn’t the best model architecture for this problem
A different set of data augmentations might help the model generalize better
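
The first point is easy to check before touching the model. Here is a minimal sketch, assuming the downloaded images are stored with one folder per class (a hypothetical layout, since the original post does not say how they are organized). The function names and the `minimum` parameter are illustrative and not part of fastai.

```python
from collections import Counter
from pathlib import Path

# File extensions treated as images; adjust to match the actual downloads.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def images_per_class(root):
    """Count image files in each immediate subfolder of `root`.

    Assumes one subfolder per class, e.g. root/death_cap/*.jpg.
    """
    counts = Counter()
    for class_dir in Path(root).iterdir():
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def under_represented(counts, minimum):
    """Return the class names with fewer than `minimum` images, sorted."""
    return sorted(name for name, n in counts.items() if n < minimum)
```

Calling `images_per_class("data/mushrooms")` on such a folder (the path is a placeholder) returns a `Counter` keyed by class name, and `under_represented(counts, minimum)` lists the classes that fall below whatever cutoff you pick.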

A quick Google search shows there is a mushroom dataset available from the Natural History Museum of Denmark. I’d try using that dataset as a base.

Pomo · October 17, 2019, 6:16pm

Hi Olga and welcome,

As a long-time mushroom picker, I smiled at your post. Most educated amateurs completely avoid the entire Amanita and Galerina genera because it is difficult to precisely key out the species, and any error could have huge consequences.

To positively identify a mushroom from a photo is likely impossible. In practice, you would pick it up and look from various angles to see the gills, cap, base, and stipe up close in their forms and colors. You might feel for oiliness or brittleness. You might smell it, or taste a bit. You’d look at the season and habitat. You would assess how confident you are about those observations. All those factors are integrated to come up with a likely identification. And if the factors for that ID are definitively far from anything deadly poisonous, you might choose to eat it.

So my guess is that 1-5% error rate classifying confusable mushrooms from a single photo is not attainable. And identifying deadly poisonous ones will not be safe unless the error rate is .000000001%. My suggestion is to start with an easier task at the genus level, because these broad categories are easier to distinguish visually. For example, “might possibly be Amanita” vs. “definitely not Amanita”. If any Amanitas at all fall into “definitely not”, you will have some fine tuning to do.
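
To make the “might possibly be Amanita” vs. “definitely not Amanita” check concrete, here is a rough sketch in plain Python. It assumes you already have, for each validation photo, a model-predicted probability that it shows an Amanita, plus a true label. The function names and the toy numbers are made up for illustration. The idea is to set the cutoff at the lowest score any true Amanita received, so that none of them land in “definitely not”, and then see how many non-Amanitas can still be cleared.

```python
def pick_safe_threshold(amanita_probs, is_amanita):
    """Largest cutoff that keeps every true Amanita at or above it."""
    amanita_scores = [p for p, y in zip(amanita_probs, is_amanita) if y]
    if not amanita_scores:
        raise ValueError("need at least one Amanita example")
    return min(amanita_scores)


def triage(amanita_probs, threshold):
    """Label each photo using the two broad categories from the post."""
    return [
        "might possibly be Amanita" if p >= threshold else "definitely not Amanita"
        for p in amanita_probs
    ]


def cleared_fraction(amanita_probs, is_amanita, threshold):
    """Fraction of non-Amanita photos that end up in 'definitely not Amanita'."""
    others = [p for p, y in zip(amanita_probs, is_amanita) if not y]
    return sum(p < threshold for p in others) / len(others)


# Toy validation scores (invented numbers, not real results).
probs = [0.95, 0.40, 0.70, 0.10, 0.55, 0.05]
labels = [True, True, True, False, False, False]

threshold = pick_safe_threshold(probs, labels)
print(threshold)
print(triage(probs, threshold))
print(cleared_fraction(probs, labels, threshold))
```

A cutoff chosen this way only guarantees zero misses on the photos it was fitted to. New photos can still score below it, which is why the required error rate for the deadly species is so extreme.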

HTH,
Malcolm