Lesson 1 - Official topic

Hello FastAI community,

I am a new learner of ML and related topics. I have been experimenting with TensorFlow and playing with Teachable Machine; however, I was having trouble with a large number of false positives returned by my image recognition models. This led me to search for better approaches, and I ended up taking this course.

So, I started with Lesson 1, and thanks to Jeremy Howard's excellent style in the book, I was able to train the cat vs dog model pretty quickly. However, when I pass this model images of babies with two ponytails, they get categorized as cats with more than 90% confidence. I believe the dataset used in the book must be of much better quality than what I can collect on my own, and if that dataset is not good enough to train an accurate model, then mine is never going to work :frowning:

I'd appreciate it if anyone could guide me on what steps one can take to reduce the chances of false positives. Thanks

Hi sttaq, I hope you are having a marvelous day!

You are likely always to have some false positives and negatives unless your model is 100% accurate and recognises every image sent to it.

Here are some links that discuss what you are experiencing. Your model is acting as expected based on the data it was trained on and the image sent.

One of Jeremy's key points is making sure that your test data contains the same kinds of images that the model will see in production.

Your model is behaving as expected. You can try some of the ideas in the above threads, such as multi-label classification and training your model with more images, to help improve it.
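Here's a rough sketch of what the multi-label idea could look like on the lesson 1 pets data, adapted from the DataBlock examples in the book. The labelling rule (cat breeds have capitalised filenames) comes from the notebook; the threshold and number of epochs are just guesses on my part, so treat it as a starting point rather than a recipe:

```python
from fastai.vision.all import *

path = untar_data(URLs.PETS)/'images'

def label_func(f):
    # return a *list* of labels: with MultiCategoryBlock the model uses a
    # sigmoid per class, so it can predict no label at all for an image
    # that is neither a cat nor a dog
    return ['cat'] if f.name[0].isupper() else ['dog']

pets_multi = DataBlock(
    blocks=(ImageBlock, MultiCategoryBlock),
    get_items=get_image_files,
    get_y=label_func,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    item_tfms=Resize(224))

dls = pets_multi.dataloaders(path)
learn = cnn_learner(dls, resnet34, metrics=partial(accuracy_multi, thresh=0.5))
learn.fine_tune(3)

# At inference, if neither output clears the threshold you can report
# "neither cat nor dog" instead of forcing one of the two classes.
```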

Hope this helps

Cheers mrfabulous1 :smiley: :smiley:

Thanks @mrfabulous1, I’ll go through the links.

Hi – I’m seeing the “CLICK ME” “first_training” cell take a very long time (>15min for first epoch) to train/fine_tune on Google Colab, even with a GPU instance. Is this expected?

I’m having the same problem now. This does not seem to be normal. I ran the same code on an AWS GPU instance and it took only a minute. There seems to be a problem with Colab.
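One thing worth ruling out (just a guess on my part) is that the Colab runtime is silently falling back to CPU even though you asked for a GPU instance. A quick check from the notebook:

```python
import torch

# If this prints False, the notebook is running on CPU only; in Colab go to
# Runtime -> Change runtime type, select GPU, then restart the runtime.
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```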

Personally, I have a local checkout of fastai and fastcore. I open notebooks I’m working on there. I use doc to get links to the documentation on docs.fast.ai. I know the source well enough I don’t really need an IDE to help jump to definitions of functions, but if I did I’d open the fastai modules directory in vscode or vim to allow me to directly jump to definitions.
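For anyone who hasn't tried it, doc is the helper mentioned in the chapter 1 notebook: called on a fastai function or class, it shows the signature plus links to the docs.fast.ai page and the source. The argument here is just an illustration:

```python
from fastai.vision.all import *

# doc() prints the signature and gives "Show in docs" / source links,
# which is handy when you don't have jump-to-definition set up.
doc(ImageDataLoaders.from_name_func)
```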

You’ll find out in lesson 3 :slight_smile:


I've had similar experiences with this part of the notebook; small changes flip the sentiment from pos to neg. I would think such changes should not affect the overall sentiment of a text fragment, but they tend to do just that. So, I'm not sure how robust this actually is.
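If it helps, this is the kind of quick probe I've been using, assuming learn is the text_classifier_learner trained in that part of the notebook (the example sentences are made up):

```python
# Feed small variations of the same sentence to the IMDB classifier and
# compare the predicted label and its probability.
variants = [
    "I really liked that movie!",
    "I really liked that movie",
    "I liked that movie!",
    "That movie, I really liked it!",
]
for text in variants:
    label, _, probs = learn.predict(text)
    print(f"{text!r:40} -> {label} ({probs.max().item():.3f})")
```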

I've seen similar results when uploading completely unrelated pics (I uploaded a picture of a child and it classified it as "cat" lol) … I'm not sure if it's worthwhile adding a third output that would basically say "other" … ?
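One alternative to a third class is a confidence cutoff on the existing predictions, though as this thread shows, softmax probabilities can stay high even on totally unrelated images, so I'm not claiming it solves the problem. A rough sketch, assuming learn is the lesson 1 cat/dog learner and 0.95 is an arbitrary cutoff:

```python
from fastai.vision.all import *

def predict_with_other(learn, img_path, thresh=0.95):
    # fall back to "other" when the top probability is below the cutoff;
    # the 0.95 value is arbitrary and would need tuning on real data
    label, _, probs = learn.predict(PILImage.create(img_path))
    return label if probs.max().item() >= thresh else "other"

# e.g. predict_with_other(learn, 'images/child.jpg')  # hypothetical path
```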

Outstanding question!! I'll have to admit, at first I was wondering how practical a question this would be. I thought maybe some people might build something Perceptron-like for their own amusement (heck, I still play DOS games and watch M*A*S*H), but it would seem like a massive technological step backward. But then I thought, maybe there are still applications for physical NNs today, like specialized devices that somehow need a NN framework.

Turns out, NTT/Cornell have done exactly that, with some results that are faster (though not more accurate) than traditional computer processing. First paragraph of this tantalizing article: “You may not be able to teach an old dog new tricks, but Cornell researchers have found a way to train physical systems, ranging from computer speakers and lasers to simple electronic circuits, to perform machine-learning computations, such as identifying handwritten numbers and spoken vowel sounds.”


I think IBM did some work in neuromorphic computing, and there were even some chips made. From what I gather, ANN research is sort of "stuck" in the very early models of neural networks that came out in the 1940s and '50s. Current ANNs are basically giant functions with billions of parameters. IMO the "intelligence" in biological systems arises out of the interconnections of neurons. Single biological neurons seem to do computations that rival what we refer to as "neural networks" these days.

I feel that if we are ever to achieve any sort of intelligent behaviour in silicon, there will need to be more interdisciplinary work between what Neuroscience has discovered and how that can be modeled and simulated in the ANN arena.

One huge stumbling block is that the term intelligence is automatically assumed to mean "human" intelligence. Before we get to human-level intelligence, we need to get to fly-level intelligence, and there is a lot that can be done by networks that exhibit "fly-like intelligence". One obvious example would be navigation in 3D space. A fly can navigate in three-dimensional space without catastrophic failure of the system (death?) with only 100,000 neurons. And those 100k neurons are doing all the things needed to keep the system alive to the point where it can reproduce and transfer genes to the next gen.

So, we have a long way to go, and current implementations of NNs are just narrowly focused algorithms whose capabilities pale in comparison to what can actually, demonstrably, be accomplished using physical computing units (biological neurons, for example).
