Lesson 1 official topic

Background: I am trying to create a model that can differentiate between 11 classes. Based on my current understanding, all the probabilities should sum to 1. When I looked through the predictions for one of my OOT images, I saw that this did not hold. Can someone help me understand what I am missing? Thanks in advance.

Thank you, that worked!

1 Like

Hi,

I've created a dog and cat classifier, but the outputs are strange.
image

I changed probs[0] to probs[1] and it worked. My question is: how do I know which index is associated with which animal? I followed the name order of the bird classifier, so shouldn't it be the same? Dog first, cat second.
image

I've reviewed the forum, and it seems the classes are sorted in alphabetical order, so probs[0] would be cat and probs[1] would be dog. All good!
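A small illustration of why the index follows the alphabet rather than the order of the searches. This is plain Python, not fastai itself; it assumes, as reported in this thread, that the class list is the sorted set of label names. The probabilities are made up.

```python
# Labels in the order they were searched for / written in the notebook.
search_order = ["dog", "cat"]

# Assumption: the class list is built by sorting the unique label names.
vocab = sorted(set(search_order))

# Made-up probabilities, one per entry of vocab, in vocab order.
probs = [0.0003, 0.9997]

# Pair every class name with its probability instead of guessing the index.
prob_by_class = dict(zip(vocab, probs))
dog_index = vocab.index("dog")

print(vocab)                  # class names in index order
print(dog_index)              # position of "dog" in that order
print(prob_by_class["dog"])   # probability looked up by name
```

Looking the probability up by class name avoids hard-coding probs[0] or probs[1]. In a real notebook, the list to inspect would be dls.vocab, which comes up later in this thread.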

1 Like

Hi, could you share more about how you obtained those predictions?
Are the probs for all the other images summing to 100%, with this one image as the only exception?

I'm on Kaggle and I'm getting an error message right at the very start:

from fastdownload import download_url
dest = 'bird.jpg'
download_url(urls[0], dest, show_progress=False)

from fastai.vision.all import *
im = Image.open(dest)
im.to_thumb(256,256)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/IPython/core/formatters.py in __call__(self, obj)
    700                 type_pprinters=self.type_printers,
    701                 deferred_pprinters=self.deferred_printers)
--> 702             printer.pretty(obj)
    703             printer.flush()
    704             return stream.getvalue()

/opt/conda/lib/python3.7/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    392                         if cls is not object \
    393                                 and callable(cls.__dict__.get('__repr__')):
--> 394                             return _repr_pprint(obj, self, cycle)
    395 
    396             return _default_pprint(obj, self, cycle)

/opt/conda/lib/python3.7/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
    698     """A pprint that just redirects to the normal repr function."""
    699     # Find newlines and replace them with p.break_()
--> 700     output = repr(obj)
    701     lines = output.splitlines()
    702     with p.group():

/opt/conda/lib/python3.7/site-packages/fastai/vision/core.py in __repr__(x)
     26 @patch
     27 def __repr__(x:Image.Image):
---> 28     return "<%s.%s image mode=%s size=%dx%d at 0x%X>" % (x.__class__.__module__, x.__class__.__name__, x.mode, x.size[0], x.size[1])
     29 
     30 # %% ../nbs/07_vision.core.ipynb 11

TypeError: not enough arguments for format string

Any ideas?
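Reading the last frame of the traceback, the format string in the patched __repr__ has six placeholders (%s, %s, %s, %d, %d, %X) but only five values are supplied. The sketch below reproduces just that mismatch in plain Python. The values and the extra address argument (0xABC) are made up for illustration; this is not a patch for the library.

```python
# Format string copied from the traceback: six placeholders in total.
fmt = "<%s.%s image mode=%s size=%dx%d at 0x%X>"

# Five values, mirroring the five arguments in the traceback (made-up values).
values = ("PIL.Image", "Image", "RGB", 256, 256)

try:
    fmt % values
    message = None
except TypeError as err:
    # Same error text as in the traceback above.
    message = str(err)

# Supplying a sixth value (an illustrative integer for %X) makes it format.
fixed = fmt % (values + (0xABC,))

print(message)
print(fixed)
```

So the error is raised inside the library's own __repr__ when the notebook tries to display the image, not by the download or Image.open calls.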

@Billy, replace the underscore with a variable, 'ndx',
e.g. pred,ndx,probs = learn.predict(...)
then probs[ndx] will give the probability of the prediction.
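To illustrate the unpacking suggested above, here is a minimal sketch. predict_stub is a hypothetical stand-in that only mimics the three-part return shape (label, index, probabilities) described in this thread; it is not fastai code, and the numbers are made up.

```python
def predict_stub(vocab, probs):
    """Hypothetical stand-in: return (label, index, probs) for the top class."""
    # Index of the largest probability.
    ndx = max(range(len(probs)), key=probs.__getitem__)
    return vocab[ndx], ndx, probs


vocab = ["cat", "dog"]

# Keep the middle value instead of discarding it with an underscore.
pred, ndx, probs = predict_stub(vocab, [0.0021, 0.9979])

# probs[ndx] is the probability of whichever class was predicted.
confidence = probs[ndx]
print(f"This is a: {pred}. Probability: {confidence:.4f}")
```

Indexing with ndx means the printed probability always matches the predicted class, whatever order the classes are stored in.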

1 Like

I ran into a problem when trying to acquire a larger dataset: KeyError: 'next' when using search_images. Does anyone know why?
What I tried: different values of the max_images parameter. When it is less than 400 it works fine, but when it is larger than 1000 there is an error.
Here is the error info:

KeyError                                  Traceback (most recent call last)
<ipython-input-12-3ba183898746> in <module>()
      6     downloaded_dest = (downloaded_path/o)
      7     downloaded_dest.mkdir(exist_ok = True, parents = True)
----> 8     download_images(downloaded_dest, urls = search_images(f"{o} photo", max_images=1000))
      9     resize_images(downloaded_path/o, max_size = 400, dest = resized_path/o)
     10 path = resized_path

<ipython-input-2-0d0938f9c459> in search_images(term, max_images)
     10         data = urljson(requestUrl,data=params)
     11         urls.update(L(data['results']).itemgot('image'))
---> 12         requestUrl = url + data['next']
     13         time.sleep(0.2)
     14     return L(urls)[:max_images]

KeyError: 'next'

Here is my code:

searches = 'female','male'
downloaded_path = Path('downloaded_female_or_not')
resized_path = Path("resized_female_or_not")
for o in searches:
    downloaded_dest = (downloaded_path/o)
    downloaded_dest.mkdir(exist_ok = True, parents = True)
    download_images(downloaded_dest, urls = search_images(f"{o} photo", max_images=500))
    resize_images(downloaded_path/o, max_size = 400, dest = resized_path/o)
path = resized_path

# get all imgs
searches = 'female','male'
downloaded_path = Path('downloaded_female_or_not')
resized_path = Path("resized_female_or_not")
for o in searches:
    downloaded_dest = (downloaded_path/o)
    downloaded_dest.mkdir(exist_ok = True, parents = True)
    download_images(downloaded_dest, urls = search_images(f"{o} photo", max_images=1000))
    resize_images(downloaded_path/o, max_size = 400, dest = resized_path/o)
path = resized_path
1 Like
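One possible reading of the traceback above is that the last page of results has no 'next' key, so data['next'] fails as soon as the loop asks for more images than the search can return. That is an assumption, not something confirmed in this thread. The sketch below shows a paging loop that stops cleanly in that case. collect_urls, fetch_page, and the PAGES data are hypothetical stand-ins, not the notebook's search_images.

```python
def collect_urls(fetch_page, first_url, max_images):
    """Follow 'next' links until enough URLs are found or the pages run out."""
    urls = []
    url = first_url
    while url is not None and len(urls) < max_images:
        data = fetch_page(url)
        urls.extend(result["image"] for result in data["results"])
        # .get returns None when there is no further page, instead of raising.
        url = data.get("next")
    return urls[:max_images]


# Fake two-page search result; the second page has no "next" key.
PAGES = {
    "p1": {"results": [{"image": "a.jpg"}, {"image": "b.jpg"}], "next": "p2"},
    "p2": {"results": [{"image": "c.jpg"}]},
}

found = collect_urls(PAGES.__getitem__, "p1", max_images=1000)
print(found)
```

With this shape, asking for 1000 images simply returns however many exist.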

When I tried using fastsetup and ran ./setup-conda.sh, I got the error curl: option --no-progress-meter: is unknown. I searched online but found no answer for this. Does anyone know how to solve it?

fastsetup % ./setup-conda.sh 
Downloading installer...
curl: option --no-progress-meter: is unknown
curl: try 'curl --help' or 'curl --manual' for more information

Just remove that option from the script. It sounds like you have an old version of curl.

Hi, I just finished my first homework and I have two questions:

  1. Where am I supposed to "share" it? I can't seem to find the proper discussion.
  2. What's the purpose of the "probs" output from learn.predict? I assumed it was the probabilities of the input belonging to each class, but after some testing they seem uncorrelated (e.g. with my classifier an image is classified as "class 1" but the highest probs value is for "class 3"). Any advice?

Hi @nerusskyhigh, welcome to fastai!

  1. This is the discussion post you are most likely looking for: Share your work here ✅

  2. The article pasted below does a good job of explaining the output of learn.predict. I have copied the relevant part here:
    "a fully decoded prediction including reversing transforms from the dataloader, a decoded prediction using decodes, and the prediction from the model passed through the loss function's activation"

This leads to some questions about what your model is doing: is it a classification task? Can you please paste the output here?

3 Likes

Thank you!
For those who are facing the same problem:
I removed the --no-progress-meter option
from curl -LO --no-progress-meter $DOWNLOAD in ./setup-conda.sh in the fastsetup folder,
leaving curl -LO $DOWNLOAD instead.

And it worked!
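For anyone who wants to check whether their curl is old enough to hit this, here is a minimal sketch that parses the first line of curl --version. The cutoff of 7.67.0 is my recollection of the release that introduced --no-progress-meter and should be treated as an assumption. The two sample version strings are made up.

```python
import re

# Assumption: --no-progress-meter first appeared in curl 7.67.0.
MIN_VERSION = (7, 67, 0)


def parse_curl_version(version_output):
    """Extract (major, minor, patch) from the output of `curl --version`."""
    match = re.search(r"curl (\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("could not find a curl version number")
    return tuple(int(part) for part in match.groups())


def supports_no_progress_meter(version_output):
    """True if the parsed version is at least MIN_VERSION."""
    return parse_curl_version(version_output) >= MIN_VERSION


# Made-up examples of the first line printed by `curl --version`.
old = "curl 7.64.1 (x86_64-apple-darwin19.0) libcurl/7.64.1"
new = "curl 7.79.1 (x86_64-pc-linux-gnu) libcurl/7.79.1"

print(parse_curl_version(old), supports_no_progress_meter(old))
print(parse_curl_version(new), supports_no_progress_meter(new))
```

If the check comes back False, removing the option as described above (or upgrading curl) is the way out.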

3 Likes

Hi, first thanks for the links and the advice. I'll post my work there when it is completed.

Here is a screenshot of the output. I wanted to try to distinguish between Magic, Pokémon, and Yugioh cards with a classifier. You can see that the game is correctly predicted but the biggest output is assigned to Magic cards. Furthermore, I just noticed that those cannot be probabilities because they don't add up to 1, but I guess the conversion is just a normalization that preserves the order relation.

If you need it, here is the complete notebook.

2 Likes

Hi @nerusskyhigh, thanks for sharing the notebook. It's a cool idea and classification task!

I was able to work out what the issue is. If you run dls.vocab on your data loader, once it's been instantiated, you can see that there are more than three classes. So while the probabilities of the three classes you expected will not add up to 1.0, if you run probs.sum() (after the cell you've pasted), you will see all the probabilities adding up to 1.0, as expected!
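A minimal sketch of that effect in plain Python, assuming the model's outputs go through a softmax over every class in the vocab. The extra class names and the raw scores are made up for illustration; they are not taken from the notebook.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical vocab: three intended classes plus two stray folders.
vocab = ["magic", "magic_old", "pokemon", "yugioh", "yugioh_old"]
logits = [2.0, 0.5, 1.0, 3.0, 0.1]  # made-up raw model outputs
probs = softmax(logits)

expected = {"magic", "pokemon", "yugioh"}
subset_total = sum(p for name, p in zip(vocab, probs) if name in expected)
full_total = sum(probs)

print(f"three expected classes: {subset_total:.4f}")
print(f"all classes in vocab:   {full_total:.4f}")
```

The three expected classes fall short of 1 because part of the probability mass sits on the stray classes.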

I'll leave it up to you to figure out how to rectify the path naming issue, but I hope this helps and makes the output of learn.predict clear.

2 Likes

The "dls.vocab" did the trick. I thought there wasn't any real way to check the classes found other than using the verbose option (which is not completely clear). Once I noticed the other classes I looked for other folders inside of "cards/" and worked from there. I don't really know where they came from, but I settled things once and for all by sanitizing the cards' names. Thanks again for your time. If there is a proper way to thank you other than the heart, let me know!

1 Like

What is the reason for resizing images to be 192 x 192?

item_tfms=[Resize(192, method='squish')]

My understanding is resnet18 is trained with images of a size 224 x 224. Would that size perform better?

Try it and see! Smaller images are faster, so see what the time vs performance trade-off looks like.
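As a rough way to size up that trade-off before timing anything: the number of pixels per image grows with the square of the side length. The sketch below only does that arithmetic. Treating training time as roughly proportional to pixel count is an assumption, and real timings depend on hardware and data loading.

```python
def pixel_count(side):
    """Number of pixels in a square image with the given side length."""
    return side * side


small, large = 192, 224
ratio = pixel_count(large) / pixel_count(small)

print(f"{small}px: {pixel_count(small)} pixels")
print(f"{large}px: {pixel_count(large)} pixels")
print(f"ratio: {ratio:.2f}x")
```

So each 224-pixel image carries about a third more pixels than a 192-pixel one, which is the cost to weigh against any accuracy gain.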

Try it and see!

Great idea! I gave it a shot to see if resnet18 performs better with 192 pixel images or 224 pixel images.

Sized to 192:
image

Sized to 224:
image

I'm not sure the results are meaningful because of the small dataset size for the birds vs. forest problem. So I might need to try with some other datasets.

1 Like

I'm having trouble running the search_images function. I ran it a couple of times in the afternoon, but when I ran it again at night, it returned an HTTP 403 error. I'm thinking that there's some sort of API limit for urlread(); is this accurate, or could there be another limitation that I'm not aware of? Any help or pointers would be appreciated.


7 Likes