Lesson 2 official topic

Does anyone know if the ‘Dogs v Cats’ app.ipynb file that Jeremy uses in the video is available anywhere? I’d like to replicate what he did.

Do you mean this one?


that looks very similar to the one Jeremy uses in the video. I’ll see if it contains what I need. thanks a lot!


this helped me a lot - thank you!

In “Learning 3: Exporting the app.py file” in your GitHub post you say

Note: Additionally, the export only works if the notebook is in a folder called nbs.

Why is that? It seems slightly hacky and I’m wondering if there is a different solution.

I have changed this

from nbdev.export import notebook2script
notebook2script('app.ipynb')

to this

from nbdev import nbdev_export
nbdev_export('app.ipynb')

as you suggested. When I run it I get the error InterpolationMissingOptionError: Bad value substitution: option 'lib_name' in section 'DEFAULT' contains an interpolation key 'repo' which is not a valid option name. Raw value: '%(repo)s'. I thought it might be related to this, so I tried creating a settings.ini file containing this:

[DEFAULT]
doc_baseurl = /

but I still get the same error.

That’s a helpful tip, thanks.

Hi David,

Indeed, I also encountered that error in another notebook, but despite the error, it still did what it was supposed to do: inside the nbs folder it created a new folder with the name of my repo (“FastAI2022”), and inside that, app.py was created. → not nice, but it worked
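As a side note on that InterpolationMissingOptionError: the message says lib_name interpolates %(repo)s, so a settings.ini that actually defines repo should at least get past the substitution. This is just a minimal sketch with placeholder values, not the full settings.ini that nbdev_new generates:

```ini
[DEFAULT]
# placeholder values -- use your own repo / library names
repo = FastAI2022
lib_name = %(repo)s
doc_baseurl = /
```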

Reflecting on this: in the previous notebook, when I was writing the post, I actually did more nbdev setup. As suggested here, I ran the nbdev migration. Looking up my bash history:

Installation

mamba install -c fastchan nbdev
mamba install -c fastai nbdev

nbdev-migration

find . -name '*.ipynb' -exec perl -pi -e 's/#\|\s*hide_input/#| echo: false/' {} +
find . -name '*.ipynb' -exec perl -pi -e 's/#\|\s*hide_output/#| output: false/' {} +
find . -name '*.ipynb' -exec perl -pi -e 's/#\|\s*skip/#| eval: false/' {} +
find . -name '*.ipynb' -exec perl -pi -e 's/from nbdev.export import notebook2script/from nbdev import nbdev_export/' {} +
find . -name '*.ipynb' -exec perl -pi -e 's/notebook2script/nbdev_export/' {} +

nbdev_setup:

nbdev_install_quarto
nbdev_new

I hope this helps.


nbdev_export exports the contents of a folder that is initialized as an nbdev project (with nbdev_new, as @chrwittm suggests) and expects certain settings unless specified differently. If you just want to export a single notebook, you can use:

import nbdev
nbdev.export.nb_export('notebook-name.ipynb', 'path/where/to/save')

which @ThomasAlibert also described above :slight_smile:

Edit: nbdev_export to export a (nbdev) project; nb_export to export a (n)ote(b)ook


I’m building a sports classifier and the best error rate is 13%; if I train for more epochs it gets worse. I’ve already removed the duplicates from the dataset. Is there a way to improve the predictions or the error rate?
Thanks in advance

@chrwittm @benkarr thanks both!

I used nbdev.export.nb_export('notebook-name.ipynb', 'path/where/to/save') as Ben suggested and it seems to have worked :smiley: thanks so much :pray:


The Colab notebook for lesson 2 uses Bing search.

Jeremy recommends that we replace Bing with DDG search (he explains why in the course video).

But when it’s replaced, as other people have noted above, the code doesn’t download any images.


So it seems like something else is required in addition to replacing Bing with DDG to fix this.

Any suggestions?

I might be able to guess the answer, but rather than give you the whole of it, I think you’ll gain more if I can lead you to discover it yourself.

You’ve reported a problem with a particular cell, but is that cell the cause of the problem, or just a subsequent symptom? It helps if you can identify the cell where things first go wrong.

The following shows the result of the original search test…

and immediately below that it implies the variable ims is expected to contain an array of urls (even though here for test purposes it’s hardcoded to contain just one)…

Making the minimal naive change, you might get the following…

So it seems the “200” indicates it was all successful, but given there is a problem somewhere, you should verify the actual contents of each variable, because that shows…

which doesn’t look like the array of urls that is required.

Since ims derives its values from results, we examine that next…

Hmmm… they look like the urls that ims is expected to contain.


Thanks Ben, appreciate your response!

I played with ims and results and that was helpful. I learned something.

Ultimately, I got an Azure search key, which I hesitated to do at first based on the recommendation to go with DDG, and because they ask for a credit card even for a free account (which allows 3 calls per second and 1,000 calls per month).

Also, there were some posts from the earlier course saying that Azure had changed its search API, which made it buggy as well.

But, I followed the quick instructions in this post:

from the “Setup the Azure Account” section.

It was super easy to add my API key and then everything worked as expected.

Awesome. Do you notice, then, the difference between Bing and DDG when running the following code:

  • type(results[0])

and also scanning the output of the following for attrgot, for each of Bing and DDG:

  • dir(results[0])

Which leads to discovering that the fix for DDG is:

results = search_images_ddg('grizzly bear')
ims = results
len(ims)

Interesting, I have both Bing and DDG searches side by side (… with DDG using the original results code, without the fix you provide above).

For Bing:

  • Interrogating the results with “ims” shows me 150 “pure” URLs.

  • Interrogating the results with “results” shows me the 150 URLs with a lot of other metadata.

  • The type(results[0]) for the Bing search is “dict”.

For DDG (again, without your fix):

  • Interrogating the results with ims shows: (#200) [None, None, None …]

  • And interrogating the results with “results” shows 200 “pure” URLs.

  • The type(results[0]) for the DDG search is “str”.

What I gather from that comparison is that Bing provides the results as a dict (dictionary) of information, so the line of code for Bing below is needed so that ims uses only the URL from each search result:

ims = results.attrgot('contentUrl')

Whereas with DDG the results are just str URLs (no metadata), so, as you say above in your fix, ims = results.
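The two shapes can be mimicked with plain Python (no fastai needed). As I understand it, attrgot roughly maps a getattr-style lookup with a None default over the items, which would explain the Nones — a sketch with made-up URLs:

```python
# Simulated Bing-style results: dicts carrying the URL plus metadata
bing_results = [{"contentUrl": "https://example.com/a.jpg", "width": 800},
                {"contentUrl": "https://example.com/b.jpg", "width": 600}]
bing_urls = [r["contentUrl"] for r in bing_results]  # what attrgot pulls out

# Simulated DDG-style results: already plain URL strings
ddg_results = ["https://example.com/a.jpg", "https://example.com/b.jpg"]
ddg_urls = ddg_results  # no extraction step needed

# A getattr-style lookup on a string finds no 'contentUrl' -> None each time,
# matching the (#200) [None, None, ...] output above
attrgot_like = [getattr(r, "contentUrl", None) for r in ddg_results]
print(bing_urls == ddg_urls, attrgot_like)  # -> True [None, None]
```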

On dir(results[0]) that you also mentioned, I see that the directories are slightly different for Bing and DDG, but I don’t understand the implications.

Your usage of “directories” sounds like files & folders on disk. That is not what that is.
Python “dir” returns the variables and functions defined by the object.

but I don’t understand the implications.

You don’t list any “dir” information you looked up, so it helps to look up things that are not clear:
Searching for: python dir
brings up a few useful things…

So the implication is that the attrgot() function is not defined for strings, which is why None was being returned.


Very helpful, thanks Ben!

I made a classifier to help with sorting out 80,000 family photos. I wanted to separate the “good” ones from various screenshots, forms, horrendously blurry photos, etc., so that I can run a slideshow with only the good ones. It took a while with some technical difficulties, but worked out well in the end with around 99% accuracy, and I did complete the task of sorting out all the photos.

Notebook:
https://drive.google.com/file/d/1jto7kRIMEo2yIBrWlooKoo89Z-ll3vIr/view?usp=sharing

Demo web app:

My classify helper script:
https://sam.ucm.dev/t/bin.ai/classify

The model probably isn’t generally useful; it’s been trained on our particular messy photography and screenshot habits.

Why am I exactly one “part” behind in the fast.ai course, completing lesson 2 while I should be doing lesson 10??? I don’t know! :slight_smile: but I’ll cope, and I’m happy to be making progress. So many technical difficulties!


YouTube needs an annotation feature to add “errata” like this mid-video. I guess we could use the title cards or whatever if that’s still a thing they support.

I haven’t found anything in YouTube that seems to support that, unfortunately.

Check this out, this is the one I used for duckduckgo.

Here is a sample of how I implemented the scraper mentioned above
