Lesson 2 official topic

Hi - for anyone else having problems running the code in the Lesson 2 notebook “02_production.ipynb”, the fix below worked for me as of 15 February 2024.

It borrows the method demonstrated in the Lesson 1 “Is it a bird? Creating a model from your own data” notebook.

There is also an issue open for this on GitHub here.

Hope it helps.

#hide
! [ -e /content ] && pip install -Uqq fastbook
import fastbook
fastbook.setup_book()

#hide
from fastbook import *
from fastai.vision.widgets import *
# Add below import (based on Is It A Bird? notebook)
from fastdownload import download_url

# Replaced search_images_bing with DuckDuckGo
search_images_ddg

# Use function definition from "Is it a bird?" notebook
def search_images(term, max_images=30):
    print(f"Searching for '{term}'")
    return L(search_images_ddg(term, max_images=max_images))

# search_images_ddg already returns a list of URLs, so attrgot('contentUrl') is not needed
ims = search_images_ddg('grizzly bear')
len(ims)

#hide
ims = ['http://3.bp.blogspot.com/-S1scRCkI3vY/UHzV2kucsPI/AAAAAAAAA-k/YQ5UzHEm9Ss/s1600/Grizzly%2BBear%2BWildlife.jpg']

dest = 'images/grizzly.jpg'
download_url(ims[0], dest)

bear_types = 'grizzly','black','teddy'
path = Path('bears')

from time import sleep

for o in bear_types:
    dest = (path/o)
    dest.mkdir(exist_ok=True, parents=True)
    # results = search_images(f'{o} bear')
    download_images(dest, urls=search_images(f'{o} bear'))
    sleep(5)  # Pause between bear_types searches to avoid over-loading server

fns = get_image_files(path)
fns

len(fns)

failed = verify_images(fns)
failed

failed.map(Path.unlink);

bears = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=Resize(128))

dls = bears.dataloaders(path)

dls.valid.show_batch(max_n=4, nrows=1)

bears = bears.new(item_tfms=Resize(128, ResizeMethod.Squish))
dls = bears.dataloaders(path)
dls.valid.show_batch(max_n=4, nrows=1)

bears = bears.new(item_tfms=Resize(128, ResizeMethod.Pad, pad_mode='zeros'))
dls = bears.dataloaders(path)
dls.valid.show_batch(max_n=4, nrows=1)

bears = bears.new(item_tfms=RandomResizedCrop(128, min_scale=0.3))
dls = bears.dataloaders(path)
dls.train.show_batch(max_n=4, nrows=1, unique=True)

bears = bears.new(item_tfms=Resize(128), batch_tfms=aug_transforms(mult=2))
dls = bears.dataloaders(path)
dls.train.show_batch(max_n=8, nrows=2, unique=True)

bears = bears.new(
    item_tfms=RandomResizedCrop(224, min_scale=0.5),
    batch_tfms=aug_transforms())
dls = bears.dataloaders(path)

learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(4)

interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()

interp.plot_top_losses(5, nrows=1)

#hide_output
cleaner = ImageClassifierCleaner(learn)
cleaner

#hide
for idx in cleaner.delete(): cleaner.fns[idx].unlink()
for idx,cat in cleaner.change(): shutil.move(str(cleaner.fns[idx]), path/cat)

I’m on the book’s lesson 02 Colab notebook. As the course teacher mentioned, at the very beginning of the coding section the book asks us to set up an Azure account etc., but he replaces all that with a call to the search_images_ddg function. I do the same, but when I execute the cell I’m given the error below.


What is it that I’m missing? I couldn’t find this same issue in this reply section, so I’m asking here for the first time.

When I run your code at the top of the Colab notebook, it asks me for permission to access my Google Drive files. Why is that?

I think at the time the notebook was written, using Bing search was the best option, and for that you needed an Azure API key.

But now DuckDuckGo image search is recommended - presumably because it’s free and no account is needed.

I didn’t bother with Azure, went straight with DDG.

In your code that error says search_images_ddg isn’t defined - you may need to look for search_images_bing in your code and change that to end in ddg.

Regarding the Google Drive access message: I get that too, presumably because there are paths in the code to store links and images, so it needs drive access.

If you’re uncomfortable with giving access, you could try opening the notebook on Kaggle instead.

Yeah, that’s exactly what I’ve done: simply changed search_images_bing to search_images_ddg. There are 2 code cells above that deal with obtaining the Azure key and exporting it as an environment variable, but I haven’t bothered executing them and it shouldn’t be necessary. However, the error still persists. I’m using Google Chrome and there aren’t any other code cells to execute above, other than those 2 I just mentioned.

In your Colab, you simply changed the ending to ddg and your code ran, without executing anything prior to it?

Ah. Yes, you will need to execute the code from start to finish. There are multiple changes in the code provided compared with the original version, including additional library imports and the search_images function from the other notebook mentioned in the description.
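
To illustrate why the order matters: a NameError for search_images_ddg usually just means the cell that makes it available (normally `from fastbook import *`) has not been run in the current session. Below is a small sketch for checking which names are still missing. Note that `undefined_names` is a hypothetical helper, not part of fastbook or fastai, and the list of names is simply taken from the snippet shared at the top of this thread.

```python
def undefined_names(names, namespace):
    """Return the names that are not yet defined in the given namespace."""
    return [name for name in names if name not in namespace]


# Names the shared snippet relies on (assumption: taken from the post above).
required = ["search_images_ddg", "download_url", "search_images"]

# In a notebook, globals() holds everything defined by the cells run so far.
print(undefined_names(required, globals()))
```

An empty list means all the setup cells have been run; anything still listed points to an import or definition cell that was skipped.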

In that case: the “search_images_ddg” statement is the first executable code in the lesson_02 notebook. Are you telling me I have to run code from previous notebooks, like the one from lesson_01? Maybe I could also grab the import statements from the lesson_01 notebook and execute those in the lesson_02 notebook, along with any other statements that could be necessary for the lesson_02 notebook to run properly.

Did you run this second notebook shortly after the first one?

I see that you pasted a pretty big code snippet above. Since it worked for you, do I grab the entire code and paste it into a single cell inside the notebook?

Yes, in my posts above I’m referring to the entire code snippet I shared.

It replaces the 02_production.ipynb notebook code from the start up until the end of the section about augmenting the images.

I shared it as one post for convenience; you can divide it into cells as you see fit so that the outputs match the intended outputs of the original notebook.

I think I got it working somehow, at least for now. Apparently this Google Drive access message also popped up during the first lesson; I denied it and was able to run through the rest of the code successfully, and now I’m doing the same thing. I will use your code snippet and substitute it into the places where it must go. Thanks a lot :grinning:

Because you will be downloading images and other stuff, so it needs space to store those!

dls.valid.show_batch(max_n=5, nrows=1)
This works fine, but when I do the same for train, like this:
dls.train.show_batch(max_n=5, nrows=1)

I get:


ValueError Traceback (most recent call last)
in <cell line: 3>()
1 data_seq = data_seq.new(item_tfms=RandomResizedCrop(128, min_scale=0.3))
2 dls = data_seq.dataloaders(path)
----> 3 dls.train.show_batch(max_n=4, nrows=1, unique=True)

1 frames
/usr/local/lib/python3.10/dist-packages/fastai/data/load.py in one_batch(self)
184 def to(self, device): self.device = device
185 def one_batch(self):
--> 186 if self.n is not None and len(self)==0: raise ValueError(f'This DataLoader does not contain any batches')
187 with self.fake_l.no_multiproc(): res = first(self)
188 if hasattr(self, 'it'): delattr(self, 'it')

ValueError: This DataLoader does not contain any batches

Can anyone suggest what the issue is here?

Could anyone help me find the mistake here: https://colab.research.google.com/drive/15C7UKHNQ2aRQuB7NvmRQOWRqqBbZJrQP?usp=sharing
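
One check that might help narrow this down (a sketch only, not a confirmed fix): count how many image files actually ended up in each class folder, since a very small dataset is one way to end up with a loader that has no batches. `count_images_per_class` is a hypothetical helper that uses only the standard library. The `bears` folder name comes from the snippet at the top of the thread, so substitute your own path.

```python
from collections import Counter
from pathlib import Path

# Extensions treated as images here (an assumption; adjust to your data).
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif", ".bmp"}


def count_images_per_class(root):
    """Count image files under root, grouped by their parent folder name."""
    counts = Counter()
    for p in Path(root).rglob("*"):
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS:
            counts[p.parent.name] += 1
    return dict(counts)


# Only runs the count if the folder exists in the current working directory.
if Path("bears").exists():
    print(count_images_per_class("bears"))
```

If a class folder is missing from the result or holds only a handful of files, the download step is the first place to look.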

I have been able to run the Colab notebooks perfectly without giving access to my Drive. The data is simply stored in the runtime server instance, no problem.

Throughout, the Colab notebooks (based on the fastai course book) contain references to other articles/blogs/sections in between <>, but no text is shown in between. For example, check out the last paragraph right before the “Questionnaire” section in the notebook 02_production, which says:

We’ve provided full details on how to set up a blog in <>. If you don’t have a blog already, take a look at that now, because we’ve got a really great approach set up for you to start blogging for free, with no ads—and you can even use Jupyter Notebook!

Those “<>” in the text seem to appear a lot throughout these notebooks, but that doesn’t seem normal. What could be the problem? Does anybody know?

How were you able to do so? Could you tell me?

Are you on chapter 2/3?
If so, can we team up on Discord? That would help push us to complete the course.

I simply commented out or deleted the line of code that runs fastbook.setup_book(); apparently it’s not necessary for running the rest of the Colab.

Also, I like your Discord proposal. My account is HASBULLAMBALA#5802 if you want to add me.



When I check a validation batch it looks fine and gives the expected data, but when I look at the training dataset it says there are no batches!
Like below:

What am I missing?

My data block also seems fine:
face = DataBlock(
    blocks=(ImageBlock, CategoryBlock),  # type of data
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=Resize(128)
)

I did check by pulling data from the training dataset, but no luck:

But there is data in the training dataset; I’m not sure why it is failing:


After a lot of effort I could not find the root cause, but I did resolve it; I’m not sure why this was the issue.

Previously I was using 2 values in human_lst, male and female, but when I added a 3rd one it got fixed!

Can someone answer @vishutanwar’s question? I’m running into the same problem.
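
A possible explanation for the empty training loader, offered as a sketch rather than a confirmed diagnosis: as far as I know, fastai's training DataLoader shuffles and drops the last incomplete batch, with a default batch size of 64, while the validation loader keeps partial batches. If the training split has fewer than 64 images, that leaves zero training batches even though the validation batch displays fine. The numbers below assume roughly 30 downloaded images per category (the max_images default in the snippet at the top of this thread) and a 20% validation split. `n_batches` and `train_valid_sizes` are hypothetical helpers, not fastai functions.

```python
import math


def n_batches(n_items, bs=64, drop_last=True):
    """Number of batches a loader yields for n_items at batch size bs."""
    if drop_last:
        return n_items // bs  # an incomplete final batch is discarded
    return math.ceil(n_items / bs)  # an incomplete final batch is kept


def train_valid_sizes(n_total, valid_pct=0.2):
    """Split sizes for a random split that sends valid_pct to validation."""
    n_valid = int(n_total * valid_pct)
    return n_total - n_valid, n_valid


# Assumption: about 30 images per category, as in the shared snippet.
for n_categories in (2, 3):
    n_train, n_valid = train_valid_sizes(n_categories * 30)
    print(
        f"{n_categories} categories: {n_train} training images -> "
        f"{n_batches(n_train)} training batches, "
        f"{n_batches(n_valid, drop_last=False)} validation batches"
    )
# 2 categories: 48 training images -> 0 training batches, 1 validation batches
# 3 categories: 72 training images -> 1 training batches, 1 validation batches
```

If this is the cause, adding a third category only helps because it pushes the training set past 64 images. Downloading more images per category, or passing a smaller batch size such as `bs=16` to `dataloaders`, should have the same effect.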