Hey folks, I want to know what the difference is between this:
faces = DataBlock(
blocks=(ImageBlock, CategoryBlock), # we are dealing with categorical classification
get_items=get_image_files, # function to fetch images from our dataset
splitter=RandomSplitter(valid_pct=0.2, seed=42), # splitting our dataset
get_y=parent_label, # class of category
item_tfms=[Resize(192, method='squish')] # more like BoxFit.contain, from Flutter
)
dls = faces.dataloaders(path, bs=32)  # show_batch returns None, so don't assign its result to dls
dls.show_batch(max_n=9)
Specifically, what does the new keyword do in the “faces” data block?
We initially created a DataBlock without the new keyword, and now, because we want to change the image appearance, we ignored the other properties of the DataBlock and only passed the item_tfms parameter. Why is this so? Does the new “faces” DataBlock retain the properties of the first one we created?
I have two different folders (annotations and images): 120 images in the images folder and the corresponding 120 annotations in the annotations folder. The annotations are in JSON format. I want to create a DataBlock so I can perform classification analysis. Please, how can I map my JSON files to the images so I can run the classification analysis?
Yes, there are many ways to do this. The details depend a bit on exactly how the data is structured, e.g. how the images are mapped to the entries in the JSON files.
If you are familiar with pandas, and especially with referencing data in pandas DataFrames as part of the DataBlock API, you could read in your JSON files, combine them into a single DataFrame, and use that.
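As a starting point, here is a minimal sketch of the mapping step using only the standard library. It assumes (hypothetically) that each annotation file contains a top-level "label" key and shares its filename stem with exactly one image; adjust the key and matching rule to your actual JSON layout.

```python
import json
from pathlib import Path

def build_label_map(annotations_dir, images_dir, label_key="label"):
    """Map each image path to the label stored in its same-stem JSON file.

    Assumes each annotation file has a top-level `label_key` entry and
    shares its filename stem with one image (e.g. a.json <-> a.jpg).
    """
    annotations_dir, images_dir = Path(annotations_dir), Path(images_dir)
    mapping = {}
    for ann_path in annotations_dir.glob("*.json"):
        record = json.loads(ann_path.read_text())
        # Find the image with the same stem, whatever its extension.
        matches = list(images_dir.glob(ann_path.stem + ".*"))
        if matches:
            mapping[matches[0]] = record[label_key]
    return mapping
```

The resulting dict could then feed a DataBlock via something like `get_y=lambda p: mapping[p]`, or be turned into a DataFrame with `pd.DataFrame` if you prefer the pandas route the post describes.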
This looks more like object detection data, whereas in your question you mention “classification”.
Classification is finding the correct label for an image, such as “cat”, “dog”, or “horse”. Object detection is about finding things in images, classifying them, and also drawing a bounding box around them.
Did you already have a look at this or this? Especially the second link is based on outdated code (course18), so not everything might work exactly as presented there, but it might be a good start and show you in general how to achieve this.
Hey @Akindele, I believe the new keyword is reassigning a new image transformation to the set of images you’re preprocessing. I have yet to test this with my own data, but if you look at the new method and the _merge_tfms method in fastai’s GitHub repo, it looks like it’s passing the values you pass into the new method back into the item_tfms parameter.
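To illustrate the idea, here is a toy plain-Python sketch of copy-with-override semantics (this is not the fastai implementation, just a stand-in): the object returned by new keeps every field of the original except the ones you explicitly override, which is why the second “faces” block only needs item_tfms.

```python
from dataclasses import dataclass, replace

# A toy stand-in for DataBlock, purely to illustrate how .new() behaves.
@dataclass(frozen=True)
class ToyBlock:
    blocks: tuple = ()
    get_y: str = "parent_label"
    item_tfms: tuple = ()

    def new(self, **overrides):
        # Return a copy keeping all existing fields,
        # replacing only the ones passed in.
        return replace(self, **overrides)

faces = ToyBlock(blocks=("Image", "Category"), item_tfms=("Resize(192)",))
faces2 = faces.new(item_tfms=("RandomResizedCrop(128)",))
# faces2 keeps blocks and get_y from faces; only item_tfms changed,
# and the original faces object is untouched.
```

So yes, the new “faces” DataBlock retains the properties of the first one; only the transforms you pass in are replaced.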
I tried to run the delete-and-change cell for just one selection of black bear from the train set, as you suggested… if I am not mistaken.
However, I am not sure why I get this error.
Is there any way to solve it?
If anyone else has solved the problem, kindly let me know.
Thanks in advance.
The error says that there is already an image with that file name. You are moving images from one folder to another with the cleaner (“Change image category”). If you already ran this cell or notebook before, the images whose category you changed earlier are still in that new folder.
Thank you for the reply, I think I got it where I was mistaken.
I was actually changing the labels of images that were already correct. I think that is why I got the error that the file already exists.
So, with the image cleaner, I believe we only need to change those files whose labels seem incorrect, or that need to be deleted. The rest, with correct labels, can stay.
Once we are done cleaning the data, I think we need to re-run the DataLoaders and retrain the model.
Also, I believe we need to run the cells for each pair of category and train/valid set.
# Delete the files marked for deletion
for idx in cleaner.delete():
    cleaner.fns[idx].unlink()

# Move the files whose category was changed
for idx, cat in cleaner.change():
    shutil.move(str(cleaner.fns[idx]), path/cat)
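One way to avoid the “file already exists” error on a re-run is to skip moves whose target is already present. Here is a small sketch of such a guard; safe_move is a hypothetical helper, assuming the cleaner-style layout of path/category/filename.

```python
import shutil
from pathlib import Path

def safe_move(src, dest_dir):
    """Move src into dest_dir, skipping the move if a file with the
    same name is already there (e.g. from an earlier run of the cell)."""
    src, dest_dir = Path(src), Path(dest_dir)
    target = dest_dir / src.name
    if target.exists():
        return target  # already moved on a previous run; leave it alone
    dest_dir.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(target))
    return target
```

You could then call `safe_move(cleaner.fns[idx], path/cat)` in the change loop instead of `shutil.move` directly, so re-running the cell is harmless.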
A bit off topic, but maybe some will find this interesting: I made an account on landing.ai and uploaded the bear images to see how their training went. The results are on par (although their training took much longer):
They have built a very nice interface on top of their tools, making it even more accessible for non-techies.
It might be a cool project to turn fast.ai into a SaaS as well. I know quite a bit about building a SaaS platform, but I’m just a newbie in AI (it’s lesson two, after all). If anyone is interested in exploring this, please let me know.
Why does every training run in the book show two output tables? The first table is always epoch 0, and the second depends on how many epochs you trained: if you train 1 epoch, the second table is just one line for epoch 0; if you train 5, the second table has five lines, from epoch 0 to epoch 4. I’m really confused. Isn’t there supposed to be just one output?