Lesson 2 In-Class Discussion ✅


(Mahesh Khatri) #612

Sorry. I did not see that you had already shared the same resource. Thanks.


(Mahesh Khatri) #613

If your training images are in PNG format, they need to be converted to JPG.


(José Fernández Portal) #614

This should work:

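from PIL import Image  # a bare `import PIL` does not load the Image submodule

# PATH is assumed to be the dataset root (a pathlib.Path) with one subfolder per class
for fn in PATH.glob('*/*.jpg'):
    im = Image.open(fn)
    # re-save any non-RGB image (e.g. grayscale, CMYK, palette) as RGB in place
    if im.mode != 'RGB': im.convert('RGB').save(fn)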

(Mahesh Khatri) #615

This may help.


#616

Apologies!


(Mahesh Khatri) #617

As much as you can realistically collect. Fastai has been remarkably powerful even with only a few samples, as long as the images are of real-world objects that are not very different from the ImageNet images the pretrained model was trained on. Hope this helps.


(Mahesh Khatri) #618

Fastai has been known to work well even with a few images. See the previous answer to @whatrocks's similar question.


(noskill) #619

I was trying download_images for some pictures and got the following errors:

  1. “Error: [content-length]”
  2. The progress bar hangs at around 99%.

On checking the download_images method and the functions it calls, I found that download_url uses content-length for the length, so the error might come from there. I am investigating this further; meanwhile, can anybody explain the reason behind these errors?
Also, download_images works fine with a single worker, i.e. the progress bar reaches 100%. It might be a bug, or I might be doing something wrong.

(James Juan Whei Tan) #620

indeed, that would be a lovely addition


(Jeremy Howard (Admin)) #621

@simonw asked the question before I got to that in the video FYI.

Having said that, please be sure that your answers are helpful to the person asking the question. i.e. instead of just saying “it’s in the video”, try providing a link to the relevant time-stamp, or just answer the question yourself.


#622

I’ve been having problems using custom models with create_cnn.

I am getting this error

I found that in the line body = create_body(arch(pretrained), ifnone(cut,meta['cut'])), arch is called with pretrained (a bool value) as its input, and no Tensor is passed.
This works with resnet (models.resnet18(True)) because the ResNet functions defined in models take only a pretrained bool as input (pretrained=True, **kwargs). It may not always work with custom models, which usually take an input Tensor (input, **kwargs).

What can be the fix ?
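
One possible workaround, sketched under the assumption (taken from the line quoted above) that create_cnn only ever calls arch(pretrained): wrap the custom model in a small factory function that accepts the pretrained flag and ignores it. CustomNet and custom_arch are hypothetical names, and the class below is a plain-Python stand-in rather than a real torch.nn.Module.

```python
class CustomNet:
    """Hypothetical stand-in for a custom model (a real one would subclass torch.nn.Module)."""
    def __init__(self, n_channels=3):
        self.n_channels = n_channels


def custom_arch(pretrained=False, **kwargs):
    """Factory with the same call shape as models.resnet18, i.e. arch(pretrained)."""
    # A custom model has no pretrained weights to load, so the flag is accepted and ignored.
    return CustomNet(**kwargs)


# Mimics the call quoted above: create_body(arch(pretrained), ...)
model = custom_arch(True)
print(type(model).__name__, model.n_channels)
```

The factory (custom_arch), not the class itself, would then be what is passed as the arch argument. Whether the rest of create_cnn, for example the cut point, suits a given custom model is not covered by this sketch.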


General course chat
(Miguel Perez Michaus) #623

File Deleter is a very cool tool but, as others have commented here, for now it apparently cannot clean the training set, only the validation set.

I think that, to address label noise, cleaning the training set is at least as important as cleaning the validation set. Otherwise we are only cleaning noise in the validation set, which is fine, but a certain amount of the subsequent validation improvement will then be due only to the cleaning. In other words, we are helping the model with a more reliable validation set but not with a more reliable training set.

A hacky workaround to clean all the data is possible with the tool in its current state: before doing the “real” training, set a bigger validation ratio, say 0.5, and run the model plus the cleaning tool three or four times, with a different random seed for the validation split each time. After that you can do the real training with an almost completely clean dataset.
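
To see why three or four rounds give an "almost" complete pass, here is a rough simulation of the repeated-split idea. It is only a sketch: the function name, the dataset size and the seeds are made up for illustration, and it counts how many items land in the validation split at least once rather than running any model.

```python
import random


def validation_coverage(n_items, valid_pct=0.5, n_rounds=4, base_seed=0):
    """Return the fraction of items that land in the validation split at least once."""
    n_valid = int(n_items * valid_pct)
    seen = set()
    for r in range(n_rounds):
        rng = random.Random(base_seed + r)                 # a different seed for each round
        seen.update(rng.sample(range(n_items), n_valid))   # this round's validation indices
    return len(seen) / n_items


for rounds in (1, 2, 3, 4):
    print(rounds, round(validation_coverage(10_000, n_rounds=rounds), 3))
```

With independent random splits, the fraction of items never seen in validation shrinks roughly as (1 - valid_pct) ** n_rounds, so about 12.5% after three rounds and about 6% after four at a ratio of 0.5.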


(Francisco Ingham) #624

We are working on this. For the moment you can use this function:

def get_toploss_paths(md, ds, dl, loss_func, n_imgs=None):
    if not n_imgs: n_imgs = len(dl)
    val_losses = get_preds(md, dl, loss_func=loss_func)[2]
    losses,idxs = torch.topk(val_losses, n_imgs)
    return ds.x[idxs]

You can feed in either the training or the validation Dataset and DataLoader. For the lesson 2 notebook you can call it like this:

train_toploss_fns = get_toploss_paths(learn.model, data.train_ds, data.train_dl, learn.loss_func)

You can then feed train_toploss_fns to FileDeleter.

We are also working on showing and being able to change the labels in the widget.
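
For anyone unsure what the torch.topk step in get_toploss_paths does, here is a plain-Python stand-in, using heapq instead of torch: it picks the indices of the largest losses, highest first, and those indices are then used to look up the files. The function name, the loss values and the file names are all made up for illustration.

```python
import heapq


def top_loss_indices(losses, n_items=None):
    """Return the indices of the n_items largest losses, highest loss first."""
    if n_items is None:
        n_items = len(losses)
    return heapq.nlargest(n_items, range(len(losses)), key=losses.__getitem__)


losses = [0.10, 2.30, 0.05, 1.70, 0.90]                 # made-up per-image losses
paths = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]   # made-up file names
worst = [paths[i] for i in top_loss_indices(losses, 2)]
print(worst)  # ['b.jpg', 'd.jpg']
```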


Applying ClassificationInterpretation methods on the training set
(Stefano) #625

I’ve extended the sgd notebook of lesson 2 to polynomial fitting:

That wasn’t easy, and I was stuck on an error in the “update” function…
But in solving it, I got a better understanding of PyTorch mechanics, in particular the difference between constant tensors (without gradient) and parameters (differentiable, with “gradient”).
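
To illustrate the constant/parameter distinction without PyTorch, here is a minimal sketch of polynomial fitting by plain gradient descent with hand-written gradients. It is not the notebook's code: the target polynomial, learning rate and step count are assumptions chosen for illustration. The coefficients play the role of parameters (in PyTorch, a tensor with requires_grad=True), while xs and ys are constants that are only read.

```python
def fit_polynomial(xs, ys, degree=2, lr=0.1, n_steps=2000):
    """Fit y ~ sum_k coeffs[k] * x**k by gradient descent on the mean squared error."""
    coeffs = [0.0] * (degree + 1)           # the parameters: the only values that get updated
    n = len(xs)
    for _ in range(n_steps):
        grads = [0.0] * (degree + 1)
        for x, y in zip(xs, ys):            # xs and ys are constants: read, never updated
            err = sum(c * x**k for k, c in enumerate(coeffs)) - y
            for k in range(degree + 1):
                grads[k] += 2 * err * x**k / n   # d(MSE)/d(coeffs[k]), derived by hand
        coeffs = [c - lr * g for c, g in zip(coeffs, grads)]  # the "update" step
    return coeffs


xs = [i / 50 - 1 for i in range(101)]       # 101 evenly spaced points in [-1, 1]
ys = [3 * x**2 + 2 * x + 1 for x in xs]     # noise-free target, chosen for illustration
print([round(c, 3) for c in fit_polynomial(xs, ys)])
```

The gradient lines are what autograd would compute automatically for the parameters; nothing is ever computed with respect to xs or ys.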


(Dien Hoa TRUONG) #626

Can someone suggest a tutorial for developing a web app, please? :smiley: I don’t clearly understand the instructions I found on the Starlette website. Thank you in advance.


(Harold) #627

I’m also having trouble installing Starlette from their instructions:

$ pip3 install starlette
Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.HTTPConnection object at 0x10eeb0630>, 'Connection to 52.39.238.16 timed out. (connect timeout=20.0)')': /repository/pypi-all/simple/starlette/
 Could not find a version that satisfies the requirement starlette (from versions: )
No matching distribution found for starlette

I would suggest using a different web framework with better documentation, maybe Flask.

Here’s a tutorial.


(Michal Wawrzyniuk) #628

I’m not sure which part of the web app you would like to understand better, but there are more tutorials about Flask, which is fairly similar to Starlette.
You can try those.

M


(Dien Hoa TRUONG) #629

Thank you @Michal_w, @astronomy88. I am switching to Flask for now. Maybe I will come back to Starlette later, but at the moment it’s hard to understand from their instructions.


(Mahesh Khatri) #630

Yes.

Thanks.


(Lankinen) #631

Are there any web app tutorials written by students? They might be easier to understand if they use fastai.