Lesson 1 In-Class Discussion ✅

Is there any example available of using a custom head on top of ResNet50 with ConvLearner? I couldn’t find one in the docs. I have roughly 10M images, and ResNet has reached saturation, so I want to increase the number of parameters. Also, as far as the docs say, there’s support for Darknet and Wide ResNet, as well as U-Net, which may not serve the purpose here.

I agree with the neural net here: there is no way this is not a beagle :slight_smile:
Does anyone know of any systematic ways to figure out which data are mislabeled?

Yes, try this:

interp = ClassificationInterpretation.from_learner(learn)
# list the most confused pairs as (actual, predicted, count), keeping counts >= 2
interp.most_confused(min_val=2)
# or plot the full confusion matrix:
interp.plot_confusion_matrix(figsize=(17,12), dpi=60)
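
Relatedly, plot_top_losses on the same interp object is handy here too; the highest-loss images are often the mislabeled ones (the 9 and figsize are just example values):

interp.plot_top_losses(9, figsize=(15,11))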

I’m also getting the same error. Have you found a solution yet?

Hi team,

After going through lesson 1, I tried to train a model with a different image dataset (http://ai.stanford.edu/~acoates/stl10/stl10_binary.tar.gz), but I am running into an issue with the untar_data function:

KeyError: 'content-length'

Also, with another dataset (http://www.cs.utoronto.ca/~kriz/cifar-10-python.tar.gz), I get the error below from untar_data:

OSError: Not a gzipped file (b'<!')

Any help would be appreciated.
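
For what it’s worth, untar_data appends “.tgz” to the URL it is given (at least in the fastai version used in this course), so passing a full …tar.gz URL makes it fetch a nonexistent file; the server’s HTML error page is likely what produces both errors above (b'<!' is the start of an HTML page). A workaround sketch using only the standard library (URL taken from the question):

import tarfile
import urllib.request

url = 'http://www.cs.utoronto.ca/~kriz/cifar-10-python.tar.gz'
fname, _ = urllib.request.urlretrieve(url, 'cifar-10-python.tar.gz')
with tarfile.open(fname, 'r:gz') as tar:
    tar.extractall('data')  # extracts under ./data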

In a conda environment:
conda install nb_conda_kernels
resolved this for me, @pandeyanil. I then saw all my kernels.

It seems that you’ve put too much data in your GPU RAM.

To check whether that’s the case, run the bash command nvidia-smi (or gpustat, which I like to use) before and after the operation causing the CUDA memory error.
You can call it directly from your Jupyter notebook with !nvidia-smi (with “!” you can run bash commands in a Jupyter notebook).

Also restart the notebook kernel (shortcut Esc+00), and if that does not help, restart your entire system.

If memory is indeed the issue, try decreasing the batch size (bs) and/or the image size.
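
For example (a minimal sketch, assuming an ImageDataBunch pipeline as in the lesson; the exact values are just a starting point):

data = ImageDataBunch.from_folder(path, ds_tfms=get_transforms(), size=128, bs=16)  # smaller size/bs -> less GPU RAM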

I hope this helps. :slight_smile:

@prajjwal I think Jeremy will cover that in the coming lessons. However, if you want to give it a shot, refer to this link in the documentation

Is there a way to quickly spot possible label errors in the validation set? (And similarly in the training data, though that should be less of a problem.) Mislabeled validation data are more likely to be misclassified by any deep network that predicts them, so maybe one could look at the intersection of the wrong predictions of different models, especially the intersection of confident predictions that are wrong. Does anyone have links to such literature, blog posts, or past fastai threads?
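
A sketch of that intersection idea (nothing fastai-specific; the function name and the 0.9 confidence threshold are my own assumptions):

import numpy as np

def suspect_labels(all_probs, targets, conf_thresh=0.9):
    """all_probs: list of (n_samples, n_classes) probability arrays, one per model."""
    suspects = np.ones(len(targets), dtype=bool)
    for probs in all_probs:
        preds = probs.argmax(axis=1)
        confident_wrong = (preds != targets) & (probs.max(axis=1) > conf_thresh)
        suspects &= confident_wrong  # keep only samples every model gets confidently wrong
    return np.where(suspects)[0]  # indices worth inspecting by hand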

This helps. It worked when I changed the image size. Thanks a lot.

Can you give a brief explanation of what it’s trying to do?

Have a look at this question which focuses on semantic segmentation.

So guys, Jeremy has mentioned issues with images being rectangular rather than square. What exactly is the issue if the images have an X × Y resolution rather than X × X?

Hi Daniele,

Usually the first parameter of DataBunch constructors (“path”) refers to the DataBunch root folder.

Example:
data = ImageDataBunch.from_lists(path=pathToRoot, fnames=files, labels=labels, valid_pct=0.2, test='test', ds_tfms=tfms, bs=32)

In all the sample notebooks, the path folder contains both the training data and the saved models (in the “models” subfolder).
For example:

  • learn.save('last') will write the file “last.pth” inside the “models” subfolder in path
  • learn.load('last') will read it

NB: the fast.ai samples usually have this “root folder” inside ~/.fastai/data
(e.g. ~/.fastai/data/mnist_sample)

Link to working sample:
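
Separately, a small sketch tying the above together (fastai v1; the dataset and model choices are just for illustration):

path = untar_data(URLs.MNIST_SAMPLE)  # lands in ~/.fastai/data/mnist_sample
data = ImageDataBunch.from_folder(path)
learn = ConvLearner(data, models.resnet18, metrics=accuracy)
learn.save('last')  # writes ~/.fastai/data/mnist_sample/models/last.pth
learn.load('last')  # reads the same file back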

I did; that’s why I asked for an example that’s proven to work better when added as a head.

Sure! In the first lesson the architecture (the model parameter) was loaded from a prebuilt one:

learn = ConvLearner(data, models.resnet50, metrics=error_rate)

But you can define your own using fast.ai (https://docs.fast.ai/layers.html#layers) in a way very similar to Keras:

import torch.nn as nn
from fastai.layers import Flatten

model = nn.Sequential(
    nn.Conv2d(3,  16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    Flatten(),  # (bs, 10, 1, 1) -> (bs, 10), so the output fits a classification loss
)

From what I understand, the fastai library wraps and simplifies PyTorch much as Keras wraps and simplifies TensorFlow.

That said, I think a pretrained ResNet34 is a good starting point for the bioimaging dataset you’ve proposed.
Of course, you have to unfreeze almost all layers (as a starting point I would probably keep the first 3 to 15 layers frozen) and train with A LOT of images, normalizing on the new dataset rather than relying on the ImageNet normalization.
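
A minimal sketch of that recipe (using the ConvLearner API from the lesson; the hyperparameters are placeholders):

data = ImageDataBunch.from_folder(path, ds_tfms=get_transforms(), size=224, bs=32)
data.normalize()  # with no stats passed, fastai computes them from a batch of this dataset
learn = ConvLearner(data, models.resnet34, metrics=error_rate)
learn.unfreeze()  # unfreezes everything; keeping early layers frozen needs per-group handling
learn.fit_one_cycle(4, max_lr=slice(1e-5, 1e-3))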

I’m playing with some data from Kaggle whose evaluation is based on the macro F1 score. Do we have an F1 score metric in fastai? I’ve tried f1_score from sklearn.metrics, but it doesn’t seem to work.

Sure!

http://docs.fast.ai/metrics.html#Fbeta
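
For example (a sketch based on that docs page: fastai’s fbeta with beta=1 gives F1 for multi-label targets, though it’s worth checking that its averaging matches Kaggle’s macro F1):

from functools import partial
from fastai.metrics import fbeta

f1 = partial(fbeta, beta=1)  # fbeta defaults to beta=2 and thresh=0.2
learn = ConvLearner(data, models.resnet50, metrics=[error_rate, f1])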

Thank you!

I downloaded images for my dataset from Google Images following @lesscomfortable 's wonderful Jupyter notebook. I am using AWS SageMaker as my platform. I got a 66% error rate, which is bad :frowning: . On looking at the images (show_batch), I found some irrelevant ones. How do I remove irrelevant images from my dataset?

I have 6 classes with 300 images in each class. Should I open each image and verify whether it is relevant? Is it possible to download the dataset from SageMaker to my laptop, so that I can quickly delete irrelevant images?
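
For what it’s worth, a sketch of one approach (verify_images from fastai.vision only deletes files that can’t be opened, so truly irrelevant images still need eyeballing; the paths here are assumptions):

from pathlib import Path
import shutil
from fastai.vision import verify_images

path = Path('data/my_dataset')  # assumed layout: one subfolder per class
for cls_folder in path.iterdir():
    if cls_folder.is_dir():
        verify_images(cls_folder, delete=True)  # removes corrupt/unopenable files
# pack everything up so it can be downloaded from the notebook UI for manual review
shutil.make_archive('my_dataset', 'gztar', root_dir=path)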
