Lesson 2 discussion - beginner

Dumb question: at around 1:35:00 you say that most modern architectures can handle images of various sizes.

I was under the impression that in order to feed forward an image through a network, it has to be of a consistent size. In other words, whatever architecture you have, the number of inputs has to match the number of pixels.

So if you have a CNN with 400 input nodes, then your images can only be 20x20 (or some other dimensions that multiply to 400). Am I missing something?

How can you create one architecture that accepts images of varying sizes?
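For what it's worth, here's a minimal numpy sketch (not any specific framework) of one common answer: the convolutional layers slide over whatever spatial size they're given, and a global average pool collapses the feature map to a fixed-length vector before the fully connected classifier, so the classifier never depends on the image's height or width.

```python
import numpy as np

# Toy illustration: global average pooling turns a (channels, H, W) feature
# map into a fixed-length (channels,) vector, regardless of H and W.
def global_avg_pool(feature_map):
    return feature_map.mean(axis=(1, 2))

small = np.random.randn(64, 7, 7)     # e.g. features from a smaller image
large = np.random.randn(64, 10, 10)   # e.g. features from a larger image

print(global_avg_pool(small).shape)   # (64,)
print(global_avg_pool(large).shape)   # (64,)
```

Either way, the fully connected layer always sees a 64-dimensional vector.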

Hi everyone, I tried running the code from the dog breeds competition on another Kaggle competition, but when I try fitting the model I get this weird error:

I tried reducing the batch size but to no avail.

I got the same:
IndexError: arrays used as indices must be of integer (or boolean) type

But there's no tmp folder. I set 'val_idxs' from a CSV file.
Can anyone suggest how to fix it?

My label column is formatted as ['36', '19', '66', '153', '164', '76', '42']. I am not sure what the correct format for the label column is. Do I need to format it like our CSV file from planet (e.g. 'haze primary')?
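In case it helps others debug: that IndexError usually means the array used for indexing (e.g. val_idxs parsed from a CSV) came back as strings or floats rather than integers. A minimal reproduction in plain numpy (the filenames and index values here are made up):

```python
import numpy as np

# Hypothetical reproduction: indices read from a CSV often arrive as strings.
filenames = np.array(['a.jpg', 'b.jpg', 'c.jpg', 'd.jpg'])
val_idxs = np.array(['1', '3'])        # parsed as strings, not ints

try:
    filenames[val_idxs]                # IndexError: arrays used as indices
except IndexError as e:                # must be of integer (or boolean) type
    print(e)

val_idxs = val_idxs.astype(int)        # cast before indexing
print(filenames[val_idxs])             # ['b.jpg' 'd.jpg']
```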

thank you!

I was trying to reproduce the dog breed identification challenge.
I am using an AWS p2.xlarge instance, and downloaded the dataset from Kaggle.

I used learn.TTA() to predict the labels of the validation set.
learn.TTA() returns 2 values: i) log_preds (log probabilities), ii) true labels.

Strangely, the shape of log_preds is (5 x 2044 x 120). This produces an error when metrics.log_loss() is calculated. The expected shape is (2044 x 120), because there are 2044 samples in the validation set and the model predicts probabilities for 120 classes.

The URL of my gist code is : https://gist.github.com/gnavink/c9baf208e12d246b53288f5181e0b695

Can anybody suggest what could be going wrong?

Kindly have a look at the output of cells 39 & 41 at the
URL: https://gist.github.com/gnavink/c9baf208e12d246b53288f5181e0b695

After doing learn.TTA(), one needs to compute the mean to average over all the augmented images. After adding this step after learn.TTA(), I get the expected results.
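For anyone hitting the same shape mismatch, a sketch of the averaging step with dummy data (shapes taken from the post; learn.TTA() is assumed to return per-augmentation log probabilities, so we exponentiate before averaging):

```python
import numpy as np

# Stand-in for log_preds from learn.TTA():
# 5 TTA passes x 2044 validation samples x 120 classes
log_preds = np.random.randn(5, 2044, 120)

# Average the probabilities over the 5 augmented passes (axis 0),
# leaving the (samples, classes) shape that metrics.log_loss() expects.
probs = np.exp(log_preds).mean(axis=0)
print(probs.shape)   # (2044, 120)
```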

thanks

In the notebook “lesson1-breeds.ipynb”, just below the “Increase size” heading,
larger images (sz = 299) start to be used.

learn.summary() shows the first layer to be Conv2d with input shape [-1, 3, 224, 224].

How is (299 x 299) image data being fed into a (224 x 224) convolution?

So far I have been unable to understand the code well enough to answer the question. Thanks.
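A convolution layer is defined by its kernel, stride and padding, not by the image size, so the same weights slide over a 224x224 or a 299x299 input; the 224 in learn.summary() is just the probe size used to print the summary. A quick sanity check with the standard output-size formula (the kernel/stride/padding values below assume a ResNet-style 7x7 first conv):

```python
# Output height/width of a conv layer: floor((n + 2*pad - k) / stride) + 1
def conv2d_out_hw(h, w, k=7, stride=2, pad=3):
    return ((h + 2 * pad - k) // stride + 1,
            (w + 2 * pad - k) // stride + 1)

print(conv2d_out_hw(224, 224))   # (112, 112)
print(conv2d_out_hw(299, 299))   # (150, 150)
```

Both inputs are valid; they just produce feature maps of different spatial sizes.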

Hi @jeremy,
I am using an AWS p2.xlarge instance.
In lesson 2, section 7.2, when I run the
learn.fit(lr, 3, cycle_len=1, cycle_mult=2) statement, it takes a long time, sometimes more than 40 minutes.
May I know the normal running time for this statement in lesson 2 with a GPU?
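One thing worth checking: with SGDR-style restarts, that call runs more epochs than the 3 suggests, assuming each cycle's length grows by cycle_mult:

```python
# learn.fit(lr, 3, cycle_len=1, cycle_mult=2) runs cycles of 1, 2 and 4
# epochs, i.e. 7 epochs total, so it naturally takes much longer than a
# plain 3-epoch fit.
def total_epochs(n_cycles, cycle_len=1, cycle_mult=2):
    return sum(cycle_len * cycle_mult ** i for i in range(n_cycles))

print(total_epochs(3))   # 7
```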

Can anyone provide a snippet explaining how to unfreeze the first layers of a model and freeze the rest? I have a network where I am going to train only the first few layers.
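Not a fastai-specific answer, but in plain PyTorch you can freeze/unfreeze by toggling requires_grad on parameters (the model and the layer count below are made-up placeholders):

```python
import torch.nn as nn

# Keep only the first n child modules trainable, freeze everything after.
def train_only_first(model, n_trainable):
    for i, child in enumerate(model.children()):
        for p in child.parameters():
            p.requires_grad = i < n_trainable

model = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 8), nn.Linear(8, 2))
train_only_first(model, 1)
print([p.requires_grad for p in model.parameters()])
# [True, True, False, False, False, False]  (weight + bias of layer 0 only)
```

Then pass only the trainable parameters to the optimizer, e.g. `filter(lambda p: p.requires_grad, model.parameters())`.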

In my opinion, this is the best and simplest way to illustrate data augmentation and model inference. Passing on knowledge is a gift.

Are you using Google Colab? If so, folks have written guides on Medium about it.