Maybe I'm just looking in the wrong folder? Can you point me to the source code/file, please? Am I looking in the right place (see link below)?
https://sourcegraph.com/github.com/fastai/fastai@master/-/tree/fastai/models
Adam, like other optimizers with adaptive learning rates, takes a predefined learning rate and adapts it along the way to find the minimum. The responsibility is still on our shoulders to feed the algorithm a reasonable learning rate: too high and it diverges and never reaches the minimum; too low and it gets there very slowly, wasting a lot of computation. A common approach is to manually try different learning rates (most often starting from a faster rate and working down) based on feedback from the loss curve. The approach Jeremy describes is much simpler to execute and saves a lot of compute/human time while staying very simple. So yes, we are following a novel and simple approach to find an optimal learning rate to feed into our optimizer (it can be any optimizer of your choice, not just Adam).
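The idea can be sketched in plain Python (an illustrative sketch, not fastai's actual implementation; `loss_at_lr` is a hypothetical stand-in for "train one mini-batch at this learning rate and return the loss"):

```python
def lr_range_test(loss_at_lr, start_lr=1e-5, end_lr=10, steps=100):
    # Sweep the learning rate geometrically from start_lr to end_lr,
    # recording the loss at each step, and stop once the loss blows up.
    lrs, losses = [], []
    mult = (end_lr / start_lr) ** (1 / (steps - 1))
    lr, best = start_lr, float('inf')
    for _ in range(steps):
        loss = loss_at_lr(lr)
        lrs.append(lr)
        losses.append(loss)
        best = min(best, loss)
        if loss > 4 * best:
            break  # loss is diverging; stop the sweep
        lr *= mult
    return lrs, losses
```

You would then pick a learning rate a bit below the point where the loss curve is falling fastest, as described in the lesson.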
@jeremy One doubt that I have is: during training, are we still using Adam as the optimizer, are we using stochastic gradient descent with restarts, or is it Adam with restarts? And can the library apply restarts to any optimization algorithm?
The first time it is run, the network takes the pre-trained weights from the models/weights folder (if you are not using the fastai AMI, you need to download those weights from http://files.fast.ai/models/weights.tgz and keep them in that folder) and computes activations for your dataset using those weights. I guess the time delay is because of this.
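The compute-once-then-cache pattern might look roughly like this (a hypothetical sketch, not the actual fastai code; `forward_fn` stands in for a forward pass up to the penultimate layer):

```python
import os
import pickle

def precompute_activations(forward_fn, inputs, cache_path):
    # First run: compute activations for every input and cache them to disk.
    # Later runs just load the cache, which is why only the first run is slow.
    if os.path.exists(cache_path):
        with open(cache_path, 'rb') as f:
            return pickle.load(f)
    acts = [forward_fn(x) for x in inputs]
    with open(cache_path, 'wb') as f:
        pickle.dump(acts, f)
    return acts
```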
def lr_find(self, start_lr=1e-5, end_lr=10, wds=None):
    self.save('tmp')
    ...
    self.load('tmp')
We're computing the activations of the penultimate layer for our dogs v cats dataset - that's what takes time. (We don't download activations from the internet - they are calculated; we download weights from the internet.)
Thanks!
No, it's just how pytorch (and therefore fastai) works. There's no need to one-hot encode labels if you have just one label per image. (In general, you shouldn't expect any overlap in details of the software libraries between v1 and v2 of the course - they are totally different.)
Strictly speaking, I should say we're jumping out of saddle points - that is, areas which are quite flat and are minima in at least some dimensions. In these areas, training becomes very slow, so it can have a similar impact to being in a local minimum.
We'll be learning about that later.
When I looked in the folder with ls, I saw that all the filenames ended in '.jpg'. Also, that's standard for nearly all photo image files.
Best to think of Adam as a type of SGD. SGD with restarts can be applied to pretty much any SGD variant, including Adam. So yes, we're adding SGDR on top of Adam.
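For intuition, the SGDR schedule itself is just a cosine-annealed learning rate that resets at the start of each cycle; here is a minimal sketch of the schedule (illustrative only, not the fastai implementation):

```python
import math

def sgdr_lr(t, lr_max=1e-3, lr_min=0.0, cycle_len=100):
    # Cosine annealing with warm restarts: within each cycle the LR decays
    # from lr_max to lr_min along a cosine curve, then jumps ("restarts")
    # back to lr_max at the start of the next cycle.
    t_cur = t % cycle_len
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t_cur / cycle_len))
```

The same schedule can drive any optimizer's learning rate, which is why restarts compose with Adam (or any other SGD variant).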
link not working
Here is an example from the fish.ipynb notebook (inside the dl1 directory). In this Kaggle competition the important part of the image is a fish. The fish were often not in the middle of the image, so the cropped image missed the most important information.
sz = 350
tfms = tfms_from_model(resnet34, sz, crop_type=CropType.NO)  # resize without center cropping
data = ImageClassifierData.from_csv(PATH, "images", csv_fname, bs, tfms, val_idxs)
thanks @Moody for bringing this up.
@yinterian could you clarify: when we set sz, is the default resize function a center crop? And to avoid this, can we just follow your provided code and pass in crop_type=CropType.NO? Or is this only for augmentation/transforms?
I have a related question on the sz parameter to tfms_from_model. Can we provide a height x width size like (400 x 250), or does it need to be square? Digging into the code a little, it looks like it only expects one value for the sz parameter and resizes to a square image. Anyway, could the size input be accepted as an int or a tuple of (h, w)? Or was that found to be not that useful?
Based on what @Moody mentioned, when cropping is used to convert the images to a square shape and data augmentation is used, does 1) cropping happen first and then augmentation, or 2) data augmentation of the original image and then cropping? It seems like option 2 may retain more data, but option 1 is more computationally efficient.
It needs to be square. It's a limitation of how GPUs are programmed nowadays, but I suspect at some point people will handle rectangular inputs too. For now, no library can do this (AFAIK).
Data augmentation happens before cropping, for the reason you mention (retains more data).
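The ordering can be illustrated with a toy "image" as a nested list (purely illustrative; hflip and center_crop are hypothetical helpers, not fastai functions):

```python
def hflip(img):
    # Horizontal flip augmentation: reverse each row of the "image".
    return [row[::-1] for row in img]

def center_crop(img, sz):
    # Take the central sz x sz window.
    top = (len(img) - sz) // 2
    left = (len(img[0]) - sz) // 2
    return [row[left:left + sz] for row in img[top:top + sz]]

# Augment first, then crop: the flip sees the whole image, so no border
# pixels are discarded before the augmentation runs.
img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
out = center_crop(hflip(img), 2)
```

With augmentations like rotations or shifts, cropping last means the augmented view can still draw on pixels outside the final crop window, which is the "retains more data" point above.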
OK… The reason I ask is, my images are stock photos with an aspect ratio of 1.5:1 (height to width). I will try the square image. But Torch's transforms do look like they take rectangular values - http://pytorch.org/docs/master/torchvision/transforms.html#torchvision.transforms.Resize
I can create a pull request if you think it might be a useful feature to add.
I'm also interested in having this feature, as long as it's technically feasible… in my case I'm working with images that are ratio 2:1.
fwiw, I know some people in the recently completed Carvana Kaggle competition seemed to be using varying rectangular shaped images since that dataset contained images that were all ratio 1.5:1