Regarding the instruction to read and understand the #Click me cell of Chapter 1, these are my thoughts and questions. As will be obvious, I’m a novice programmer.
from fastai2.vision.all import *
Import everything (classes, libraries, etc.) from the fastai vision library
path = untar_data(URLs.PETS)/'images'
I had a misunderstanding about this one. I thought untar_data(URLs.PETS) was downloading the URLs of the pet images, possibly because I’m predisposed to think of downloading URLs for the classifier for Lesson 1 from v3, but also because it’s URLs plural, not URL. So I checked the docs, and it turns out there’s a URLs class we’re using, and PETS is one of its attributes (a constant holding the download address of the dataset, not a method). There are similar URLs attributes for other datasets, but only the fastai ones. This approach doesn’t generalize to non-fastai datasets (but we’ll be learning other approaches that do generalize!).
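To make the URLs idea concrete, here is a minimal sketch of a class that does nothing but group dataset addresses as string constants. As far as I can tell this is the pattern fastai follows, but the class name DatasetURLs, the attribute DIGITS, and the example.com addresses are all made up for illustration and are not fastai's real values.

```python
# Hypothetical stand-in for a class that only groups dataset locations.
# The names and the example.com addresses are invented for illustration.
class DatasetURLs:
    BASE = "https://example.com/datasets/"
    PETS = BASE + "pets.tgz"
    DIGITS = BASE + "digits.tgz"

# No parentheses after PETS: it is looked up like a constant, not called.
print(DatasetURLs.PETS)        # https://example.com/datasets/pets.tgz
print(type(DatasetURLs.PETS))  # <class 'str'>
```

Seen this way, the plural "URLs" is just the name of the container, and each attribute inside it is one address that a download function can be handed.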
So the dataset is extracted, and the location of the extracted dataset is returned to path. But what does the /'images' at the end do? I searched the forum and found the notes from Lesson 3 of v3, and if I’m extrapolating correctly, I think the pets dataset has a folder named ‘images’, and we’re telling the path to point specifically to that folder, rather than to the dataset folder as a whole. Is that right?
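The / here is the path-joining operator that Python's standard pathlib.Path objects support, which fits the "point at the images subfolder" reading. A small sketch of that behaviour, assuming a made-up dataset location (the real one is whatever untar_data returns on your machine):

```python
from pathlib import Path

# Hypothetical location; the real one is whatever untar_data returns.
dataset_root = Path("data") / "oxford-iiit-pet"

# Path / "name" builds a new path one level deeper; nothing is divided.
images_dir = dataset_root / "images"

print(images_dir.as_posix())              # data/oxford-iiit-pet/images
print(images_dir.name)                    # images
print(images_dir.parent == dataset_root)  # True
```

Joining does not check that the folder exists; it only builds the longer path.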
def is_cat(x): return x[0].isupper()
Define a function is_cat to which we pass x, the filename of each pet image. A characteristic of this particular dataset is that the first character of the filename is uppercase if the file is an image of a cat, so is_cat returns True if the first character of x is uppercase, and False otherwise.
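A quick way to check that reading is to call the function on a couple of names. The filenames below are made up to follow the dataset's naming pattern (capitalized for cats, lowercase for dogs):

```python
def is_cat(x):
    # True when the first character of the filename is uppercase
    return x[0].isupper()

# Made-up filenames following the dataset's naming pattern
print(is_cat("Birman_12.jpg"))  # True
print(is_cat("beagle_7.jpg"))   # False
```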
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))
I have questions about this one, too. The book says:
"The fourth line tells fastai what kind of dataset we have, and how it is structured. There are various different classes for different kinds of deep learning dataset and problem–here we’re using ImageDataLoaders. The first part of the class name will generally be the type of data you have, such as image, or text. The second part will generally be the type of problem you are solving, such as classification, or regression."
What is “the second part of the class name” that is “the type of problem you are solving…”? We’re doing classification, but it’s not obvious to me where that’s declared in the class name.
Then we’re using the from_name_func method of the ImageDataLoaders class, which creates our DataLoaders (dls as we’re calling them here), setting aside 20% of our data as the validation set, setting the optional seed value to 42, setting the labelling function to be our is_cat function defined above, and selecting Resize(224) as the transformation to be applied to the images, resizing them all to 224x224 pixels for historical reasons.
But why is the seed set to 42? The book and the docs say it’s for reproducibility, and I understand that getting the same validation set every time is what gives us reproducible results, but what is a seed, and how does it achieve a reproducible validation set? I Googled “reproducibility seed” and found this post helpful:
“The “seed” is a starting point for the sequence and the guarantee is that if you start from the same seed you will get the same sequence of numbers.”
But if the elements of the validation set are chosen randomly, how does starting from the same point help? And why 42? Is there a practical consideration at work, or is it just Douglas Adams?
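One way to see how a seed helps: the "random" choices come from a deterministic number generator, so starting it from the same seed replays exactly the same shuffle, and therefore the same validation set. Here is a minimal sketch using Python's standard random module. It is not fastai's actual splitting code, and split_indices is a hypothetical helper written only for this illustration.

```python
import random

def split_indices(n_items, valid_pct, seed):
    # A generator started from a given seed always yields the same sequence.
    rng = random.Random(seed)
    indices = list(range(n_items))
    rng.shuffle(indices)
    n_valid = int(n_items * valid_pct)
    # The first n_valid shuffled indices become the validation set.
    return indices[n_valid:], indices[:n_valid]

train_a, valid_a = split_indices(10, 0.2, seed=42)
train_b, valid_b = split_indices(10, 0.2, seed=42)

print(valid_a == valid_b)  # True
```

In this sketch nothing is special about 42; any fixed integer would give a repeatable split.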
learn = cnn_learner(dls, resnet34, metrics=error_rate)
Use the cnn (convolutional neural network) learner, telling it to use the dls we established above, the ResNet34 architecture, and the error rate as a metric. Pretty straightforward for me.
learn.fine_tune(1)
Since we’re using a pretrained model, we don’t want to start fitting the model from scratch, as we would if we used learn.fit. Instead, we’ll fine-tune the model for our particular dataset for one epoch (a complete pass through the dataset) to create the head of our model, which is unique to this dataset. The book says:
“After calling fit, the results after each epoch are printed, showing the epoch number, the training and validation set losses (the “measure of performance” used for training the model), and any metrics you’ve requested (error rate, in this case).”
But it must mean “After calling fine_tune.”
I did have another hiccup, trying to use ?? to see the docs for methods, e.g. ??cnn_learner; I keep getting an error “Object cnn_learner not found.” Other shortcuts such as b to create a new cell are working for me, so I’m not sure what I’m doing wrong with this one.
And that’s the lot! Thanks for reading all of this, and please let me know if you can answer any of my questions, or if I’ve mischaracterized anything.