Part 2 Lesson 8 wiki

(Kaitlin Duck Sherwood) #209

This is mostly a language question. To find the bounding boxes, pascal.ipynb uses an ImageClassifierData with continuous=True.

Um, I thought that if you had discrete values, the term was “Classification” and if the values were continuous, the term was “Regression”. Am I missing a subtlety here?

(I went looking through and couldn’t find ImageRegressionFoo, so had to go look at pascal.ipynb, which I was trying not to do.)

(Suvash Thapaliya) #210

Here’s my attempt at replicating Lec 8 on the Pascal VOC 2012 dataset. (Not the whole notebook, only Lec 8)

I have dealt with json data loading/preparation a bit differently compared to the lesson. Also, some in-notebook variable and function names are updated to help my understanding. (I got lazy towards the end though)

(Suvash Thapaliya) #211

Also, I’m a bit curious as to where the number 25088 comes from in the custom head linear layer addition.

nn.Sequential(Flatten(), nn.Linear(25088,4))

I’m assuming that one would first inspect the architecture, if it was based on a standard architecture, or know this beforehand because they assembled it themselves. Is there any other intuition I’m missing?

(Alex) #212

I think it’s because ResNet’s final layer output is 512x7x7, and 512 * 7 * 7 = 25088.
But for some reason it does not work on ResNeXt, even though it is also 512x7x7.


There was a reference to using the fastai CPU version to browse the source code on the local machine. Can someone please post how to do the same? Thanks!

(Nikhil B ) #214

This number is 512 * 7 * 7, from the last layers of the model architecture, before flattening. Instead of pooling we have a flatten layer with an output vector of 25088, and that is fully connected to the 4 output activations.

Still trying to better imagine how the 512x7x7 values map to the linear 25088. Not doing pooling makes sense; I guess this makes sure the relative positions for the bbox are not lost in the model.

OrderedDict([('input_shape', [-1, 512, 7, 7]),
             ('output_shape', [-1, 512, 7, 7]),
             ('nb_params', 0)]),
OrderedDict([('input_shape', [-1, 512, 7, 7]),
             ('output_shape', [-1, 25088]),
             ('nb_params', 0)]),
OrderedDict([('input_shape', [-1, 25088]),
             ('output_shape', [-1, 4]),
             ('trainable', True),
             ('nb_params', 100356)])
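The arithmetic can be sanity-checked without fastai. This is a minimal NumPy sketch (my own illustration, not the library code) of what Flatten followed by Linear(25088, 4) does shape-wise; the parameter count matches the nb_params in the summary above:

```python
import numpy as np

# one batch element coming out of the ResNet body: 512 feature maps of 7x7
x = np.zeros((1, 512, 7, 7))

# Flatten keeps the batch dimension and collapses the rest: 512*7*7 = 25088
flat = x.reshape(x.shape[0], -1)
print(flat.shape)  # (1, 25088)

# Linear(25088, 4): a 25088x4 weight matrix plus one bias per output activation
n_params = 25088 * 4 + 4
print(n_params)  # 100356, matching nb_params in the summary
```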

(Suvash Thapaliya) #215

Sure. I can see that when inspecting with learn.summary(). I was only wondering if there was anything more to it, other than fully connecting the layers to the desired number of outputs.

(Jeremy Howard (Admin)) #216

There isn’t! :slight_smile:

(Jeremy Howard (Admin)) #217

Great job!

(Suvash Thapaliya) #218

A simple git clone locally should be good enough for browsing the fastai source. You’ll only need to install other libs locally, if you want to jump/explore further into the source of libraries that fastai depends on.

(Kaitlin Duck Sherwood) #219

As @suvash said, if you just want to browse the code on your GPU-challenged machine, you can just do a git clone. If you want to actually run code slowly on a CPU, do this:

conda env update -f environment-cpu.yml


(radek) #220

I think we are just being sneaky here and using ImageClassifierData to give us dataloaders that are different from what this class was originally intended to do :slight_smile:

When you call it with ImageClassifierData.from_paths it does what it says it would do - return dataloaders fetching images with their corresponding classes.

ImageClassifierData.from_csv could be used in a similar fashion (where the 2nd column of the read-in csv would be the class id), but here we sort of take all of this machinery and feed it a doctored column: a string of coords that gets split called on it deep in the internals, so instead of getting class indexes we get our 4 continuous floats per image.

Not trying to comment on the naming, just thought this background info might be of help to you.

(Suvash Thapaliya) #221

@Ducky There seem to be two parts to it.

  • Extracting the labels from the csv
  • Setting the continuous parameter to True which turns on the is_reg property that’s used later in the model when training etc…

Like @radek mentioned, the labels from the space-separated second column are extracted as follows. I haven’t bothered inspecting it properly, but I think this is where the extraction happens, in parse_csv_labels:

df.iloc[:,0].str.split(cat_separator)
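As a sanity check, here is a small self-contained pandas sketch of what that kind of .str.split call does. The column name and values are made up for illustration, not taken from the actual csv:

```python
import pandas as pd

# hypothetical label column: space-separated bbox coords, one string per image
df = pd.DataFrame({'bbox': ['96 155 269 350', '77 89 335 402']})

# the same .str.split trick: each string becomes a list of tokens,
# which can then be turned into the 4 continuous floats per image
split = df['bbox'].str.split(' ')
coords = [float(v) for v in split[0]]
print(coords)  # [96.0, 155.0, 269.0, 350.0]
```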


To continue playing detective and learn more about the library, I would like to add that I think is_reg is only used in the automated construction of the adaptive pooling head and for automatically picking the cost function:

def get_fc_layers(self):
    res=[]
    ni=self.nf
    for i,nf in enumerate(self.xtra_fc):
        res += self.create_fc_layer(ni, nf, p=self.ps[i], actn=nn.ReLU())
        ni=nf
    final_actn = nn.Sigmoid() if self.is_multi else nn.LogSoftmax()
    if self.is_reg: final_actn = None
    res += self.create_fc_layer(ni, self.c, p=self.ps[-1], actn=final_actn)
    return res

class ConvLearner(Learner):
    def __init__(self, data, models, precompute=False, **kwargs):
        self.precompute = False
        super().__init__(data, models, **kwargs)
        if hasattr(data, 'is_multi'):
            self.crit = F.binary_cross_entropy if data.is_multi else F.nll_loss
            if data.is_reg: self.crit = F.l1_loss
            elif self.metrics is None:
                self.metrics = [accuracy_thresh(0.5)] if data.is_multi else [accuracy]
        if precompute: self.save_fc1()
        self.freeze()
        self.precompute = precompute

So it is the data object that dictates what model we will get out of ConvLearner.pretrained (which in turn calls ConvnetBuilder), and which cost function the ConvLearner constructor will automatically pick for us.
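The branching in that snippet boils down to a few lines. Here is a plain-Python paraphrase of the selection logic (the strings stand in for the actual torch functions and modules; this is just a sketch, not library code):

```python
def pick_crit_and_actn(is_multi, is_reg):
    # multi-label -> binary cross-entropy with a Sigmoid head,
    # single-label classification -> NLL loss with LogSoftmax,
    # and is_reg (regression) overrides both: L1 loss, no final activation
    crit = 'binary_cross_entropy' if is_multi else 'nll_loss'
    actn = 'Sigmoid' if is_multi else 'LogSoftmax'
    if is_reg:
        crit, actn = 'l1_loss', None
    return crit, actn

print(pick_crit_and_actn(is_multi=False, is_reg=True))   # ('l1_loss', None)
print(pick_crit_and_actn(is_multi=True,  is_reg=False))  # ('binary_cross_entropy', 'Sigmoid')
```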

(Vikrant Behal) #223

I was playing with the nb and found that I got the right label (diningtable) instead of a chair. Probably their probabilities are too close?

(Vijay Narayanan Parakimeethal) #224

I was trying to use Visual Studio Code on Mac with a git clone of fastai. I also created the fastai-cpu environment. But after selecting the folder (the fastai folder) and the interpreter (fastai-cpu), I tried to find ‘open_image’ using Command+T on Mac. This opens what Visual Studio Code calls ‘Go To Symbol in Workspace’, but typing ‘open_image’ there shows no symbols found. Shift+Command+F (Find in All Files) works, but it shows all files that contain open_image. Am I missing something in Visual Studio Code, or is it just the way it is on Mac?

(Vikrant Behal) #225

Hello, I’m not a Mac user but there are instructions for Mac in the wiki. In case you’ve not seen them. :slight_smile:

(Vijay Narayanan Parakimeethal) #226

Thanks Vikrant! From what I could understand, they were meant for users who have PyCharm set up on their Mac rather than Visual Studio Code.

(Vikrant Behal) #227

Ah! Sorry for the confusion then.

(Jeremy Howard (Admin)) #228

Yeah this is the kind of refactoring and cleaning up which will happen later - but in part 2 of the course you get to see the “in process” development work before that kind of cleaning up happens.

(By the time this gets to the MOOC stage it’s likely that this particular issue will be fixed by moving stuff into a parent or sibling class.)