Code reading

Sourcegraph integrates so seamlessly into the GitHub UI that even I had trouble working out how to invoke it. First, make sure the extension is enabled (see chrome://extensions/).

Check the following:

  1. Do you see this icon called View Repository when you are in a GitHub repository? Clicking it takes you to Sourcegraph’s portal to view the repo:

https://sourcegraph.com/github.com/fastai/fastai@master

sg1

  2. While browsing code on GitHub, click on a function call. Do you then see this?
    sg2


I could do step 1, but I am a bit confused about step 2.

You mentioned that while browsing code on GitHub, I should click on a function call… This doesn’t work for me. GitHub doesn’t provide that option…

I tried clicking on the fit function call.

I guess it does not work with Jupyter notebooks. Try it on .py files.


You need to access Sourcegraph through the View Repository link, as mentioned above by @anandsaha in Step 1:

image

.ipynb files are internally JSON files, and I don’t think there’s much information you can extract out of viewing those files with Sourcegraph.

Once you have done step 1, you can visit the fastai library and explore what’s used where in the .py files (as mentioned in Step 2 above).

Hope this helps! :slight_smile:


Apologies for the late reply.
We had a flood alert in my city and there was no power for a few days. Sorry I couldn’t help.

Hope everything is okay!


Take care and no worries.


Is there an option to ask the model to take 20% of the data in a given folder (say, the train folder itself) as validation data when using the ImageClassifierData.from_paths() function?

I know that ImageClassifierData.from_csv() supports validation indices, but it would be awesome and make life a lot simpler if we had that option in ImageClassifierData.from_paths() too. It would make it possible to quickly put together a folder of images and start building a model.

Currently, as a workaround, I am using a small separate Python script that uses glob to move a random 20% of the images to a separate valid folder.
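
A workaround along these lines could look roughly like the sketch below. It is not the poster's actual script: the function name split_validation, the one-subfolder-per-class layout, the *.jpg pattern and the fixed seed are all assumptions made for illustration.

```python
import random
import shutil
from pathlib import Path


def split_validation(train_dir, valid_dir, val_pct=0.2, seed=42, pattern="*.jpg"):
    """Move a random val_pct of the images in each class folder of train_dir
    into the matching class folder under valid_dir. Returns the moved paths."""
    train_dir, valid_dir = Path(train_dir), Path(valid_dir)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    moved = []
    # Assumption: train_dir contains one subfolder per class.
    for class_dir in sorted(p for p in train_dir.iterdir() if p.is_dir()):
        files = sorted(class_dir.glob(pattern))
        n_val = int(len(files) * val_pct)
        dest = valid_dir / class_dir.name
        dest.mkdir(parents=True, exist_ok=True)
        for f in rng.sample(files, n_val):
            target = dest / f.name
            shutil.move(str(f), str(target))
            moved.append(target)
    return moved
```

Calling split_validation("data/train", "data/valid") (hypothetical paths) would leave about 80% of each class in train and move the rest to valid. Since files are moved rather than copied, it is worth trying on a copy of the data first.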


No, there’s currently no such option. But if anyone is interested in adding a val_idx param to this function, I’d be happy to merge it :slight_smile: . Of course, it shouldn’t be possible to set both the val folder and val_idx.

tfms stands for transformations :slight_smile:


Sometimes, when a model runs on AWS for a long time, the SSH session gets disconnected.
In that case, even after I reconnect to the notebook server (which is running inside tmux), I cannot see the progress of each iteration (losses and metrics).

Does the learn object retain a history of the metrics?
Is there a way to retrieve the historical loss values from the learn/model object?

Sure is! It’s inside learn.sched. E.g. plot_loss().


Hmm, this looks like it plots and keeps track of only the training loss. Is there anywhere I can specify that validation loss and validation accuracy should be tracked too?

Good point. No, we’re not storing that anywhere at the moment.
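
Until something like that exists in the library, one option is to keep your own log. The sketch below is a hypothetical helper, not part of fastai: the class name MetricLog, its methods and the column names are all made up for illustration. It appends one row per epoch to a CSV file, so the history also survives a dropped SSH session.

```python
import csv
from pathlib import Path


class MetricLog:
    """Hypothetical helper (not a fastai API): appends one row of metrics per
    epoch to a CSV file so the history can be read back later."""

    def __init__(self, path, fields=("epoch", "trn_loss", "val_loss", "val_acc")):
        self.path, self.fields = Path(path), tuple(fields)
        if not self.path.exists():
            # Write the header row once, when the log file is first created.
            with self.path.open("w", newline="") as f:
                csv.writer(f).writerow(self.fields)

    def record(self, **metrics):
        # Write values in the fixed column order; missing metrics stay blank.
        with self.path.open("a", newline="") as f:
            csv.writer(f).writerow([metrics.get(k, "") for k in self.fields])

    def history(self):
        # Read everything back as a list of dicts (values come back as strings).
        with self.path.open(newline="") as f:
            return list(csv.DictReader(f))
```

You would call record(epoch=..., trn_loss=..., val_loss=..., val_acc=...) yourself after each epoch, using whatever values your training loop reports.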

@jeremy, in the if statement in CosAnneal#calc_lr, are we first making the learning rate very small (init_lrs/100) for the first nb/20 iterations?

class CosAnneal(LR_Updater):
    def __init__(self, layer_opt, nb, on_cycle_end=None, cycle_mult=1):
        self.nb,self.on_cycle_end,self.cycle_mult = nb,on_cycle_end,cycle_mult
        super().__init__(layer_opt)

    def on_train_begin(self):
        self.cycle_iter,self.cycle_count=0,0
        super().on_train_begin()

    def calc_lr(self, init_lrs):
        if self.iteration<self.nb/20:
            self.cycle_iter += 1
            return init_lrs/100.

Is this the tiny flat line at the very start of this graph, beginning at iteration 0?

image

Is this the good ol’ trick from part 1 v1: first train with a super small lr to avoid settling into easy but sucky optima, and only then increase the learning rate to the value we actually want to train with?

Just wanted to confirm I am reading this right and not going crazy :slight_smile: Though not sure if reading this right actually precludes the second part of the statement from being true :slight_smile:
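
To see the shape of such a schedule, here is a standalone sketch for a single cycle. Only the warm-up branch follows the quoted excerpt; the cosine branch is the standard cosine-annealing formula and is an assumption here, since the quoted calc_lr is cut off before that part. The function name warmup_cosine_lr is made up.

```python
import math


def warmup_cosine_lr(iteration, nb, init_lr):
    """Learning rate at a given iteration, for one cycle of nb iterations."""
    if iteration < nb / 20:
        # Warm-up, as in the quoted excerpt: 1/100 of the initial rate.
        return init_lr / 100.0
    # Assumption: standard cosine annealing from init_lr down towards 0.
    return init_lr / 2 * (math.cos(math.pi * iteration / nb) + 1)


# With nb=100, iterations 0..4 form the short flat line at init_lr/100.
lrs = [warmup_cosine_lr(i, nb=100, init_lr=0.01) for i in range(100)]
print(lrs[:6])
```

With nb=100 the flat part lasts only 5 iterations, which would explain why it shows up as such a tiny segment on a plot.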

In Learner#TTA we have these lines:

        preds1 = [preds1]*math.ceil(n_aug/4)
        preds2 = [predict_with_targs(self.model, dl2)[0] for i in tqdm(range(n_aug), leave=False)] 
        return np.stack(preds1+preds2).mean(0), targs

If I am reading this right, the first line keeps the share of augmented predictions in the final average at no more than 80%? I would assume this is again one of the small best practices that fastai gives us out of the box?
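
The arithmetic behind that reading can be checked with a few lines. This is a sketch based only on the three quoted lines; the helper name tta_augmented_share is made up.

```python
import math


def tta_augmented_share(n_aug):
    """Fraction of the averaged predictions that come from augmented images."""
    n_orig = math.ceil(n_aug / 4)  # copies of the un-augmented prediction
    return n_aug / (n_orig + n_aug)


for n_aug in (1, 2, 4, 5, 8, 12):
    print(n_aug, round(tta_augmented_share(n_aug), 3))
```

Because ceil(n_aug/4) is never smaller than n_aug/4, the share is bounded by n_aug / (n_aug/4 + n_aug) = 0.8, and it reaches exactly 0.8 whenever n_aug is a multiple of 4.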


I confirm you’re reading it right, but have no opinion on whether you’re going crazy. For that, see a professional.


Right, although it’s not one I’ve tested that rigorously, frankly.


Just wondering… Is there a way to plot loss with respect to the learning rate when using gradient descent with restarts?

I am training resnet for a binary image classification task, using our standard approach, on a custom dataset of 800 images in total: 400 for training and 400 for validation, with 200 per class in each split.

I am using both learn.predict and learn.TTA. I have observed that the class probabilities do not sum to 1 (not even close to 1 in some cases) when I use learn.TTA(). Is this a bug?
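
One possible explanation, offered as an assumption rather than a confirmed diagnosis: if the values being averaged are log-probabilities, then exponentiating the average is not the same as averaging the probabilities, and the result need not sum to 1. The sketch below uses made-up numbers and plain Python, not fastai, to show the effect.

```python
import math

# Two made-up probability distributions for the same image
# (e.g. predictions for two different augmentations).
preds = [[0.9, 0.1],
         [0.2, 0.8]]
log_preds = [[math.log(p) for p in row] for row in preds]
n = len(log_preds)

# Average the log-probabilities per class, then exponentiate.
exp_of_mean = [math.exp(sum(col) / n) for col in zip(*log_preds)]
# Exponentiate first, then average the probabilities per class.
mean_of_exp = [sum(math.exp(v) for v in col) / n for col in zip(*log_preds)]

print(sum(exp_of_mean))  # noticeably below 1
print(sum(mean_of_exp))  # 1, up to floating-point rounding
```

If this is what is happening, converting each set of predictions to probabilities before averaging would give values that sum to 1.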
