Lesson 3 In-Class Discussion

learn.TTA is essentially learn.predict with n_aug augmentations and a mean of the results, as you rightly mentioned. The following three lines show this difference.

Edit: As @jeremy mentioned, targets are also returned!
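
Conceptually, test-time augmentation just averages the model's predictions over the original image plus several augmented copies. A minimal sketch of that idea (not the actual fastai source; model_predict and augment are hypothetical stand-ins):

import numpy as np

def tta_predict(model_predict, augment, images, n_aug=4):
    # Predict on the original batch plus n_aug randomly augmented copies,
    # then average the predictions across those runs.
    all_preds = [model_predict(images)]
    for _ in range(n_aug):
        all_preds.append(model_predict(augment(images)))
    return np.mean(all_preds, axis=0)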

3 Likes

I wonder why we mention preds, y for the test set also. I guess y will be an empty set, since there won't be any labels?

Mandatory validation set?

I was trying to improve my model by moving all files from the validation set to the training set.


So, should I copy the files instead of moving them?
If I'm not going to validate (via learn.TTA()) but want to improve my model by giving it more images, does copying impact it in any way, since the model will then have redundant images?

Is it possible to save the output of learn.TTA() for future use, the same way we save a model, e.g. learn.save('somefile')?

Got it. It’s zero!
image

Or, all we have is cats! :unamused: Biased model :smiley:

I always use this function to save my prediction arrays to bcolz (it works for any array)

import bcolz

def save_array(fname, arr): c = bcolz.carray(arr, rootdir=fname, mode='w'); c.flush()
Example:
save_array('preds.bc', preds)

6 Likes

If anybody is interested in more image processing / computer vision related problems: here are some from www.crowdai.org -
https://www.crowdai.org/challenges?challenge_filter=active

Didn't want to create a separate thread for this, so posting here…

If we train with the image size defined by sz, are the test images also resized to this sz, or do they retain their original size? I could see this possibly affecting predictions if, for example, I trained with a reduced size of 224 and then predicted on test images that were sized 400 or so.

Thanks for sharing. Curious if this is how you go about saving predictions from various models, for averaging them later?

Thank you. I found its counterpart in utils.py of part 1 v1.

def load_array(fname):
    return bcolz.open(fname)[:]
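
Usage mirrors the save_array example above, e.g. to read the earlier preds.bc back in:

preds = load_array('preds.bc')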

1 Like

Yep, that's usually what I save them for :slight_smile: Alternatively, you can also just save predictions to a CSV, then read them back in with pandas and pretty easily average them that way too.
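
As a sketch of that CSV route (file names are hypothetical, and probs1 / probs2 stand in for each model's probability array, with rows = test images and columns = classes):

import pandas as pd

# Save each model's class probabilities to its own CSV
pd.DataFrame(probs1).to_csv('model1_probs.csv', index=False)
pd.DataFrame(probs2).to_csv('model2_probs.csv', index=False)

# Read them back and average element-wise
dfs = [pd.read_csv(f) for f in ['model1_probs.csv', 'model2_probs.csv']]
avg_probs = sum(dfs) / len(dfs)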

2 Likes

I think I've heard someone ask a question about averaging in the past. Is it supposed to improve prediction accuracy or reduce loss?

Not sure exactly about those metrics, but I think averaging over various models helps the final resulting model generalize better.
My understanding is that each model learns different details in the training data, and the best way to bring all of them together, while also limiting the influence of any one specific model, is to average them.

I’m pretty sure @jamesrequa can answer this much better than me. Also, please correct me if I’m wrong here.

That's interesting. If I were to bring in a human analogy here, all of us may be biased in some sense, but merging all of our experience can produce a super-human who does better than the average human! :stuck_out_tongue:

I had by mistake typed 3 cycles instead of 1. Since each cycle takes 30 minutes, I'm wondering if I can interrupt the kernel and expect the learner to retain the knowledge from the 1st epoch.
image

Yes, you are right. It's similar to an ensemble effect, like the effect each tree in a random forest brings to the overall model. You could build multiple NN models, which would have had different initial weights to start with (due to randomness), and when you average all of them, the averaged model kind of smooths out the tiny errors made by each model, and most of the time you see improved losses / better accuracy.

1 Like

You can interrupt the kernel anytime by clicking this icon on the menu bar of the notebook.

Re-initialize the learner to be on the safer side and you should be good to go!


Double checking… we average the log_preds from various models?

I usually convert the log predictions to probabilities before taking the average.
probs = np.exp(log_preds)

Then if you had a few model probabilities:
avg = (prob1 + prob2 + prob3) / 3
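
More generally, with any number of models you can stack the probability arrays and take the mean along the model axis (assuming each prob array has shape (n_images, n_classes)):

import numpy as np

avg = np.mean(np.stack([prob1, prob2, prob3]), axis=0)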

2 Likes

Great! I found 41 differences between the resnet50 and resnext50 predictions!
~0.0011 improvement in log loss! Thank you! :slight_smile:

image

1 Like