Dog Breed Identification challenge

How long did it take to train?

Don't remember exactly, but something like ~1-2 hours to precompute activations, and then with this approach another 30-60 minutes for 5-fold cross-validation.

Ohh smart… so you precomputed activations for each of the folds in your 5-fold CV?

Good thing for dog breed we aren't fine-tuning… I bet it would take forever to run with learn.unfreeze()

And you got this result without data aug?

I precomputed them once: for train (train minus 1 image), validation (that 1 image) and test. Then I just changed indexes and never precomputed activations again.

In the Dogs vs Cats competition, Bojan (the winner) said his best model took ~1 week to train.

UPDATE: @lgvaz it is a weighted average of resnext101_64, inception_v4 and nasnet. With each model I predicted with 5-fold CV with 5 different seeds (75 models in total).

OK, I think I get it now. You were able to do this by following the steps you posted in the thread linked below, right?

So basically you precomputed activations on all of the data (except one image) and then just changed the indexes to split up the train/validation sets… right? Although it seems you had to write a custom function to be able to do this in fastai :slight_smile:

Almost exactly right, except for this:

I joined it back to train :yum:

data_x = np.vstack([val_act, act])
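The index-swapping trick might look roughly like this (a hedged sketch: data_y is an assumed label array matching data_x, and a scikit-learn classifier stands in for the fastai head, which is my substitution, not the poster's actual code):

import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

# Activations were computed once; each fold only re-slices the same arrays,
# so no forward passes through the convnet are ever repeated.
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for trn_idx, val_idx in kf.split(data_x):
    clf = LogisticRegression(solver='lbfgs', multi_class='multinomial', max_iter=1000)
    clf.fit(data_x[trn_idx], data_y[trn_idx])
    print(log_loss(data_y[val_idx], clf.predict_proba(data_x[val_idx])))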

Oh I missed that haha, amazing, good work!!

That’s really a lot of models… haha

I saw you talking in another topic about ensembling methods. I tried to use logistic regression on top of my classifiers, but it went really badly… Anyway, are you calculating the weights based on the CV loss?

Based on the loss on my train set.
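For anyone curious, one common way to get such weights is to minimize the blend's log loss directly with a generic optimizer. A hedged sketch (not the poster's actual code; preds is an assumed list of (n_samples, n_classes) probability arrays and y the true labels):

import numpy as np
from scipy.optimize import minimize
from sklearn.metrics import log_loss

def blend_loss(w, preds, y):
    w = np.abs(w) / np.abs(w).sum()   # force positive weights that sum to 1
    mix = sum(wi * p for wi, p in zip(w, preds))
    return log_loss(y, mix)

w0 = np.ones(len(preds)) / len(preds)  # start from a plain average
res = minimize(blend_loss, w0, args=(preds, y), method='Nelder-Mead')
weights = np.abs(res.x) / np.abs(res.x).sum()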

Super. I was planning to use this during the next batch of uninterrupted time I get. Were you able to use NASNet right off the example shared by Jeremy, or were there any gotchas?

From Jeremy's example. It should be as simple as with any other model, just a bit longer to train :yum:

Hi! So has anyone tried not removing the last fc layer?
I tried it with resnet34 and inception_4, but the results aren't so great and I have a couple of questions.

  1. resnet34 gives 1000 probabilities for each example, which is what I would expect; however, inception_4 gives 1536 of those. Why is that?
  2. resnet34 gave really horrible results; with inception_4 I got 0.93 accuracy and 0.23 log loss on the validation set, which is pretty good but doesn't really compare to the results people in the top 10 are getting. What can I do to improve?

    Some of my thoughts as well:
    I guess I can combine predictions from different models to get better results (could that help?)
    I trained logistic regression on top of the predictions (see the sketch below); would an MLP work better?
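Roughly what I mean by logistic regression on top of the predictions (a minimal sketch; feats_trn/feats_val and y_trn/y_val are illustrative names for the network outputs and labels, not real variables from my notebook):

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, log_loss

# features are the frozen pretrained net's outputs (1000-d for resnet34,
# 1536-d for inception_4), computed once for train and validation
clf = LogisticRegression(solver='lbfgs', multi_class='multinomial', max_iter=1000)
clf.fit(feats_trn, y_trn)
probs = clf.predict_proba(feats_val)
print('accuracy:', accuracy_score(y_val, clf.predict(feats_val)))
print('log loss:', log_loss(y_val, probs))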

I am trying to do a last fit on the entire training set before having the model predict the test set. I tried simply not setting val_idxs when calling from_csv, but when I call get_data I get the error below.

I see in the from_csv definition that the default value is None. I also tried passing an empty numpy array and got the same error.


IndexError                                Traceback (most recent call last)
<ipython-input-38-464872a660af> in <module>()
----> 1 data = get_data(sz, bs)

<ipython-input-37-7d8c25715bc7> in get_data(sz, bs)
      1 def get_data(sz, bs):
      2     tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
----> 3     data = ImageClassifierData.from_csv(PATH, 'train' ,f'{PATH}labels.csv', test_name='test', val_idxs=[], suffix='.jpg', tfms=tfms, bs=bs)
      4     return data if sz > 300 else data.resize(340, 'tmp')

~/fastai/courses/dl1/fastai/dataset.py in from_csv(cls, path, folder, csv_fname, bs, tfms, val_idxs, suffix, test_name, continuous, skip_header, num_workers)
    348         """
    349         fnames,y,classes = csv_source(folder, csv_fname, skip_header, suffix, continuous=continuous)
--> 350         ((val_fnames,trn_fnames),(val_y,trn_y)) = split_by_idx(val_idxs, np.array(fnames), y)
    351 
    352         test_fnames = read_dir(path, test_name) if test_name else None

~/fastai/courses/dl1/fastai/dataset.py in split_by_idx(idxs, *a)
    361 def split_by_idx(idxs, *a):
    362     mask = np.zeros(len(a[0]),dtype=bool)
--> 363     mask[np.array(idxs)] = True
    364     return [(o[mask],o[~mask]) for o in a]

IndexError: arrays used as indices must be of integer (or boolean) type

You can use val_idxs=[0] to get rid of that error and have only one image in your validation set.
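Applied to the get_data above, that would be:

def get_data(sz, bs):
    tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
    # val_idxs=[0] keeps a single image as "validation", so split_by_idx
    # receives a valid integer index array instead of an empty one
    data = ImageClassifierData.from_csv(PATH, 'train', f'{PATH}labels.csv',
                                        test_name='test', val_idxs=[0],
                                        suffix='.jpg', tfms=tfms, bs=bs)
    return data if sz > 300 else data.resize(340, 'tmp')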

Looks like inception_4 is doing something different…

def inception_4(pre):
    return children(load_pre(pre, InceptionV4, 'inceptionv4-97ef9c30'))[0]

instead of…

def resnext50(pre): return load_pre(pre, resnext_50_32x4d, 'resnext_50_32x4d')

I changed inception_4 to:
def inception_test(pre): return load_pre(pre, InceptionV4, 'inceptionv4-97ef9c30')
and got 1001 outputs.
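To make the difference concrete, a quick shape check (a sketch, assuming the two definitions above are importable and the pretrained weights are available; untested here):

import torch
from torch.autograd import Variable

m_feat = inception_4(True).eval()     # children(...)[0]: classifier stripped
m_full = inception_test(True).eval()  # full net, 1001-way ImageNet head kept
x = Variable(torch.randn(1, 3, 299, 299))
print(m_feat(x).size())  # 1536 feature channels, no class scores
print(m_full(x).size())  # torch.Size([1, 1001])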

I started with resnet34 and realized it gave me really bad results. I then moved to resnext101_64, which was better (around 94% accuracy and close to 0.25 log loss).

Later, I switched to inception_v4, and after training for some time and experimenting with different sizes, the accuracy jumped to ~95% and the log loss dropped to 0.18.

I took the average of the probabilities of these two models and submitted, for a final log loss of 0.159! So ensembling, or in this case simple averaging, helped me a lot. :slight_smile:
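In code, the blend is just the mean of the two probability arrays (resnext_probs and inception_probs are my names here for the two models' (n_test, n_classes) predictions):

avg_probs = (resnext_probs + inception_probs) / 2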

Would love to build on this further and try nasnet next.

I'm trying to get predictions from the pretrained resnext101_64 model, but I'm getting a weird error.

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-17-3bcdb70995d3> in <module>()
----> 1 resnext_preds = get_model_predictions(resnext101_64, 'resnet')

<ipython-input-14-2f9714fa7f7d> in get_model_predictions(f_model, name)
      6     bm = BasicModel(m.cuda(), name=name)
      7     learn = ConvLearner(data, bm)
----> 8     predictions, y = learn.TTA()
      9     save_array('predictions_' + name, predictions)
     10     return predictions, y

~/fastai/fastai/learner.py in TTA(self, n_aug, is_test)
    167         dl1 = self.data.test_dl     if is_test else self.data.val_dl
    168         dl2 = self.data.test_aug_dl if is_test else self.data.aug_dl
--> 169         preds1,targs = predict_with_targs(self.model, dl1)
    170         preds1 = [preds1]*math.ceil(n_aug/4)
    171         preds2 = [predict_with_targs(self.model, dl2)[0] for i in tqdm(range(n_aug), leave=False)]

~/fastai/fastai/model.py in predict_with_targs(m, dl)
    115     if hasattr(m, 'reset'): m.reset()
    116     res = []
--> 117     for *x,y in iter(dl): res.append([get_prediction(m(*VV(x))),y])
    118     preda,targa = zip(*res)
    119     return to_np(torch.cat(preda)), to_np(torch.cat(targa))

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    222         for hook in self._forward_pre_hooks.values():
    223             hook(self, input)
--> 224         result = self.forward(*input, **kwargs)
    225         for hook in self._forward_hooks.values():
    226             hook_result = hook(self, input, result)

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/container.py in forward(self, input)
     65     def forward(self, input):
     66         for module in self._modules.values():
---> 67             input = module(input)
     68         return input
     69 

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    222         for hook in self._forward_pre_hooks.values():
    223             hook(self, input)
--> 224         result = self.forward(*input, **kwargs)
    225         for hook in self._forward_hooks.values():
    226             hook_result = hook(self, input, result)

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/container.py in forward(self, input)
     65     def forward(self, input):
     66         for module in self._modules.values():
---> 67             input = module(input)
     68         return input
     69 

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    222         for hook in self._forward_pre_hooks.values():
    223             hook(self, input)
--> 224         result = self.forward(*input, **kwargs)
    225         for hook in self._forward_hooks.values():
    226             hook_result = hook(self, input, result)

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/linear.py in forward(self, input)
     51 
     52     def forward(self, input):
---> 53         return F.linear(input, self.weight, self.bias)
     54 
     55     def __repr__(self):

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/functional.py in linear(input, weight, bias)
    551     if input.dim() == 2 and bias is not None:
    552         # fused op is marginally faster
--> 553         return torch.addmm(bias, input, weight.t())
    554 
    555     output = input.matmul(weight.t())

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/autograd/variable.py in addmm(cls, *args)
    922         @classmethod
    923         def addmm(cls, *args):
--> 924             return cls._blas(Addmm, args, False)
    925 
    926         @classmethod

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/autograd/variable.py in _blas(cls, args, inplace)
    918             else:
    919                 tensors = args
--> 920             return cls.apply(*(tensors + (alpha, beta, inplace)))
    921 
    922         @classmethod

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/autograd/_functions/blas.py in forward(ctx, add_matrix, matrix1, matrix2, alpha, beta, inplace)
     24         output = _get_output(ctx, add_matrix, inplace=inplace)
     25         return torch.addmm(alpha, add_matrix, beta,
---> 26                            matrix1, matrix2, out=output)
     27 
     28     @staticmethod

RuntimeError: size mismatch at /opt/conda/conda-bld/pytorch_1503965122592/work/torch/lib/THC/generic/THCTensorMathBlas.cu:243

Here's the code I'm using:

def get_model_predictions(f_model, name):
    tfms = tfms_from_model(f_model, sz)
    data = ImageClassifierData.from_csv(PATH, 'train', f'{PATH}/labels.csv',bs=bs, tfms=tfms, val_idxs=idx, test_name='test', 
                                        suffix='.jpg')
    m = f_model(True)
    bm = BasicModel(m.cuda(), name=name)
    learn = ConvLearner(data, bm)
    predictions, y = learn.TTA()
    save_array('predictions_' + name, predictions)
    return predictions, y

This works for resnet34, inception_4 and resnext50, but not for resnext101_64 or resnext101.
I tried removing the tmp folder, but it doesn't help.

Has everyone with a loss of 0.1 or less just been using the whole Stanford Dogs Dataset for training?
Is this considered cheating?

Hello,

With data augmentation, I'm wondering what happens to images whose row and/or column size is smaller than 299 when we set sz=299.

Why? Because in the train set of the Dog Breed competition, 10% of the images have a row size < 299.

To view a SPECIFIC image AFTER data augmentation, I'm adapting the following code from the “Dogs vs Cats” Jupyter notebook:

sz = 299
bs = 64

tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)

def get_augs():
    data = ImageClassifierData.from_csv(PATH, 'train', label_csv, test_name='test',
                                    val_idxs=val_idxs, suffix='.jpg', tfms=tfms, bs=bs)
    x,_ = next(iter(data.aug_dl))
    return data.trn_ds.denorm(x)[1]

ims = np.stack([get_augs() for i in range(6)])

plots(ims, rows=2)

This code works well, but how can I use it on a specific image?
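One route I'm considering (a sketch, assuming trn_tfms can be applied to a single image array the same way val_tfms is used for single-image prediction in lesson 1; the file name is hypothetical):

trn_tfms, val_tfms = tfms              # tfms_from_model returns (train, val) transforms
fname = f'{PATH}train/some_dog.jpg'    # hypothetical file name
im = open_image(fname)                 # float RGB array in [0, 1]
aug = np.stack([trn_tfms(im) for i in range(6)])   # 6 random augmentations (CHW)
data = ImageClassifierData.from_csv(PATH, 'train', label_csv, test_name='test',
                                    val_idxs=val_idxs, suffix='.jpg', tfms=tfms, bs=bs)
plots(data.trn_ds.denorm(aug), rows=2) # denormalize for display, as above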