Kaggle Comp: Plant Seedlings Classification

@ianianian,

I think you’re using the wrong syntax.
Try:
from glob import glob  # stdlib glob; recursive=True lets ** span subdirectories
for image in glob(f'{PATH}train/**/*.png', recursive=True):
You may want to read this post: https://cito.github.io/blog/f-strings/

@shubham24

Thank you for the fix and pointing me in the right direction to learn more.

Appreciate it!

Hi! I am getting an error whenever I run the learning rate finder.

learn = ConvLearner.pretrained(arch, data)
lrf=learn.lr_find()
learn.sched.plot()
Please help in resolving this error. Error log is below:


TypeError                                 Traceback (most recent call last)
in <module>()
----> 1 lrf=learn.lr_find()
      2 learn.sched.plot()

~/fastai/fastai/learner.py in lr_find(self, start_lr, end_lr, wds, linear)
    250         layer_opt = self.get_layer_opt(start_lr, wds)
    251         self.sched = LR_Finder(layer_opt, len(self.data.trn_dl), end_lr, linear=linear)
--> 252         self.fit_gen(self.model, self.data, layer_opt, 1)
    253         self.load('tmp')
    254

~/fastai/fastai/learner.py in fit_gen(self, model, data, layer_opt, n_cycle, cycle_len, cycle_mult, cycle_save_name, use_clr, metrics, callbacks, use_wd_sched, norm_wds, wds_sched_mult, **kwargs)
    154         n_epoch = sum_geom(cycle_len if cycle_len else 1, cycle_mult, n_cycle)
    155         return fit(model, data, n_epoch, layer_opt.opt, self.crit,
--> 156             metrics=metrics, callbacks=callbacks, reg_fn=self.reg_fn, clip=self.clip, **kwargs)
    157
    158     def get_layer_groups(self): return self.models.get_layer_groups()

~/fastai/fastai/model.py in fit(model, data, epochs, opt, crit, metrics, callbacks, **kwargs)
    104             i += 1
    105
--> 106         vals = validate(stepper, data.val_dl, metrics)
    107         if epoch == 0: print(layout.format(*names))
    108         print_stats(epoch, [debias_loss] + vals)

~/fastai/fastai/model.py in validate(stepper, dl, metrics)
    126         preds,l = stepper.evaluate(VV(x), VV(y))
    127         loss.append(to_np(l))
--> 128         res.append([f(preds.data,y) for f in metrics])
    129     return [np.mean(loss)] + list(np.mean(np.stack(res),0))
    130

~/fastai/fastai/model.py in <listcomp>(.0)
    126         preds,l = stepper.evaluate(VV(x), VV(y))
    127         loss.append(to_np(l))
--> 128         res.append([f(preds.data,y) for f in metrics])
    129     return [np.mean(loss)] + list(np.mean(np.stack(res),0))
    130

~/fastai/fastai/metrics.py in <lambda>(preds, targs)
     11
     12 def accuracy_thresh(thresh):
---> 13     return lambda preds,targs: accuracy_multi(preds, targs, thresh)
     14
     15 def accuracy_multi(preds, targs, thresh):

~/fastai/fastai/metrics.py in accuracy_multi(preds, targs, thresh)
     14
     15 def accuracy_multi(preds, targs, thresh):
---> 16     return ((preds>thresh)==targs).float().mean()
     17

~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/tensor.py in eq(self, other)
    346
    347     def eq(self, other):
--> 348         return self.eq(other)
    349
    350     def ne(self, other):

TypeError: eq received an invalid combination of arguments - got (torch.cuda.FloatTensor), but expected one of:

 * (int value)
      didn't match because some of the arguments have invalid types: (torch.cuda.FloatTensor)
 * (torch.cuda.ByteTensor other)
      didn't match because some of the arguments have invalid types: (torch.cuda.FloatTensor)

It seems (though I'm not sure at all) that the error is in the library, since this error arises when the variables we pass into the function aren't of the same type.
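For what it's worth, the traceback bottoms out in accuracy_multi, where (preds > thresh) produces a ByteTensor that 0.3-era PyTorch refused to compare against FloatTensor targets. A minimal sketch of that pattern and the usual cast (preds, targs and thresh here are made-up stand-ins, not values from the notebook):

import torch

preds = torch.rand(4, 12)          # stand-in prediction probabilities
targs = torch.rand(4, 12).round()  # stand-in float 0/1 targets
thresh = 0.5

# (preds > thresh) yields a byte/bool mask; old PyTorch raised the
# TypeError above when comparing it to a FloatTensor. Casting both
# sides to the same dtype avoids the mismatch:
acc = ((preds > thresh).float() == targs).float().mean()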

Also, which OS are you running?

I am using the fastai machine on paperspace.com, but I am running it from a Windows 7 PC in Cygwin. Will this make a difference?
EDIT: I have run the notebook on a MacBook too and the result is the same. Any thoughts?

I was having this problem as well.

Turned out the issue was that I wasn't replacing the ' ' (spaces) in the species names.

See here for the solution:

Edit: After removing the spaces from the species names it worked like a charm. Thanks!
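For anyone landing here later: in the fastai 0.7 from_csv pipeline, label values containing spaces get split into multiple labels, which is why the multi-label accuracy_thresh metric shows up in the traceback above. A minimal sketch of building a labels CSV with the spaces replaced (PATH and the column names are assumptions, not from the original post):

import os
import pandas as pd

PATH = 'data/plant-seedlings/'  # assumed data location

# Walk train/<species>/<image> and record one row per image, replacing
# spaces in species names (e.g. 'Common Chickweed') with underscores so
# each image gets exactly one label.
rows = [(f'{species}/{fname}', species.replace(' ', '_'))
        for species in os.listdir(f'{PATH}train')
        for fname in os.listdir(f'{PATH}train/{species}')]

pd.DataFrame(rows, columns=['file', 'species']).to_csv(f'{PATH}labels.csv', index=False)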

Same question here; I was wondering the same. :slight_smile:

Maybe it's easier to play with the validation dataset via from_csv: you only need to give the indices of the validation images, while with from_path you have to move files from the training folders into the validation folders. And if you want to do cross-validation, moving files between folders can be a tough job.
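For reference, a minimal sketch of the from_csv route with explicit validation indices (fastai 0.7 API; PATH, arch and sz are assumed to be defined as in the course notebooks):

from fastai.conv_learner import *

label_csv = f'{PATH}labels.csv'
n = len(list(open(label_csv))) - 1  # number of labelled images (minus header)
val_idxs = get_cv_idxs(n)           # random 20% of the indices for validation

data = ImageClassifierData.from_csv(
    PATH, 'train', label_csv,
    tfms=tfms_from_model(arch, sz),
    val_idxs=val_idxs, test_name='test')

Swapping val_idxs between runs gives you a different fold without moving a single file.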


@sermakarevich Now that the competition has finished, can you share some highlights of your solution, or maybe the notebook?


I really don't have much to share, as I used only one notebook and it looks like a piece of … cake. All I changed were the image size and the model. If you would like to look into my messy notes, please send me your email.


Could you explain what's in the first three graphs? Is it the loss with respect to width and height?
How did you gather the information to plot the graphs? Did you train the model with different sizes, save the results, and plot them at a later step?
For the graph in the second row, what exactly do "less than" and "more than" mean?
Thanks! :slight_smile:

I am sorry @alessa, what graphs are you referring to? Update: found them.

The first chart is just a plt.hist of image widths, the second one of image heights, and the third a plt.scatter of width against height. To build the fourth, I use out-of-fold (OOF) predictions on the train set and calculate the score after selecting only the images that satisfy a criterion: smaller than some size, or larger than some size.
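A rough sketch of how the first three charts can be produced (PIL and matplotlib; PATH is an assumption):

from glob import glob
from PIL import Image
import matplotlib.pyplot as plt

# Collect (width, height) for every training image.
sizes = [Image.open(f).size
         for f in glob(f'{PATH}train/**/*.png', recursive=True)]
widths, heights = zip(*sizes)

fig, axes = plt.subplots(1, 3, figsize=(15, 4))
axes[0].hist(widths);  axes[0].set_title('image width')
axes[1].hist(heights); axes[1].set_title('image height')
axes[2].scatter(widths, heights); axes[2].set_title('width vs height')
plt.show()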


these ones


I was able to do well, with success coming down to: 1. ensembling what I found to be the strongest-performing architectures (resnet50 and nasnet); 2. spending time fine-tuning hyperparameters and image sizes; and 3. running k-fold cross-validations, more than once. I think these are good steps for any serious attempt at leaderboard climbing in any similar competition, and it was a good starter learning experience. The competition has closed, but it remains a good one for practising these skills.
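Not the poster's actual code, but a minimal sketch of the k-fold idea with the fastai 0.7 API (label_df, PATH, arch and sz are assumptions):

import numpy as np
from sklearn.model_selection import StratifiedKFold
from fastai.conv_learner import *

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
fold_preds = []
for _, val_idxs in skf.split(label_df['file'], label_df['species']):
    data = ImageClassifierData.from_csv(
        PATH, 'train', f'{PATH}labels.csv',
        tfms=tfms_from_model(arch, sz),
        val_idxs=val_idxs, test_name='test')
    learn = ConvLearner.pretrained(arch, data)
    learn.fit(1e-2, 3)
    log_preds, _ = learn.TTA(is_test=True)       # test-time augmentation
    fold_preds.append(np.mean(np.exp(log_preds), 0))

# Averaging the per-fold probabilities gives an ensemble-style submission.
avg_preds = np.mean(fold_preds, 0)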


Did you come across an error while running nasnet? I faced a size error.

@digitalspecialists Could you share the code for how you performed the cross-validations? Thank you!


This is brilliant!


@SHAR1 thank you so much for this information!

Please be careful:

Here is the notebook snippet from my first attempt at this competition. I think it's a good place to start.
Just vanilla fastai tips: no cross-validation, ensembling, or segmentation of any sort. I just kept an eye on the losses, nothing more. I haven't added any documentation because I simply followed Jeremy's tips. If you need an explanation, just ping me and I'll add it.

0.988 accuracy. Around 0.97 on the public leaderboard.
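As a rough illustration only (not the author's notebook), the "vanilla" recipe described above reads like the standard lesson-1 loop (fastai 0.7 API; PATH, sz and val_idxs are assumptions):

from fastai.conv_learner import *

arch = resnet50
data = ImageClassifierData.from_csv(
    PATH, 'train', f'{PATH}labels.csv',
    tfms=tfms_from_model(arch, sz, aug_tfms=transforms_side_on),
    val_idxs=val_idxs, test_name='test')

learn = ConvLearner.pretrained(arch, data, precompute=True)
learn.lr_find()                 # pick a learning rate from the plot
learn.fit(1e-2, 3)              # train the new head
learn.precompute = False
learn.unfreeze()
learn.fit([1e-4, 1e-3, 1e-2], 3, cycle_len=1, cycle_mult=2)  # differential LRs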