Early draft notebooks available

ramesh · March 19, 2018, 4:21pm

@bdekoven - Did you run the script as sh pascal_download.sh after downloading it into the fastai/courses/dl2/data directory? Please note that you will need to create the data directory yourself; the script does not do that. It downloads the pascal dataset inside the data directory.

bdekoven · March 19, 2018, 6:04pm

@ramesh thank you for the feedback. So I started over and performed the following:

created directory fastai/courses/dl2/data
placed pascal_download.sh into this folder
then ran the command “sh pascal_download.sh”, and here is what the data folder contains:

Please note I am still getting the error:

Thank you,
Ben

ramesh · March 19, 2018, 6:08pm

I think your Path definition might be incorrect. Either provide the full path from ~/fastai/courses..../data or just provide the location relative to where the notebook is (the dl2 folder), as Path('data/pascal').
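A quick way to check where a relative Path actually points is to resolve it and test whether it exists. This is only a minimal sketch using the standard pathlib module; the helper name describe_path is made up for illustration and is not part of fastai.

```python
from pathlib import Path

def describe_path(p: Path) -> str:
    """Report where a possibly relative path points (hypothetical helper)."""
    # expanduser() handles a leading ~; resolve() makes the path absolute
    # relative to the current working directory.
    resolved = p.expanduser().resolve()
    status = "exists" if resolved.exists() else "MISSING"
    return f"{p} -> {resolved} [{status}]"

# A relative path is resolved against the current working directory, which
# for a notebook is normally the folder that contains the .ipynb file.
PATH = Path('data/pascal')
print(describe_path(PATH))
```

If the line printed ends in MISSING, the notebook is probably being run from a different folder than the one the relative path assumes.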

bdekoven · March 19, 2018, 6:13pm

You are correct. Thank you!

daveluo · March 19, 2018, 7:17pm

UPDATE 3/20: The below workaround should no longer be needed after Jeremy’s update on 3/19 (git pull for latest). Preserving contents for discussion history’s sake.

Hi all,

Thanks for the discussion and diagnosis so far on IndexError: index 383 is out of bounds for axis 0 with size 298.

Deleting tfm_y=TfmType.COORD (or setting it to TfmType.NO, which does the same thing) in tfms_from_model() does remove the error and allows me to continue with the notebook. But I agree with @belskikh that training/fitting performs worse, because the bbox coordinate labels are no longer correct for the train/test images (which have been transformed to 224x224px).

I haven’t had time to find the exact bug in the library but I made a small work-around which correctly scales the bbox coordinates (assuming the training images are transformed to 224x224).

Instead of running the 1st four cells under Bbox only in the original pascal.ipynb:

Run this combined cell instead in the same location:

BB_CSV = PATH/'tmp/bb_scaled.csv'

f_model=resnet34
sz=224
bs=64

def scale_bb(img_h, img_w, bb, targ):
    # bb is [y_upperleft, x_upperleft, y_lowerright, x_lowerright] in original-image pixels
    h_ratio = targ/img_h # height (y-axis) scale factor
    w_ratio = targ/img_w # width (x-axis) scale factor

    # scale each coordinate by the ratio for its axis; astype(int) truncates to whole pixels
    bb_scaled = np.asarray([bb[0]*h_ratio, bb[1]*w_ratio, bb[2]*h_ratio, bb[3]*w_ratio]).astype(int)

    return bb_scaled

bbs_raw = np.array([trn_lrg_anno[o][0] for o in trn_ids])
bb_scaled = [scale_bb(img['height'], img['width'], bb, sz) for img, bb in zip(trn_j['images'], bbs_raw)]
bbs = [' '.join(str(p) for p in o) for o in bb_scaled]

df = pd.DataFrame({'fn': [trn_fns[o] for o in trn_ids], 'bbox': bbs}, columns=['fn','bbox'])
df.to_csv(BB_CSV, index=False)

tfms = tfms_from_model(f_model, sz, crop_type=CropType.NO, tfm_y=TfmType.NO)
md = ImageClassifierData.from_csv(PATH, JPEGS, BB_CSV, tfms=tfms, continuous=True)

What this does is calculate the correctly scaled bounding box coordinates for the transformed training and test images (224x224) and store them in a new bb_scaled.csv file. Note that we keep tfm_y=TfmType.NO because the bb coordinate labels have already been scaled manually.
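To make the scaling arithmetic concrete, here is a small worked example. It is a dependency-free restatement of the same idea as scale_bb, not the notebook's code; the function name scale_bb_plain and the image size and box values are made up for illustration.

```python
def scale_bb_plain(img_h, img_w, bb, targ):
    """Scale a box [y_upperleft, x_upperleft, y_lowerright, x_lowerright]
    from an img_h x img_w image to a targ x targ image (illustrative only)."""
    h_ratio = targ / img_h  # height (y-axis) scale factor
    w_ratio = targ / img_w  # width (x-axis) scale factor
    # int() truncates toward zero, like astype(int) on non-negative values
    return [int(bb[0] * h_ratio), int(bb[1] * w_ratio),
            int(bb[2] * h_ratio), int(bb[3] * w_ratio)]

# Made-up example: an image 375 px high and 500 px wide, resized to 224x224.
scaled = scale_bb_plain(375, 500, [100, 50, 300, 400], 224)
print(scaled)  # [59, 22, 179, 179]
```

Because the height and width ratios differ (224/375 vs 224/500), a box that was wider than tall in the original can come out square after resizing, which is why the y and x coordinates have to be scaled separately.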

With this workaround, I’m able to predict the correct bounding boxes and replicate the loss scores in Jeremy’s notebook much more closely:

I assume/hope the library bug will be fixed soon anyway, so apologies if this distracts from that, but I thought it was worth sharing as real-time troubleshooting and exploration of how the library works.

gai · March 19, 2018, 7:27pm

Is the pascal notebook an implementation of YOLO/YOLOv2 or something entirely different?

pandeyanil · March 19, 2018, 8:03pm

notebook - pascal-multi
Error - RuntimeError: Expected object of type Variable[torch.cuda.FloatTensor] but found type Variable[torch.cuda.LongTensor] for argument #1 ‘target’

Multi Class
Step - lrf=learn.lr_find(1e-5,100)

RuntimeError Traceback (most recent call last)
in ()
----> 1 lrf=learn.lr_find(1e-5,100)

/mnt/data/ssd000/dsb2017/anil/fastai/courses/dl2/fastai/learner.py in lr_find(self, start_lr, end_lr, wds, linear)
256 layer_opt = self.get_layer_opt(start_lr, wds)
257 self.sched = LR_Finder(layer_opt, len(self.data.trn_dl), end_lr, linear=linear)
--> 258 self.fit_gen(self.model, self.data, layer_opt, 1)
259 self.load('tmp')
260

/mnt/data/ssd000/dsb2017/anil/fastai/courses/dl2/fastai/learner.py in fit_gen(self, model, data, layer_opt, n_cycle, cycle_len, cycle_mult, cycle_save_name, best_save_name, use_clr, metrics, callbacks, use_wd_sched, norm_wds, wds_sched_mult, **kwargs)
160 n_epoch = sum_geom(cycle_len if cycle_len else 1, cycle_mult, n_cycle)
161 return fit(model, data, n_epoch, layer_opt.opt, self.crit,
--> 162 metrics=metrics, callbacks=callbacks, reg_fn=self.reg_fn, clip=self.clip, **kwargs)
163
164 def get_layer_groups(self): return self.models.get_layer_groups()

/mnt/data/ssd000/dsb2017/anil/fastai/courses/dl2/fastai/model.py in fit(model, data, epochs, opt, crit, metrics, callbacks, stepper, **kwargs)
94 batch_num += 1
95 for cb in callbacks: cb.on_batch_begin()
---> 96 loss = stepper.step(V(x),V(y), epoch)
97 avg_loss = avg_loss * avg_mom + loss * (1-avg_mom)
98 debias_loss = avg_loss / (1 - avg_mom**batch_num)

/mnt/data/ssd000/dsb2017/anil/fastai/courses/dl2/fastai/model.py in step(self, xs, y, epoch)
41 if isinstance(output,tuple): output,*xtra = output
42 self.opt.zero_grad()
---> 43 loss = raw_loss = self.crit(output, y)
44 if self.reg_fn: loss = self.reg_fn(output, xtra, raw_loss)
45 loss.backward()

~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/functional.py in binary_cross_entropy(input, target, weight, size_average)
1177 weight = Variable(weight)
1178
-> 1179 return torch._C._nn.binary_cross_entropy(input, target, weight, size_average)
1180
1181

RuntimeError: Expected object of type Variable[torch.cuda.FloatTensor] but found type Variable[torch.cuda.LongTensor] for argument #1 ‘target’
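For readers hitting the same message: the error says that binary_cross_entropy received an integer (long) target where it expects a float one. The sketch below shows only the general idea of casting the target before computing the loss. It is written against a recent PyTorch rather than the Variable-based version in the traceback, the tensors are made-up stand-ins, and it is not necessarily how the library itself resolved the bug.

```python
import torch
import torch.nn.functional as F

# Hypothetical stand-ins for a batch of sigmoid outputs and multi-label targets.
pred = torch.tensor([[0.9, 0.1], [0.2, 0.8]])  # float probabilities in [0, 1]
target = torch.tensor([[1, 0], [0, 1]])        # integer (long) labels

# binary_cross_entropy expects a floating-point target of the same shape
# as the input, so cast the labels before calling it.
loss = F.binary_cross_entropy(pred, target.float())
print(loss.item())
```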

gai · March 19, 2018, 8:19pm

To answer my own question: It doesn’t seem like YOLO. The pictures are not segmented into 13x13 cells like proposed in the YOLO papers. Rather, the model seems to take the bounding boxes, images and labels from the pascal dataset and use those to predict what’s in the picture.

gai · March 19, 2018, 8:22pm

I also get a similar error with pascal-multi but pascal.ipynb works with the patches mentioned above.

derekja · March 19, 2018, 9:03pm

Gai and Pandeyanil, are you on Windows? See the thread if so: https://github.com/fastai/fastai/issues/71

(I am also seeing this on Win10)

pandeyanil · March 19, 2018, 9:06pm

i am on ubuntu 16.04

derekja · March 19, 2018, 9:13pm

rats, so much for that theory! Thanks. Hunting down another path now on the same error…

gai · March 19, 2018, 9:19pm

Ubuntu 16.04.3 LTS, and I get the same error as @pandeyanil. I am not really sure what the difference from the non -multi version is, though. Maybe that was an earlier draft and the one without -multi is the current one.

jeremy · March 19, 2018, 10:03pm

Both the index and long bugs should be fixed in git now.

derekja · March 19, 2018, 11:03pm

Thanks, Jeremy!

just a typo at the line “im0_a = im_a[0]; im0_a” (you had one of the variables as im_0a)

tyoc213 · March 20, 2018, 3:35pm

Hi there, I downloaded both commits before the class, pushed them to my repo, and hit “restart and run all”. I was getting the boxes/labels out of place; will try again in the course of the day.

dsantoshb · March 25, 2018, 2:33am

Hi @jsonm

How did you update fastai library?

jsonm · March 25, 2018, 2:50am

Depends how you set up your environment.

Make sure the repo is latest (git pull)

If you’re using conda, conda env update.

This is what I’d recommend.

jeremy · March 25, 2018, 3:15am

You should never use the notebooks this way.

tyoc213 · March 25, 2018, 6:10pm

I see, I just thought it could be the easy way to see if the whole file worked (printed the correct bboxes and so on)