Part 2 Lesson 9 wiki

(Even Oldridge) #473

I’m trying to understand the implementation of focal loss, partly so I can port it to other areas. I’ve seen a few implementations online, but they all implement it from scratch; the lesson 9 notebook strikes me as particularly efficient given that it uses the weight parameter of the binary cross-entropy function. I’d like to do the same for CE.

I want to confirm my understanding, though, that for focal loss we’re multiplying the cross-entropy loss (-log(pt)) by a*(1-pt)^gamma, and that we can do this by setting weight in F.cross_entropy to that precalculated value.

def get_weight(self, x, t):
    alpha, gamma = 0.25, 1
    p = x.sigmoid()
    pt = p * t + (1 - p) * (1 - t)         # prob. of the true class: p if t == 1, else 1 - p
    w = alpha * t + (1 - alpha) * (1 - t)  # class balancing: alpha if t == 1, else 1 - alpha
    return w * (1 - pt).pow(gamma)         # combined weight: a * (1 - pt)^gamma

Similarly, I want to confirm that we’re using sigmoid here because we’re doing BCE? And if I wanted to apply this to CE, the appropriate function would be .softmax(), but the function would otherwise be identical?
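For the multi-class case, one possible approach (a sketch under my own assumptions, not code from the lesson notebook) is to recover pt from the per-example cross-entropy itself, since F.cross_entropy already computes -log(pt). Note that the single scalar alpha here is a simplification; the binary version above effectively uses a per-class alpha (alpha for positives, 1-alpha for negatives), and a faithful multi-class port would pass a per-class weight vector instead:

```python
import torch
import torch.nn.functional as F

def focal_loss_multiclass(logits, targets, alpha=0.25, gamma=1.0):
    # logits: (N, C), targets: (N,) class indices
    ce = F.cross_entropy(logits, targets, reduction='none')  # per-example -log(pt)
    pt = torch.exp(-ce)                                      # probability of the true class
    w = alpha * (1 - pt).pow(gamma)                          # focal modulating factor
    return (w * ce).mean()
```

With gamma=0 and alpha=1 this reduces to plain cross-entropy, which is a handy sanity check.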


(Behzad Mehmood) #474

I am using Kaggle to run this code, but I am unable to use ImageClassifierData.from_csv because I can’t create a new .csv file to save the dataframe df. Can anyone please help me in this scenario?

(Serge Gorshkov) #475

Hi. There is a topic explaining how to do single-image classification with the simple models from Part 1: How do we use our model against a specific image?

Could someone please explain how to do the same with the model created in this lesson?

(heisenburgzero) #476

Finally put in the effort to chew through papers/blogs regarding Yolo. I implemented the YoloV3 detector forward pass in a single notebook from scratch.

Tried to be more “OOP” after watching the Lesson 12 darknet building tips.

But I still find the code ugly when I need to implement route layers that grab feature maps from earlier layers, because I still need the config file from the original Yolo repo to know where to connect them. I created a dirty function that maps the layers to the entries inside the config file, but I can feel there’s a better way to do it. Any tips on improving it?

I’m still trying to figure out the loss function of YoloV3. There doesn’t seem to be a lot of chatter about it on the internet. I guess the only way to proceed would be prying through the C source code of the original repo.

(Jeremy Howard) #477

Check out the dynamic unet in fastai - that handles cross connections automatically.

(heisenburgzero) #478

Thanks. I haven’t gotten to the unet part of the course yet. I took a quick look; it seems the main idea is to record where these cross connections happen when initializing the architecture, so I can create a list/dict of these cross connections and loop over them in the forward call.

My initial idea was to leave open the possibility of connecting these layers anywhere I want at any time, but in this case these connections only exist at the end of each stride (right before downsampling). Gonna read the code again this weekend. Thanks again.
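The list/dict idea described above might look something like this minimal sketch (a hypothetical toy module, not the darknet or DynamicUnet code): record the route spec at construction time, cache intermediate feature maps during forward, and concatenate them where a route consumes them.

```python
import torch
import torch.nn as nn

class RouteDemo(nn.Module):
    """Toy sketch of route/cross connections declared up front."""
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Conv2d(3, 8, 3, padding=1),   # layer 0
            nn.Conv2d(8, 8, 3, padding=1),   # layer 1
            nn.Conv2d(16, 8, 3, padding=1),  # layer 2: takes layer 1 + routed layer 0
        ])
        # route spec: layer index -> earlier layer indices whose outputs it consumes
        self.routes = {2: [0]}

    def forward(self, x):
        cache = {}
        for i, layer in enumerate(self.layers):
            if i in self.routes:  # concatenate cached maps for route layers
                x = torch.cat([cache[j] for j in self.routes[i]] + [x], dim=1)
            x = layer(x)
            cache[i] = x          # keep every output so later routes can grab it
        return x
```

In a real detector you would only cache the outputs that appear in the route spec (to save memory), but the loop structure is the same.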

(Mykola) #479

Hi, I tried to implement the YOLOv3 paper in a clean, pythonic way in PyTorch. If someone is struggling to understand object detection or YOLOv3, have a look at it.

(Serge Gorshkov) #480

Let me answer my own question. There is a rough but working way to run inference on a single image:

  1. Comment out the 3rd line in the show_nmf_single function:

def show_nmf_single(idx=0):
    bbox,clas = get_y(y[0][idx], y[1][idx])

  2. Do the following:

trn_tfms, val_tfrms = tfms_from_model(f_model, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
im = val_tfrms(open_image(f'{PATH}/image.jpg'))
batch = md.val_dl.np_collate([im])
batch = T(batch, cuda=False).contiguous()
x = to_gpu(batch)
batch = learn.model(V(x))
b_clas,b_bb = batch
x = to_np(x)


I tried the following after training the bboxes and it didn’t work:

trn_tfms, val_tfms = tfms
ds = FilesIndexArrayDataset(["000046.jpg"], np.array([0]), val_tfms, PATH)
dl = DataLoader(ds)
preds = learn2.predict_dl(dl)

TypeError: object of type 'numpy.int64' has no len()

But the following works:

trn_tfms, val_tfms = tfms
ds = FilesIndexArrayDataset(["000046.jpg"], np.array([0]), val_tfms, PATH)
dl = DataLoader(ds)
preds = learn2.predict_dl(dl)


array([[ 35.4234 ,  97.52946,  64.6991 , 136.12167]], dtype=float32)

However, the rectangle it predicts is WAY off. I also tried training my model using tfms_from_model(arch, sz) instead of tfms = tfms_from_model(arch, sz, crop_type=CropType.NO, tfm_y=tfm_y, aug_tfms=augs), and the model doesn’t even fit the training set very well (the boxes are way off as well).

Any ideas?