A walk with fastai2 - Vision - Study Group and Online Lectures Megathread

mgloria · February 17, 2020, 12:51pm

I just watched last week’s video! Very interesting content @muellerzr, I loved the EfficientNet explanation. For style transfer, I totally recommend this series as a brilliant, complementary explanation of the theory.

I tried to segment a food dataset (pretty cool!) following the tutorial, but I am getting a weird error. Does anybody know what the issue is? (Also, I am looking for people to join my team in this competition.)

RuntimeError: CUDA error: device-side assert triggered

github.com

gloriamacia/food-classifier/blob/master/fastai2-segmentation.ipynb

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from fastai2.basics import *\n",
    "from fastai2.vision.all import *\n",
    "from fastai2.callback.all import *"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "path = Path('data/')"

This file has been truncated.

muellerzr · February 17, 2020, 1:11pm

@mgloria you found one of the most common segmentation issues, and I’m going to talk about it next lecture (I meant to put that in!). It means your codes don’t align with the number of classes present in your dataset.

E.g. I made 4 labels but in reality there are five. If an unexpected one shows up even though they all “should” be there, add one more blank or “other” category. That should fix it.
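To see why a label mismatch surfaces as a cryptic CUDA assert, it can help to reproduce it on the CPU, where PyTorch reports the out-of-range target directly. The snippet below is only a minimal sketch with made-up tensors, not the notebook from the post; the function name `loss_fails_on_cpu` is hypothetical.

```python
import torch
import torch.nn.functional as F

def loss_fails_on_cpu(n_classes, mask_values):
    """Return True if cross-entropy rejects a 2x2 mask holding `mask_values`."""
    # Fake segmentation output: (batch, classes, height, width)
    logits = torch.randn(1, n_classes, 2, 2)
    # Fake mask: (batch, height, width) of integer class ids
    target = torch.tensor(mask_values, dtype=torch.long).view(1, 2, 2)
    try:
        F.cross_entropy(logits, target)
    except (IndexError, RuntimeError) as e:
        # The exception type and message depend on the PyTorch version
        print(f"{type(e).__name__}: {e}")
        return True
    return False

# 4 codes, but the mask contains class id 4 (a fifth class): the loss fails
print(loss_fails_on_cpu(4, [0, 1, 2, 4]))
# 5 codes cover ids 0..4: the loss runs
print(loss_fails_on_cpu(5, [0, 1, 2, 4]))
```

On the GPU the same mismatch is what typically shows up as `device-side assert triggered`, so running one batch on the CPU is a cheap way to get a readable message.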

muellerzr · February 17, 2020, 1:20pm

@vijayabhaskar this comes from a flattening layer. Here is the source:

github.com

fastai/fastai2/blob/master/fastai2/layers.py#L67


class PartialLambda(Lambda):
    "Layer that applies `partial(func, **kwargs)`"
    def __init__(self, func, **kwargs):
        super().__init__(partial(func, **kwargs))
        self.repr = f'{func.__name__}, {kwargs}'


    def forward(self, x): return self.func(x)
    def __repr__(self): return f'{self.__class__.__name__}({self.repr})'


# Cell
@module(full=False)
def Flatten(self, x):
    "Flatten `x` to a single dimension, e.g. at end of a model. `full` for rank-1 tensor"
    return x.view(-1) if self.full else x.view(x.size(0), -1)


# Cell
class View(Module):
    "Reshape `x` to `size`"
    def __init__(self, *size): self.size = size
    def forward(self, x): return x.view(self.size)

And in create_head:

github.com

fastai/fastai2/blob/master/fastai2/vision/learner.py#L83


    else:                           raise NamedError("cut must be either integer or a function")


# Cell
def create_head(nf, n_out, lin_ftrs=None, ps=0.5, concat_pool=True, bn_final=False, lin_first=False, y_range=None):
    "Model head that takes `nf` features, runs through `lin_ftrs`, and out `n_out` classes."
    lin_ftrs = [nf, 512, n_out] if lin_ftrs is None else [nf] + lin_ftrs + [n_out]
    ps = L(ps)
    if len(ps) == 1: ps = [ps[0]/2] * (len(lin_ftrs)-2) + ps
    actns = [nn.ReLU(inplace=True)] * (len(lin_ftrs)-2) + [None]
    pool = AdaptiveConcatPool2d() if concat_pool else nn.AdaptiveAvgPool2d(1)
    layers = [pool, Flatten()]
    if lin_first: layers.append(nn.Dropout(ps.pop(0)))
    for ni,no,p,actn in zip(lin_ftrs[:-1], lin_ftrs[1:], ps, actns):
        layers += LinBnDrop(ni, no, bn=True, p=p, act=actn, lin_first=lin_first)
    if lin_first: layers.append(nn.Linear(lin_ftrs[-2], n_out))
    if bn_final: layers.append(nn.BatchNorm1d(lin_ftrs[-1], momentum=0.01))
    if y_range is not None: layers.append(SigmoidRange(*y_range))
    return nn.Sequential(*layers)


# Cell
from ..callback.hook import num_features_model

muellerzr · February 17, 2020, 1:22pm

Lastly,

For the last vision lecture, which will cover some more varied use cases: what are some vision databunches that people would like examples of?

The most common question I’ve seen is Siamese, what others?

mgloria · February 17, 2020, 1:29pm

I see! But I manually added the “void” class at the beginning of the codes to account for the background… shouldn’t this be enough? It is 0 in the mask.

muellerzr · February 17, 2020, 1:31pm

Did you also try not including it? (I assume so.) It’s a hard bug to figure out. There are a few discussions on debugging it on the forum, and I’ll see what I can dig up. Basically, another way of doing it is to find the maximum pixel value across your masks and add 1 to it; that also tells you the number of classes (assuming the labels start at 0).

This is common when we don’t know how the data was prepared beforehand, and it’s an issue you don’t notice until you’ve started working with the data.

@mgloria try incorporating some of the advice here: CUDA error: device-side assert triggered
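The “maximum pixel value plus 1” check can be sketched roughly as follows. This is an illustration with tiny in-memory arrays and made-up food codes; in practice the masks would be loaded from the mask files first (for example with PIL, which is an assumption about the setup). The helper name `check_codes_cover_masks` is hypothetical.

```python
import numpy as np

def check_codes_cover_masks(masks, codes):
    """Return (classes needed, mask values that have no matching code)."""
    # Collect every distinct pixel value that occurs in any mask
    values = np.unique(np.concatenate([np.asarray(m).ravel() for m in masks]))
    # With labels starting at 0, the largest value + 1 is the class count
    n_needed = int(values.max()) + 1
    # Any value >= len(codes) would index past the end of the code list
    unexpected = [int(v) for v in values if v >= len(codes)]
    return n_needed, unexpected

# Made-up example: two 2x2 masks and four codes
masks = [np.array([[0, 1], [2, 2]]), np.array([[0, 4], [1, 1]])]
codes = ["void", "rice", "soup", "bread"]

n_needed, unexpected = check_codes_cover_masks(masks, codes)
print(n_needed, unexpected)  # 5 [4]
```

Here the masks need 5 classes while only 4 codes exist, which is exactly the mismatch described above.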

vijayabhaskar · February 17, 2020, 4:46pm

Siamese would be great! I don’t think there is much left in vision in terms of DataBlocks; you covered it all. Only GANs are left. If I’m not asking too much, maybe image captioning?

harikrishnanrajeev · February 17, 2020, 4:58pm

If you haven’t covered it yet, maybe super resolution. Thanks.

muellerzr · February 17, 2020, 5:10pm

Tail end of the series after NLP

barnacl · February 17, 2020, 5:16pm

@mgloria just to point out that you need to update your vmin and vmax (this won’t solve the CUDA error though).
Also, I think you don’t need to pass vmin and vmax at all; matplotlib will handle it.
You could find the unique values in your masks and count them to double-check whether you are missing a class.
The error I was making was that my mask was 3D instead of 2D.
Any reason why you are not including the background in the accuracy calculation?
mask = targ != void_code means that, when computing accuracy, you ignore every pixel that is background (I’m not sure if you want that?).
In CamVid, void was used because some pixels didn’t fall into any of the mask categories.
Could you point me to the dataset or a small subset (maybe upload a few images to your GitHub)? I’d like to try it out and see if I can help.
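To make the effect of `mask = targ != void_code` concrete, here is a NumPy analogue of that accuracy pattern for a single image. It is a sketch, not the metric from the notebook: the function name `acc_ignoring_void`, the `(classes, height, width)` layout and the toy values are assumptions for illustration. It also rejects masks that are not 2D, the other pitfall mentioned above.

```python
import numpy as np

def acc_ignoring_void(logits, targ, void_code=0):
    """Pixel accuracy over non-void pixels. logits: (C, H, W), targ: (H, W)."""
    if targ.ndim != 2:
        raise ValueError(f"expected a 2D mask, got shape {targ.shape}")
    keep = targ != void_code          # True where the pixel is not void
    if not keep.any():
        return float("nan")           # nothing left to score
    pred = logits.argmax(axis=0)      # predicted class id per pixel
    return float((pred[keep] == targ[keep]).mean())

# Toy 2x2 example with 3 classes, class 0 being void/background
targ = np.array([[0, 1], [2, 2]])
pred = np.array([[1, 1], [2, 0]])
# Build one-hot "logits" whose argmax reproduces `pred`
logits = np.eye(3)[pred].transpose(2, 0, 1)

acc = acc_ignoring_void(logits, targ, void_code=0)
print(acc)
```

The top-left pixel is void in the target, so the wrong prediction there is not counted: 2 of the 3 remaining pixels are correct. If the background should count towards accuracy, the `keep` filter would simply be dropped.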

barnacl · February 17, 2020, 5:18pm

@muellerzr could we cover image regression, and image with multiple regression columns? (Not sure if I missed that.)

muellerzr · February 17, 2020, 5:27pm

Keypoints will cover this, but if you can provide a dataset where the target is a single number (for true regression), I can include that in that lesson too and we’ll do single regression. (I tried looking for one but couldn’t find much.)

barnacl · February 17, 2020, 5:34pm

Let me look. I had one for work that predicted the brightness of images.

harikrishnanrajeev · February 17, 2020, 5:37pm

model explainability using LIME, saliency maps

muellerzr · February 17, 2020, 5:45pm

@vijayabhaskar this was fixed in the latest fastcore (just got informed)

s.s.o · February 17, 2020, 6:16pm

@muellerzr, sorry that I can’t share the data I have yet. But I get pretty good results on my problem compared to classical regression (with hand-made feature engineering). Busy writing a paper about it too.

These are from a blog post where fastai v1 was used to estimate age:

IMDB-Wiki Face Dataset (https://data.vision.ee.ethz.ch/cvl/rrothe/imdb-wiki/)
UTK Face Dataset (http://aicip.eecs.utk.edu/wiki/UTKFace)
Appa Real Face Dataset (http://chalearnlap.cvc.uab.es/dataset/26/description/)

Also, a leaf count example has been done, but I don’t know if the data is available…

muellerzr · February 17, 2020, 6:18pm

This is one I’ve definitely heard of

What I’ll likely do is just provide a detailed notebook for those few other examples, since I’m not quite sure how they’d fit in time-wise. But they’ll be there (and if the DataBlock/bunch is complex, like Siamese is, then I’ll talk about it in class).

s.s.o · February 17, 2020, 6:27pm

It might be useful for future users. I just wonder whether it’s possible to get both gender and age as output with the current v2 and minor modifications. Do you have any idea?

muellerzr · February 17, 2020, 6:29pm

Possibly. I’m unsure. Once we have a single example you’re welcome to try

muellerzr · February 17, 2020, 6:54pm

@s.s.o one option may be a multi-headed model (this would actually be a really cool example for it; I may test it out).

Basically it would work like this:

We define a custom head that has two final linear layers. The first would have 1 output (our number) and the other would have 2 (binary). We pass the output of the layer before them to each one and return both (if that makes sense to you).
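In plain PyTorch, the two-headed idea described above might look roughly like this. It is only a sketch under assumptions: the class name `TwoHeadModel` is hypothetical, the tiny convolutional body stands in for a real pretrained backbone, and wiring it into a fastai2 Learner (including a loss that combines a regression term and a classification term) is not shown.

```python
import torch
import torch.nn as nn

class TwoHeadModel(nn.Module):
    "Shared body followed by two final linear layers (hypothetical example)."
    def __init__(self, body, n_features, n_classes=2):
        super().__init__()
        self.body = body                                     # shared feature extractor
        self.regressor = nn.Linear(n_features, 1)            # 1 output: the number (age)
        self.classifier = nn.Linear(n_features, n_classes)   # 2 outputs: binary class

    def forward(self, x):
        feats = self.body(x)  # same features are passed to both heads
        return self.regressor(feats), self.classifier(feats)

# Stand-in body: conv -> ReLU -> global average pool -> flatten to (batch, 8)
body = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
model = TwoHeadModel(body, n_features=8)

# Batch of 4 random RGB images, 32x32
age, gender = model(torch.randn(4, 3, 32, 32))
print(age.shape, gender.shape)
```

The model returns a tuple, so whatever loss is used has to unpack both outputs and weigh the two terms; how best to do that is a design choice left open here.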