A walk with fastai2 - Vision - Study Group and Online Lectures Megathread

Possibly. I’m unsure. Once we have a single example you’re welcome to try :slight_smile:

1 Like

@s.s.o one option may be a multi-headed model (this would actually be a really cool example to do; I may test it out).

Basically it would work as such:

We define a custom head that has two final linear layers. The first would have 1 output (our number) and the other would have 2 (binary), and we pass the layer before them into each and return both outputs (if that makes sense to you).
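Something like this rough sketch (the class name and layer sizes here are just placeholders, not actual notebook code):

import torch
from torch import nn

class TwoHeadedBlock(nn.Module):
    "Hypothetical final block: one regression head and one binary classification head"
    def __init__(self, n_in):
        super().__init__()
        self.reg_head  = nn.Linear(n_in, 1)  # our number (e.g. age)
        self.clas_head = nn.Linear(n_in, 2)  # binary target (e.g. gender)

    def forward(self, x):
        # the same final features go into each head, and we return both outputs
        return self.reg_head(x), self.clas_head(x)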

@muellerzr,

Yes, you may need a multi-head model and multiple loss functions, and maybe take a (weighted) average of them …

I’d think you can just sum them together at the end :slight_smile: (this is commonly done in models I’ve seen)
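As a rough sketch (assuming the two-headed block sketched above and the targets arriving as (number, class); this is just an illustration, not notebook code):

import torch.nn.functional as F
from torch import nn

class SummedLoss(nn.Module):
    "Hypothetical loss that sums (optionally weights) a regression and a classification loss"
    def __init__(self, reg_w=1., clas_w=1.):
        super().__init__()
        self.reg_w, self.clas_w = reg_w, clas_w

    def forward(self, preds, y_reg, y_clas):
        reg_out, clas_out = preds
        reg_loss  = F.mse_loss(reg_out.squeeze(-1), y_reg.float())
        clas_loss = F.cross_entropy(clas_out, y_clas)
        return self.reg_w * reg_loss + self.clas_w * clas_loss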

By the way, it’s a lot easier than you would think @s.s.o :wink: I’ll provide an example notebook instead of covering it in the class and post that shortly. wiki_crop (the wiki part of IMDB Wiki) splits the labels into folders. As such we can make a to_num transform that takes this label and turns it into a float (or int):

def to_num(x:str): return int(x)
block = DataBlock(blocks=(ImageBlock, RegressionBlock()),
                  get_items=get_image_files,
                  splitter=RandomSplitter(),
                  get_y=[parent_label, to_num],
                  item_tfms=RandomResizedCrop(460),
                  batch_tfms=[*aug_transforms(size=224, max_warp=0)])

The only bit is that RegressionBlock doesn’t have a show method, so we’ll rely on block.summary() to ensure everything is working how we want.

Then we can do: dls.c = 1 and pass this into cnn_learner with our loss function as MSELossFlat

learn = cnn_learner(dls, resnet34, loss_func=MSELossFlat()) :slight_smile:
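Putting it together, a rough usage sketch (path pointing at the wiki_crop folder and the batch size are assumptions here):

dls = block.dataloaders(path, bs=64)
dls.c = 1  # tell cnn_learner we want a single regression output
learn = cnn_learner(dls, resnet34, loss_func=MSELossFlat())
learn.fine_tune(5)  # or lr_find then fit_one_cycle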

@muellerzr, Great :grinning:

Yes, regression is not complete yet. You can use TitledFloat. For now I use my custom classes, which show…

from fastai2.vision.all import *  # assuming the usual star import, which brings in Float, ShowTitle, show_title, merge, Transform, TransformBlock
import numpy as np

class TitledFloatShort(Float, ShowTitle):
    _show_args = {'label': 'text'}
    def show(self, ctx=None, **kwargs):
        "Show self"
        return show_title(f'{self:.2f}', ctx=ctx, **merge(self._show_args, kwargs))

class ToFloatTensor(Transform):
    "Transform to float tensor"
    order = 10  # need to run after the PIL transforms on the GPU
    _show_args = {'label': 'text'}
    def __init__(self, split_idx=None, as_item=True):
        super().__init__(split_idx=split_idx, as_item=as_item)

    def encodes(self, o): return o.astype(np.float32)
    def decodes(self, o): return TitledFloatShort(o)

def FloatBlock(vocab=None, add_na=False):
    "`TransformBlock` for single-label float targets"
    return TransformBlock(type_tfms=ToFloatTensor())
1 Like

What/how are you doing model interpretation for tabular regression?
I changed the task to classification and used SHAP for interpretation of the tabular regression.
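If it helps anyone, here is roughly the kind of wrapper I mean for KernelExplainer (everything apart from SHAP’s own API — the DataFrame name X_sample and the learner variable — is just an assumption):

import shap
import pandas as pd

# X_sample: a small sample of the processed training DataFrame (placeholder name)
def predict_fn(data):
    "Hypothetical wrapper: score a numpy batch of rows with a fastai tabular learner"
    df = pd.DataFrame(data, columns=X_sample.columns)
    dl = learn.dls.test_dl(df)
    preds, _ = learn.get_preds(dl=dl)
    return preds.numpy()

explainer = shap.KernelExplainer(predict_fn, X_sample)
shap_values = explainer.shap_values(X_sample)
shap.summary_plot(shap_values, X_sample)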

@s.s.o I will add that this still doesn’t quite work, as parent_label will still return the label as a string. To fix this:

def encodes(self, o): return o.float()

Then we can do:

block = DataBlock(blocks=(ImageBlock, RegressionBlock()),
                  get_items=get_image_files,
                  splitter=RandomSplitter(),
                  get_y=[parent_label, to_num, ToFloatTensor],
                  item_tfms=RandomResizedCrop(460),
                  batch_tfms=[*aug_transforms(size=224, max_warp=0)])

And then we can declare our model as:

learn = cnn_learner(dls, resnet34, loss_func=MSELossFlat(), y_range=(0,100))
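For context, y_range here just puts a scaled sigmoid over the final activation so predictions land in (0, 100). A tiny sketch of the idea (not the library code itself):

import torch

def sigmoid_range(x, lo, hi):
    "Scaled sigmoid: maps any activation into the interval (lo, hi)"
    return torch.sigmoid(x) * (hi - lo) + lo

sigmoid_range(torch.tensor([-5., 0., 5.]), 0, 100)  # ~tensor([ 0.67, 50.00, 99.33])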

I’ve added a Scalar Regression notebook showing this end to end

1 Like

@muellerzr, I use pandas DataFrames and they are float64. You are right, it does not handle strings; we should type-check for different data types. You might also normalize the age?

I think normalizing the age is a bad choice here, as we normalize input variables, not output variables (look at Rossmann: this never happens; instead we log() the target because the values are high).
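To make that concrete, a small Rossmann-style sketch (the column name and the values are placeholders):

import numpy as np
import pandas as pd

df = pd.DataFrame({'sales': [120., 3400., 56000.]})  # placeholder high-valued target
df['log_sales'] = np.log(df['sales'])                # train against the log of the target
# ... train a model to predict 'log_sales' ...
preds_log = np.array([4.8, 8.1, 10.9])               # placeholder model outputs
preds = np.exp(preds_log)                            # invert back to the original scale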

Yep, that’s what I mean …

1 Like

Try it :slight_smile:

def decodes(self, o): return TitledFloatShort(o) should be corrected to TitledNumberShort :smile: I’ll test it :slight_smile:

@muellerzr and @barnacl - I finally got it! :partying_face: First of all, thanks a lot @barnacl for the good advice. I believe you raised very good points: to be honest, I could not find info about how vmin and vmax work, so I was borrowing @muellerzr’s defaults, and I am not very happy with them in terms of visualization (so feel free to jump in here if you have more info). I feel it could look prettier :see_no_evil:
The way of calculating IoU was indeed a conscious choice.

As for the error, it was indeed what @muellerzr hinted at. Thanks a lot! I was generating the masks with coco.annToMask(ann), and they must be saved as .png and not .jpg. This is so important!! To anyone reading: remember this and you will save yourself a lot of time in the future. The following was happening:


The background class is code ‘0’ and chocolate is ‘10’, so the JPEG compression was creating artifacts at the borders and generating many other values in the mask beyond the background and the corresponding food class. For this reason, the number of codes was not matching!!
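For reference, here is roughly how I build and save the masks now (variable names like anns, h, w and the filename are placeholders):

import numpy as np
from PIL import Image

# Combine an image's annotations into a single mask of class codes (0 = background)
mask = np.zeros((h, w), dtype=np.uint8)
for ann in anns:
    mask = np.maximum(mask, coco.annToMask(ann) * ann['category_id']).astype(np.uint8)

# PNG is lossless, so the integer codes survive intact; saving as .jpg smears the
# borders and introduces codes that were never in the vocab
Image.fromarray(mask).save('mask_001.png')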

4 Likes

https://matplotlib.org/api/_as_gen/matplotlib.pyplot.imshow.html

> When using scalar data and no explicit *norm*, *vmin* and *vmax* define the data range that the colormap covers. By default, the colormap covers the complete value range of the supplied data. *vmin*, *vmax* are ignored if the *norm* parameter is used.
Oh yes, I remember that PNG for the mask is very important, or else JPEG compression screws things up.
Something similar: when you resize, you should always use nearest-neighbour interpolation for the masks (currently that is handled with type dispatch).
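A quick sketch of the resize point (the mask path and target size are placeholders):

from PIL import Image

mask = Image.open('mask_001.png')
good = mask.resize((224, 224), resample=Image.NEAREST)   # keeps only the original class codes
bad  = mask.resize((224, 224), resample=Image.BILINEAR)  # interpolates, inventing in-between codes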

@harikrishnanrajeev we’ve actually already covered all the concepts needed for SuperRes. Look at the course NB and you’ll see it’s a feature loss like the one we used for style transfer (but with a unet):

https://github.com/fastai/fastai2/blob/master/nbs/course/lesson7-superres.ipynb

1 Like

@muellerzr just saw your interview with Sanyanam yesterday. I’d love to follow your course.

1 Like

We’re up to lesson 5 now; you can find all the relevant links to the courses and notebooks in the first post :slight_smile:

2 Likes

@muellerzr I’m looking at 05_Style_Transfer, and when I print out
vgg19(pretrained=True).features

there is a ReLU at layer 8, but it doesn’t look like it’s being used on this line:

layers = [feat_net[i] for i in [1, 6, 11, 20, 29, 22]]; layers

and I was wondering if you know why?

@foobar8675 I think there is more art than science to this (you should probably experiment and see how changing these layers affects the output; it’s on my todo list).
If you look at the lecture that has been linked, Jeremy suggests using the layer right before any pooling layer, as it contains the maximum information at that grid size. :slight_smile:
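As a quick sanity check, you can list which indices sit right before each MaxPool (just an inspection sketch, not notebook code):

from torch import nn
from torchvision.models import vgg19

feat_net = vgg19(pretrained=True).features
before_pool = [i - 1 for i, m in enumerate(feat_net) if isinstance(m, nn.MaxPool2d)]
print(before_pool)  # for the non-BN VGG19 this should be [3, 8, 17, 26, 35]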

2 Likes