Fastai v2 code walk-thru 1

radek · September 12, 2019, 1:03pm

This seems fairly important: why is RandomSplitter camel cased despite being just a function?

The naming convention that was adopted is that if Something when called will return a callable (anything that can be called), Something will be camel cased.

Link to relevant part of the walkthrough

A trivial example:
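
A minimal sketch of the convention, using a simplified stand-in written for illustration (it is not fastai's actual implementation, and the body of `_inner` is an assumption): `RandomSplitter` is a plain function, but calling it returns another callable, so it gets a camel-cased name.

```python
import random

def RandomSplitter(valid_pct=0.2, seed=None):
    "Return a function that splits `items` into (train, valid) index lists."
    def _inner(items):
        # Shuffle the indices reproducibly, then cut off the validation share.
        rng = random.Random(seed)
        idxs = list(range(len(items)))
        rng.shuffle(idxs)
        cut = int(valid_pct * len(items))
        return idxs[cut:], idxs[:cut]
    return _inner

# Calling the camel-cased name gives back a callable, not the split itself.
splitter = RandomSplitter(valid_pct=0.25, seed=42)
train_idx, valid_idx = splitter(list("abcdefgh"))
```

Here `splitter` is what actually does the work later on; `RandomSplitter` only configures it.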

radek · September 12, 2019, 1:32pm

Check out the amazing functionality of TypeDispatch!

sgugger · September 12, 2019, 1:50pm

It’s magic

radek · September 12, 2019, 2:29pm

This magic was not even that arcane, since I figured out how it works from reading the class definition:

class TypeDispatch:
    "Dictionary-like object; `__getitem__` matches keys of types using `issubclass`"
    def __init__(self, *funcs):
        self.funcs,self.cache = {},{}
        for f in funcs: self.add(f)
        self.inst = None

    def _reset(self):
        self.funcs = {k:self.funcs[k] for k in sorted(self.funcs, key=cmp_instance, reverse=True)}
        self.cache = {**self.funcs}

    def add(self, f):
        "Add type `t` and function `f`"
        self.funcs[_p1_anno(f) or object] = f
        self._reset()

    def returns(self, x): return anno_ret(self[type(x)])
    def returns_none(self, x):
        r = anno_ret(self[type(x)])
        return r if r == NoneType else None

    def __repr__(self): return str({getattr(k,'__name__',str(k)):v.__name__ for k,v in self.funcs.items()})

    def __call__(self, x, *args, **kwargs):
        f = self[type(x)]
        if not f: return x
        if self.inst: f = types.MethodType(f, self.inst)
        return f(x, *args, **kwargs)

    def __get__(self, inst, owner):
        self.inst = inst
        return self

    def __getitem__(self, k):
        "Find first matching type that is a super-class of `k`"
        if k in self.cache: return self.cache[k]
        types = [f for f in self.funcs if issubclass(k,f)]
        res = self.funcs[types[0]] if types else None
        self.cache[k] = res
        return res
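
To make the idea concrete, here is a stripped-down sketch of the same dispatch-on-type trick. `MiniDispatch` and the `describe_*` functions are hypothetical names made up for this post, and instead of sorting keys and testing `issubclass` it simply walks the MRO of the argument's type, so treat it as an illustration rather than what fastai does.

```python
import inspect

class MiniDispatch:
    "Pick a function based on the type annotation of its first parameter."
    def __init__(self, *funcs):
        self.funcs = {}
        for f in funcs: self.add(f)

    def add(self, f):
        # Key each function by its first parameter's annotation (or `object`).
        first = next(iter(inspect.signature(f).parameters.values()))
        ann = first.annotation
        self.funcs[object if ann is inspect.Parameter.empty else ann] = f

    def __getitem__(self, k):
        # Walk the MRO so the most specific registered type wins.
        for t in k.__mro__:
            if t in self.funcs: return self.funcs[t]
        return None

    def __call__(self, x, *args, **kwargs):
        f = self[type(x)]
        # No match: return the input unchanged, like the original `__call__`.
        return x if f is None else f(x, *args, **kwargs)

def describe_int(x: int): return f"int {x}"
def describe_bool(x: bool): return f"bool {x}"
def describe_str(x: str): return f"str {x!r}"

d = MiniDispatch(describe_int, describe_bool, describe_str)
results = [d(3), d(True), d("a"), d(2.5)]
```

Note that `True` goes to `describe_bool` even though `bool` is a subclass of `int`, and `2.5` passes through untouched because no function is registered for `float`.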

On the other hand, how is the returned object cast based on type annotation in decodes?!

Now that is some powerful wizardry. I tried looking in Transform, retain_type, TypeDispatch and _TfmDict but found nothing. I am quite sure it will be explained later down the road, though.

As they say, curiosity killed the cat, or at least the time the cat was supposed to put into following the walk-thru before it got attracted to the shiny low-level functionality.

sgugger · September 12, 2019, 2:38pm

We removed that part, you are supposed to cast the object yourself now. If you look at the latest version of PetTfm, it reads:

class PetTfm(Transform):
    def __init__(self, vocab, o2i, lblr): self.vocab,self.o2i,self.lblr = vocab,o2i,lblr
    def encodes(self, o): return resized_image(o), self.o2i[self.lblr(o)]
    def decodes(self, x): return TitledImage(x[0], self.vocab[x[1]])

radek · September 12, 2019, 2:42pm

Oh no! So type annotations no longer work like that.

Thank you so much for the heads up on that, makes a lot of sense now!

niazangels · September 12, 2019, 8:21pm

This bit got me! I was revising notebook 08 and did a git pull and noticed that all the annotations have gone away.

I think this means that things really are changing very fast and we had better keep up. There sure is a ton to learn from these lessons.

Was there a specific reason implicit type casting was discarded?

sgugger · September 12, 2019, 8:44pm

We used to use the type annotations before we introduced the subclasses of Tensors/PIL.Images etc. Now that the objects have the proper types, there is no need for them anymore, and it can be confusing that a function with a type annotation Foo doesn’t return an object of type Foo.

radek · September 13, 2019, 7:18am

This is Python on a completely new level, with a meaning density of code beyond anything I had imagined possible.

This explains why ToTensor, which is only this

class ToTensor(Transform):
    "Convert item to appropriate tensor class"
    order = 15

can suddenly do so many things.

And here is the original code in local/vision/core.py giving ToTensor its new powers, thanks to _TfmMeta of Transform:

@ToTensor
def encodes(self, o:PILImage): return TensorImage(image2byte(o))
@ToTensor
def encodes(self, o:PILImageBW): return TensorImageBW(image2byte(o))
@ToTensor
def encodes(self, o:PILMask):  return TensorMask(image2byte(o)[0])
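
For anyone wondering how a class can be used as a decorator like that, here is a rough sketch of the general mechanism. It is not fastai's `_TfmMeta`; `_RegisterMeta`, `MiniTransform` and `Describe` are hypothetical names, and the registry is a plain dict rather than a TypeDispatch. The metaclass intercepts the call to the class: if it receives a single function named `encodes`, it registers it instead of creating an instance.

```python
import inspect

class _RegisterMeta(type):
    "Hypothetical metaclass: lets a class double as a decorator for `encodes`."
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls._impls = {}  # one registry per class: {annotated type: function}

    def __call__(cls, *args, **kwargs):
        f = args[0] if args else None
        if len(args) == 1 and not kwargs and inspect.isfunction(f) and f.__name__ == "encodes":
            # Used as `@SomeClass`: register `f` under the annotation of its
            # second parameter (the first one is `self`), then hand `f` back.
            params = list(inspect.signature(f).parameters.values())
            ann = params[1].annotation
            cls._impls[object if ann is inspect.Parameter.empty else ann] = f
            return f
        # Otherwise behave like a normal class and build an instance.
        return super().__call__(*args, **kwargs)

class MiniTransform(metaclass=_RegisterMeta):
    def __call__(self, x):
        # Use the most specific registered implementation, else pass through.
        for t in type(x).__mro__:
            if t in self._impls: return self._impls[t](self, x)
        return x

class Describe(MiniTransform):
    "Describe an item based on its type"

@Describe
def encodes(self, o: int): return f"integer {o}"

@Describe
def encodes(self, o: str): return f"string {o!r}"

tfm = Describe()
```

With this in place `tfm(3)` and `tfm("cat")` hit different functions, while a value of an unregistered type such as `2.5` is returned unchanged.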

dhoa · September 25, 2019, 2:43pm

I just love this kind of literate programming. There are many times I need to go back and forth from the notebook to experiment with things. Now I can stay safely in the notebook and everything is created automatically. I am trying to integrate it into other projects, and I wrote up how it works here (using just notebook_core and notebook_export): https://medium.com/@dienhoa.t/fast-ai-literature-programming-2d0d4230dd81 . This is very shallow, but I hope someone finds it useful.

jeremy · September 25, 2019, 4:38pm

Thanks for the article @dhoa! BTW the title has a typo.

dhoa · September 25, 2019, 4:57pm

Thanks @jeremy

MohammedHasan · January 3, 2020, 5:52am

class RegexLabeller():
    "Label `item` with regex `pat`."
    def __init__(self, pat, match=False):
        self.pat = re.compile(pat)
        self.matcher = self.pat.match if match else self.pat.search

    def __call__(self, o, **kwargs):
        res = self.matcher(str(o.as_posix()))
        assert res,f'Failed to find "{self.pat}" in "{o}"'
        return res.group(1)

If RegexLabeller is changed like that, it will work on both Windows and Linux. It is only a small change to how the pathlib path is converted to a string: the conversion needs to change from str(o) to str(o.as_posix()).

MohammedHasan · January 3, 2020, 5:54am

class RegexLabeller():
    "Label `item` with regex `pat`."
    def __init__(self, pat, match=False):
        self.pat = re.compile(pat)
        self.matcher = self.pat.match if match else self.pat.search

    def __call__(self, o, **kwargs):
        res = self.matcher(str(o.as_posix()))
        assert res,f'Failed to find "{self.pat}" in "{o}"'
        return res.group(1)

Change RegexLabeller to this and it will work fine on Windows as well: str(o) -> str(o.as_posix())

jeremy · January 4, 2020, 11:55pm

It’s a good idea, but it assumes that o is a Path, which it might not be. This class is general - it can apply to anything that can be stringified.

It would however make sense to add this to places that use RegexLabeller.
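
A rough sketch of what that could look like at a call site, assuming the labeller itself keeps using plain `str(o)`. The helpers `posix_str` and `label_path` are hypothetical names for illustration and are not part of fastai.

```python
import re
from pathlib import PurePath, PureWindowsPath

class RegexLabeller():
    "Label `item` with regex `pat`."
    def __init__(self, pat, match=False):
        self.pat = re.compile(pat)
        self.matcher = self.pat.match if match else self.pat.search

    def __call__(self, o, **kwargs):
        res = self.matcher(str(o))  # stays general: anything stringifiable
        assert res,f'Failed to find "{self.pat}" in "{o}"'
        return res.group(1)

def posix_str(o):
    "Hypothetical helper: use forward slashes for paths, plain `str` otherwise."
    return o.as_posix() if isinstance(o, PurePath) else str(o)

def label_path(labeller, o):
    "Hypothetical call-site wrapper: normalise `o` before labelling it."
    return labeller(posix_str(o))

labeller = RegexLabeller(r'/([^/]+)_\d+\.jpg$')
win_label = label_path(labeller, PureWindowsPath(r'images\cat_1.jpg'))
str_label = label_path(labeller, 'images/dog_2.jpg')
```

This way the Windows-style path is matched by the same forward-slash pattern, and non-Path inputs still work.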

MohammedHasan · January 5, 2020, 11:01am

oh!! thanks for the clarification

dipam7 · February 4, 2020, 3:27pm

I just watched the first walk-thru at fast speed. I’m going to watch it again and code along. I am really interested in the L data structure that replaces lists. Is it possible to replicate it easily for other programs we may be working on?

jeremy · February 4, 2020, 7:01pm

L is available in fastcore now, so you can use it everywhere!

dipam7 · February 6, 2020, 9:11pm

Thanks. I used it a bit and found it really helpful. I thought other people should know about it. So here: https://medium.com/@dipam44/upgrading-python-lists-35440096ec36

jeremy · February 7, 2020, 4:41am

That’s extremely helpful. I tweeted it (but you don’t have your twitter handle in Medium so it didn’t at-mention you).