Fastai v2 code walk-thru 5

(Jeremy Howard (Admin)) #1

Please edit this wiki topic :)

Fastai v2 daily code walk-thrus
Fastai v2 chat

Sep 10, 2019

Digging Deep in Transforms

  • 02_data_transforms.ipynb notebook
  • the DataBlock API in v2 no longer depends on the order of calls, so the ordering mistakes that were possible in v1 can’t happen
  • filt tells a transform which set it is applied to
  • Transform creates a callable object; after creating it, you call it and pass it some data
  • _TfmMeta is the metaclass of Transform
  • ... (Ellipsis) can be used as a do-nothing body in Python, like pass
  • Foo2 = type('Foo2', (object,), {}) will create a class called Foo2 (the bases argument must be a tuple, hence the trailing comma)
  • type is the class that constructs classes
  • the reason to use metaclasses is to change how Python works under the hood
  • a metaclass is a way to build a class with some constructor other than type; fastai v2 uses one for Transform
  • a metaclass’s __new__ runs every time a class is created (not when objects are created); its __call__ runs when the class itself is called
  • Highly recommended reading: 3. Data model (Python 3.7.4 documentation)
  • a class’s namespace is stored in its __dict__ object
  • _TfmMeta uses _TfmDict instead of a plain dict as the namespace dict
  • _TfmDict behaves like a normal dict except when the key being set is 'encodes' or 'decodes'
  • TypeDispatch allows the same function to work differently for different types
  • Python method dispatching can be single or multiple
  • Multiple Dispatching — Python 3 Patterns, Recipes and Idioms
  • Method dispatching in Python
  • PEP 443 — Single-dispatch generic functions | Python.org
  • TypeDispatch has a dict funcs whose keys are types and whose values are the functions to call (NICE!)
  • _p1_anno grabs the annotation of the first annotated parameter
  • how can a Transform subclass be used as a decorator? Doing so lets you dynamically add functionality to a given transform later, outside the class definition.
  • To be used as a decorator, the class needs to be callable in a way that goes beyond __init__, so __call__() is reimplemented. If the argument is a callable named encodes, decodes or _, it is added to the class’s methods.
  • Shift + Tab autocomplete should show a sensible signature, so the signature is customized in __new__() for Transform
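
To make the metaclass notes above concrete, here is a minimal sketch in plain Python. All names (CollectingDict, CollectMeta, Demo, created) are made up for illustration and this is not fastai's code: it only assumes that a metaclass can hand Python a custom namespace dict via __prepare__, and it collects repeated encodes definitions in a plain list where the real library uses TypeDispatch.

```python
# Hypothetical names throughout; this is not fastai's implementation.
created = []  # records every class built by the metaclass

class CollectingDict(dict):
    """Namespace dict that gathers repeated `encodes` definitions in a list."""
    def __setitem__(self, key, value):
        if key == 'encodes' and callable(value):
            if key not in self: super().__setitem__(key, [])
            self[key].append(value)
        else:
            super().__setitem__(key, value)

class CollectMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # Python calls this first to obtain the namespace for the class body
        return CollectingDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        # Runs once per class definition, not once per instance
        created.append(name)
        return super().__new__(mcls, name, bases, dict(namespace))

class Demo(metaclass=CollectMeta):
    def encodes(self, x: int): return x + 1
    def encodes(self, x: str): return x.upper()  # does not overwrite the first one

# The three-argument form of type builds a class without a class statement
Foo2 = type('Foo2', (object,), {})

class Foo3: ...  # `...` works as a do-nothing body, like `pass`

Demo(); Demo()  # creating instances does not run CollectMeta.__new__ again
print(created, len(Demo.encodes), type(Foo2), type(Demo))
```

With a plain dict as the namespace, the second encodes would silently replace the first; the custom dict is what lets both definitions survive.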
0 Likes

(Maxim Pechyonkin) #2

These are my notes in random order from today:

7 Likes

(Mike Tian-Jian Jiang) #3

Jeremy has also mentioned this issue briefly here: Fastai v2 chat

AFAIK, this is because if __new__() is defined, it always runs before __init__(), and when a new subclass is created, the signature still belongs to its superclass if the subclass doesn’t define its own __new__().

class A():
    def __new__(cls, *args, **kwargs): return super().__new__(cls, *args, **kwargs)

class B(A):
    def __init__(self, a): self.a = a

B(a=1)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<some_obj_hash> in <module>
----> 1 B(a=1)

<another_obj_hash> in __new__(cls, *args, **kwargs)
      1 class A():
----> 2     def __new__(cls, *args, **kwargs): return super().__new__(cls, *args, **kwargs)

TypeError: object.__new__() takes exactly one argument (the type to instantiate)

The reasoning behind this is often attributed to the Liskov substitution principle, even though Python favors duck typing. This approach also seems more convenient for Python core developers when dealing with the MRO (Method Resolution Order) and/or the underlying C functions.

To imitate Jeremy’s solution with a non-meta superclass like A here, one way is to change subclass B verbosely:

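import inspect

class B(A):
    def __new__(cls, a, *args, **kwargs):
        # `a` is consumed here, so only the remaining arguments reach A.__new__
        res = super().__new__(cls, *args, **kwargs)
        # expose the signature of __init__ on the new instance
        res.__signature__ = inspect.signature(res.__init__)
        return res
    def __init__(self, a): self.a = a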

Not only is it annoying and error-prone to change subclasses, but B's signature will also have three arguments (a, *args, **kwargs) rather than just a. So this is probably another reason to have a meta superclass and change the signature in it.
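
For comparison, here is a rough sketch of the metaclass route, so that subclasses don't need to be touched. SigMeta is a made-up name and this is only my guess at the general idea, not the library's actual code. Note that because cls.__init__ is looked up on the class, the reported signature includes self.

```python
import inspect

class SigMeta(type):
    """Hypothetical metaclass: copy each class's __init__ signature onto the class."""
    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        # inspect.signature() returns an object's __signature__ when it is set
        cls.__signature__ = inspect.signature(cls.__init__)
        return cls

class A(metaclass=SigMeta):
    def __init__(self): pass

class B(A):  # no __new__ override needed in the subclass
    def __init__(self, a): self.a = a

print(inspect.signature(A))
print(inspect.signature(B))
b = B(a=1)  # no TypeError, since nothing forwards `a` to object.__new__
```

The subclass only defines __init__, and the metaclass picks up the new signature automatically each time a class is created.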

1 Like

(Aman Arora) #4

In today’s walkthrough, Jeremy mentioned that if we don’t pass enc or dec, they will already have been created for us.
Anybody figure out how yet?

He mentioned this here

I think this has to do with TypeDispatch?

0 Likes

Underlying `Transforms` machinery and `MetaClasses`
(Aman Arora) #6

Is there anyone else who is interested in learning exactly how transforms work in V2? If so, maybe we can set up a call?

0 Likes

(Vijay Narayanan Parakimeethal) #8

Hi Aman, I did a bit of debugging on Transform and posted something as a follow-up to my earlier query in the code walk-thru forum. You can find it here.

As I understand it, the answer lies in _TfmMeta, whose __call__ sets cls.encodes and cls.decodes when the class doesn’t have them yet, so they exist even if you don’t pass an enc or dec.

%%debug
bg = Transform()

If you follow the debugger along, you will see that when the __call__ of _TfmMeta is entered, cls (which is the Transform class) has no attribute called encodes or decodes. The call then sets cls.encodes and cls.decodes to TypeDispatch(). Hopefully this clarifies things.

NOTE: Enter 'c' at the ipdb>  prompt to continue execution.
None
> <string>(2)<module>()

ipdb>  s
--Call--
> /Users/i077725/Documents/GitHub/fastai_dev/dev/local/data/transform.py(122)__call__()
    120         return res
    121 
--> 122     def __call__(cls, *args, **kwargs):
    123         f = args[0] if args else None
    124         n = getattr(f,'__name__',None)

ipdb>  self
*** NameError: name 'self' is not defined
ipdb>  cls
<class 'local.data.transform.Transform'>
ipdb>  cls.encodes
*** AttributeError: type object 'Transform' has no attribute 'encodes'
ipdb>  cls.decodes
*** AttributeError: type object 'Transform' has no attribute 'decodes'
ipdb>  args
cls = <class 'local.data.transform.Transform'>
args = ()
kwargs = {}
ipdb>  n
> /Users/i077725/Documents/GitHub/fastai_dev/dev/local/data/transform.py(123)__call__()
    121 
    122     def __call__(cls, *args, **kwargs):
--> 123         f = args[0] if args else None
    124         n = getattr(f,'__name__',None)
    125         if not hasattr(cls,'encodes'): cls.encodes=TypeDispatch()

ipdb>  n
> /Users/i077725/Documents/GitHub/fastai_dev/dev/local/data/transform.py(124)__call__()
    122     def __call__(cls, *args, **kwargs):
    123         f = args[0] if args else None
--> 124         n = getattr(f,'__name__',None)
    125         if not hasattr(cls,'encodes'): cls.encodes=TypeDispatch()
    126         if not hasattr(cls,'decodes'): cls.decodes=TypeDispatch()

ipdb>  f
ipdb>  getattr(f,'__name__',None)
ipdb>  n
> /Users/i077725/Documents/GitHub/fastai_dev/dev/local/data/transform.py(125)__call__()
    123         f = args[0] if args else None
    124         n = getattr(f,'__name__',None)
--> 125         if not hasattr(cls,'encodes'): cls.encodes=TypeDispatch()
    126         if not hasattr(cls,'decodes'): cls.decodes=TypeDispatch()
    127         if isinstance(f,Callable) and n in ('decodes','encodes','_'):

ipdb>  hasattr(cls,'encodes')
False
ipdb>  n
> /Users/i077725/Documents/GitHub/fastai_dev/dev/local/data/transform.py(126)__call__()
    124         n = getattr(f,'__name__',None)
    125         if not hasattr(cls,'encodes'): cls.encodes=TypeDispatch()
--> 126         if not hasattr(cls,'decodes'): cls.decodes=TypeDispatch()
    127         if isinstance(f,Callable) and n in ('decodes','encodes','_'):
    128             getattr(cls,'encodes' if n=='_' else n).add(f)

ipdb>  self.encodes
*** NameError: name 'self' is not defined
ipdb>  cls.encodes
{}
ipdb>  hasattr(cls,'decodes')
False
ipdb>  n
> /Users/i077725/Documents/GitHub/fastai_dev/dev/local/data/transform.py(127)__call__()
    125         if not hasattr(cls,'encodes'): cls.encodes=TypeDispatch()
    126         if not hasattr(cls,'decodes'): cls.decodes=TypeDispatch()
--> 127         if isinstance(f,Callable) and n in ('decodes','encodes','_'):
    128             getattr(cls,'encodes' if n=='_' else n).add(f)
    129             return f

ipdb>  cls.decodes
{}
ipdb>  n
> /Users/i077725/Documents/GitHub/fastai_dev/dev/local/data/transform.py(130)__call__()
    128             getattr(cls,'encodes' if n=='_' else n).add(f)
    129             return f
--> 130         return super().__call__(*args, **kwargs)
    131 
    132     @classmethod

ipdb>  super
<class 'super'>
ipdb>  s
--Call--
> /Users/i077725/Documents/GitHub/fastai_dev/dev/local/data/transform.py(139)__init__()
    137     "Delegates (`__call__`,`decode`) to (`encodes`,`decodes`) if `filt` matches"
    138     filt,init_enc,as_item_force,as_item,order = None,False,None,True,0
--> 139     def __init__(self, enc=None, dec=None, filt=None, as_item=False):
    140         self.filt,self.as_item = ifnone(filt, self.filt),as_item
    141         self.init_enc = enc or dec

ipdb>  self
Transform: True {} {}
ipdb>  enc
ipdb>  dec
ipdb>  as_item
False
ipdb>  self.as_item
True
ipdb>  n
> /Users/i077725/Documents/GitHub/fastai_dev/dev/local/data/transform.py(140)__init__()
    138     filt,init_enc,as_item_force,as_item,order = None,False,None,True,0
    139     def __init__(self, enc=None, dec=None, filt=None, as_item=False):
--> 140         self.filt,self.as_item = ifnone(filt, self.filt),as_item
    141         self.init_enc = enc or dec
    142         if not self.init_enc: return

ipdb>  n
> /Users/i077725/Documents/GitHub/fastai_dev/dev/local/data/transform.py(141)__init__()
    139     def __init__(self, enc=None, dec=None, filt=None, as_item=False):
    140         self.filt,self.as_item = ifnone(filt, self.filt),as_item
--> 141         self.init_enc = enc or dec
    142         if not self.init_enc: return
    143 

ipdb>  n
> /Users/i077725/Documents/GitHub/fastai_dev/dev/local/data/transform.py(142)__init__()
    140         self.filt,self.as_item = ifnone(filt, self.filt),as_item
    141         self.init_enc = enc or dec
--> 142         if not self.init_enc: return
    143 
    144         # Passing enc/dec, so need to remove (base) class level enc/dec

ipdb>  self.init_enc
ipdb>  not self.init_enc
True
ipdb>  bool(self.init_enc)
False
ipdb>  n
--Return--
None
> /Users/i077725/Documents/GitHub/fastai_dev/dev/local/data/transform.py(142)__init__()
    140         self.filt,self.as_item = ifnone(filt, self.filt),as_item
    141         self.init_enc = enc or dec
--> 142         if not self.init_enc: return
    143 
    144         # Passing enc/dec, so need to remove (base) class level enc/dec

ipdb>  n
--Return--
Transform: False {} {}
> /Users/i077725/Documents/GitHub/fastai_dev/dev/local/data/transform.py(130)__call__()
    128             getattr(cls,'encodes' if n=='_' else n).add(f)
    129             return f
--> 130         return super().__call__(*args, **kwargs)
    131 
    132     @classmethod

ipdb>  n
--Return--
None
> <string>(2)<module>()

ipdb>  n
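
The flow traced in the debugger session above can be sketched roughly as follows. This is a simplified stand-in, not the library code: RegisterMeta and Tfm are hypothetical names, and plain lists take the place of TypeDispatch.

```python
class RegisterMeta(type):
    """Hypothetical metaclass whose __call__ doubles as a registration decorator."""
    def __call__(cls, *args, **kwargs):
        f = args[0] if args else None
        name = getattr(f, '__name__', None)
        # Create the class-level registries on first use (lists stand in for TypeDispatch)
        if not hasattr(cls, 'encodes'): cls.encodes = []
        if not hasattr(cls, 'decodes'): cls.decodes = []
        if callable(f) and name in ('encodes', 'decodes'):
            getattr(cls, name).append(f)   # decorator use: register and hand back f
            return f
        return super().__call__(*args, **kwargs)  # normal use: build an instance

class Tfm(metaclass=RegisterMeta):
    def __init__(self, scale=1): self.scale = scale

@Tfm                      # Tfm(encodes) registers the function instead of building a Tfm
def encodes(x: int): return x + 1

t = Tfm(scale=2)          # a normal call still constructs an instance
print(Tfm.encodes, Tfm.decodes, t.scale)
```

Because the check happens in the metaclass's __call__, the registries exist after the first call even when no enc or dec is ever passed.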
0 Likes

(Vijay Narayanan Parakimeethal) #9

TypeDispatch

I did a bit of checking on how TypeDispatch() works. I used the flip_img example from code walkthrough 4, where we set it up to work for the MyTensorImage type.

Then I ran %%debug while creating an instance of Transform() that calls the flip_img function only on the MyTensorImage type.

%%debug
flip_t = Transform(flip_img)

How Transform wires up flip_img so that it is invoked via encodes when you call flip_t() is explained with a similar example here. In this post I will explain how TypeDispatch() works.

Let us go to the __init__ of Transform, which is called by the __call__ of _TfmMeta. Here enc is the flip_img function.

def __init__(self, enc=None, dec=None, filt=None, as_item=False):
    self.filt,self.as_item = ifnone(filt, self.filt),as_item
    self.init_enc = enc or dec
    if not self.init_enc: return

    # Passing enc/dec, so need to remove (base) class level enc/dec
    del(self.__class__.encodes,self.__class__.decodes)
    self.encodes,self.decodes = (TypeDispatch(),TypeDispatch())
    if enc:
        self.encodes.add(enc)
        self.order = getattr(self.encodes,'order',self.order)
    if dec: self.decodes.add(dec)

Let’s specifically focus on

# Passing enc/dec, so need to remove (base) class level enc/dec
del(self.__class__.encodes,self.__class__.decodes)
self.encodes,self.decodes = (TypeDispatch(),TypeDispatch())
if enc:
    self.encodes.add(enc)

Here the class-level encodes and decodes of Transform are deleted in the first line, and new TypeDispatch() objects are created on the instance in the second line. The __init__ of TypeDispatch() takes any number of functions and adds them to the attribute self.funcs. When no functions are passed, self.funcs is an empty dictionary. self.cache is also an empty dictionary, as there are no functions at init time here.

def __init__(self, *funcs):
    self.funcs,self.cache = {},{}
    for f in funcs: self.add(f)
    self.inst = None

Let’s now look at these lines in the __init__ of Transform.

if enc:
    self.encodes.add(enc)

We know that enc is the flip_img function. It is now added to self.encodes via self.encodes.add(enc). There is no add function in Transform; it is defined in TypeDispatch().

def add(self, f):
    "Add type `t` and function `f`"
    self.funcs[_p1_anno(f) or object] = f
    self._reset()

Here _p1_anno(flip_img) returns <class '__main__.MyTensorImage'>. Therefore we have

self.funcs[<class '__main__.MyTensorImage'>] = <function flip_img at 0x1302d6dd0>

The result looks like this:

ipdb>  self.funcs
{<class '__main__.MyTensorImage'>: <function flip_img at 0x1302d6dd0>}

self._reset() calls the _reset function of TypeDispatch(). It rebuilds self.funcs with its keys sorted in reverse order, using cmp_instance as the sort key, and then copies all of the entries into self.cache as well.

def _reset(self):
    self.funcs = {k:self.funcs[k] for k in sorted(self.funcs, key=cmp_instance, reverse=True)}
    self.cache = {**self.funcs}

The debugger output on each line of this code is shown below.

From here on, the rest of the code in the __init__ of Transform runs. But this gives a good look at how TypeDispatch functions.
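
To tie the registration steps in this post together, here is a small stand-alone sketch of the same idea. MiniDispatch is a made-up name and this is not the real TypeDispatch: in particular, it walks the MRO at lookup time instead of pre-sorting the keys and filling a cache, which is my own simplification.

```python
from typing import get_type_hints

class MiniDispatch:
    """Hypothetical, simplified dispatcher keyed by first-parameter annotation."""
    def __init__(self, *funcs):
        self.funcs = {}
        for f in funcs: self.add(f)

    def add(self, f):
        # Key the function by its first annotated parameter, falling back to object
        hints = get_type_hints(f)
        key = next((t for name, t in hints.items() if name != 'return'), object)
        self.funcs[key] = f

    def __getitem__(self, t):
        # The most specific registered type wins: check t, then its base classes
        for base in t.__mro__:
            if base in self.funcs: return self.funcs[base]
        return None

    def __call__(self, x, *args, **kwargs):
        f = self[type(x)]
        # Leave the value untouched when nothing is registered for its type
        return x if f is None else f(x, *args, **kwargs)

def double_int(x: int): return x * 2
def shout(x: str): return x.upper()

d = MiniDispatch(double_int, shout)
print(d(3), d("hi"), d(2.5))
```

Here d(3) goes to double_int, d("hi") goes to shout, and the float passes through unchanged because nothing is registered for float or any of its bases.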

0 Likes

(Aman Arora) #10
0 Likes

(Michael) #11

This is also a nice visual explanation of the python data model:
https://delapuente.github.io/presentations/python-datamodel/#/

2 Likes

(Konstantin Dorichev) #12

To be exact, _p1_anno() returns the first of one or more parameters with type annotation.
Test case with 2nd parameter annotated:

def _f(a, b:str)->float: pass
test_eq(_p1_anno(_f), str)

I have also added a test case to indicate that if two parameters are annotated, the first will be returned:
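
As an illustration of the behaviour described here, this is a stand-alone sketch. first_param_anno is a hypothetical stand-in written for this post, not the library's _p1_anno, and returning None when nothing is annotated is my assumption.

```python
from typing import get_type_hints

def first_param_anno(f):
    """Return the annotation of the first annotated parameter, or None (assumed)."""
    hints = get_type_hints(f)
    # Skip the return annotation; parameter hints keep their definition order
    annos = [t for name, t in hints.items() if name != 'return']
    return annos[0] if annos else None

def only_second(a, b: str) -> float: pass
def both(a: int, b: str) -> float: pass
def none_annotated(a, b): pass

print(first_param_anno(only_second), first_param_anno(both), first_param_anno(none_annotated))
```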

1 Like

(Konstantin Dorichev) #13

Thanks for your notes, Maxim. If you’d like contributions, you may want to copy/move them into the top post, which is a wiki.

1 Like

(Maxim Pechyonkin) #14

Sorry, I am not very experienced with wiki posts. I moved the notes to the top post. I think now anyone can edit to clarify things. I am pretty sure there are a lot of things I did not get right, so clarifications/corrections are most welcome.

2 Likes