Fastai v2 chat

Generally the way I do things is to start with the simplest possible API, and a good set of tests, and then refactor it from there. Ideally, I think when we save a model that includes the optimizer (which is an option in v1, and we should probably do the same in v2) it should be possible to continue training it after loading it later.

Frankly, I never quite got that working well, so any help would be most welcome!
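
For reference, the bare-PyTorch mechanics of that idea look like this (a sketch, not fastai's API; the tricky part in fastai is everything around this, not the serialization itself):

import torch

model = torch.nn.Linear(2, 1)                      # stand-in for your real model
opt = torch.optim.SGD(model.parameters(), lr=0.1)  # stand-in for your real optimizer

# Save the weights and the optimizer state (momentum buffers etc.) together
torch.save({'model': model.state_dict(), 'opt': opt.state_dict()}, 'ckpt.pth')

# Later: restore both, so training can resume where it left off
ckpt = torch.load('ckpt.pth')
model.load_state_dict(ckpt['model'])
opt.load_state_dict(ckpt['opt'])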

EDIT: Please ignore this question! I am just confused :slight_smile: - just leaving it here in case anyone else is confused like me.

I’m just starting to look around and was wondering about this snippet (see below) from the [01_core](http://localhost:8888/notebooks/dev/01_core.ipynb) notebook.

Should the first functools.wraps be:

@functools.wraps(old_new)

snippet:

#export
class NewChkMeta(PrePostInitMeta):
    "Metaclass to avoid recreating object passed to constructor (plus all `PrePostInitMeta` functionality)"
    def __new__(cls, name, bases, dct):
        x = super().__new__(cls, name, bases, dct)
        old_init,old_new = x.__init__,x.__new__

        @functools.wraps(old_init)
        def _new(cls, x=None, *args, **kwargs):
            if x is not None and isinstance(x,cls):
                x._newchk = 1
                return x
            res = old_new(cls)
            res._newchk = 0
            return res

        @functools.wraps(old_init)
        def _init(self,*args,**kwargs):
            if self._newchk: return
            old_init(self, *args, **kwargs)

        x.__init__,x.__new__ = _init,_new
        return x
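
For context, here's what the metaclass gives you in practice (a toy demo, assuming the snippet above and its PrePostInitMeta dependency are in scope):

class T(metaclass=NewChkMeta):
    def __init__(self, o=None): self.o = o

t = T(1)
t2 = T(t)       # passing an existing T back to the constructor...
assert t2 is t  # ...returns the same object instead of re-creating it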

Thank you for the advice about the simple API → tests → refactoring workflow; I love it and practise in a similar fashion. :metal:
For v2’s model loading/saving mechanism, I will definitely try my best to contribute. :vulcan_salute:

Answering my own question: making the change I suggested breaks the tests immediately following, so clearly I need to better understand what is going on - please ignore.

My guess: the goal is to make __new__ look like a constructor (or rather, have the same signature as the __init__ constructor) instead of an allocator. This probably makes inspection in notebooks more natural - especially for objects that look like functions.

@313V I’ve found that defining __new__ seems to break the signature of the class and all its subclasses - it uses the signature of __new__ instead of __init__, although the former is generally *args,**kwargs or something similar. So there are a couple of places where I work around that problem. I don’t know if there’s a better solution - I haven’t found anything online.
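
Here's a toy illustration of the problem in plain Python (nothing fastai-specific; this is what the 3.6/3.7 interpreters we're on report):

import inspect

class WithoutNew:
    def __init__(self, x, y=1): self.x,self.y = x,y

class WithNew:
    def __new__(cls, *args, **kwargs): return super().__new__(cls)
    def __init__(self, x, y=1): self.x,self.y = x,y

print(inspect.signature(WithoutNew))  # (x, y=1)
print(inspect.signature(WithNew))     # (*args, **kwargs) - __new__ hides __init__

Since inspect.signature follows __wrapped__, that's presumably why the snippet above wraps _new with old_init's metadata rather than old_new's: it keeps the class reporting the constructor's real signature.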

I’ve added an index of useful topics to the FAQ

I’ve removed the Transform feature that uses the return type annotation to automatically cast the result to the return type. We used to need it to avoid problems with unwanted type casts, but we’ve now made it so it’s unnecessary. So if you want to cast o to type T in your transform, just use T(o) (assuming that T.__init__ works that way).

There is still one place where the return type annotation is used in Transform: use the return annotation None to specify that you want to disable any casting to a subclass in your transform. (Note that I don’t think we’ve ever actually needed this yet in fastai - it’s just there “in case”, so you probably don’t need to know about it.)
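
For example, something along these lines (a sketch; the Transform import path moved around between the dev notebooks, so adjust the import to your checkout):

# from local.data.pipeline import Transform  # exact path depends on your checkout

class MyFloat(float): pass

class Halve(Transform):
    def encodes(self, o): return MyFloat(o/2)  # explicit cast with T(o), per the new convention

class HalveNoCast(Transform):
    def encodes(self, o) -> None: return o/2   # `-> None` opts out of the cast back to the input's subclass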

I don’t know if this has been discussed, but I remember that @jeremy mentioned in the first walk-thru that fastai v2 did not have a deadline, unlike previous versions, which had to be ready for the course. Is there not a fastai course this fall? If there is, does fastai v2 have to be done by then? Or is there something I misunderstood?

There is not. There should be one in March 2020.

Sounds good. Thanks for the clarification!

@jeremy/@sgugger mentioned in the blog/course/forum that everything in v2 will be in a notebook, and every feature or pull request will have a notebook to demonstrate and test it.

So this is great in terms of showing how things should work.

But the knowledge about common yet wrong ways of doing things, anything from passing the wrong data type to a function up to high-level bad practices, usually lives in forum posts and not in the code itself. (This is not specific to fastai; it’s true of most projects.)

Is there a plan to incorporate “extremely verbose exceptions”, ones that don’t just say “invalid data type”, but actually suggest what is probably the right one and give references and an explanation of the situation?

Whenever I use a library, get an exception, and then find such an explanation inside it, it’s a huge time saver and a relief. On the other hand, getting a super cryptic exception from a library, with equally cryptic documentation, only to search and find a StackOverflow question with a million participants because everyone gets this error, is quite frustrating. Pretty sure there are such posts on these forums as well. So let’s reduce frustration.
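
For instance, something like this (an illustrative sketch of the idea; the function name and message are made up, not fastai code):

import pandas as pd

def process_df(df):
    if not isinstance(df, pd.DataFrame):
        raise TypeError(
            f"Expected a pandas DataFrame but got {type(df).__name__}. "
            "If you have a path to a CSV file, load it first with "
            "pd.read_csv(path) and pass the result in; see the tabular "
            "docs for a worked example.")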

Thanks :slight_smile:

I think that L should be a library on its own. I would like to use it elsewhere but not have to install the whole of fastai as a dependency. Are you planning on making it its own pip package?

Just a cross-reference: a similar issue has been addressed in Fastai v2 code walk-thru 5

WSL 2 worked fine for me - I loaded Ubuntu 18.04 LTS from the Windows Store and installed Anaconda; then the fastai environment could be built, and Jupyter notebooks can be used from the Windows browser. I did see an issue with the CUDA calls, but that’s expected as WSL 2 has no GPU access.

So first of all I just want to say that, after watching the last code walkthrough, the use of Python metaclasses is brilliant! I had no idea Python was this flexible!
Also, I would like to ask: do you plan for fastai v2 to support pruning weights or conv filters? I’d be interested in helping to implement pruning techniques.
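
For reference, here's a minimal sketch of the kind of technique I mean, in plain PyTorch (torch.nn.utils.prune ships with recent PyTorch releases; this is not a fastai API):

import torch
import torch.nn.utils.prune as prune

conv = torch.nn.Conv2d(3, 16, kernel_size=3)
# Unstructured: zero out the 30% of weights with the smallest magnitude
prune.l1_unstructured(conv, name='weight', amount=0.3)
# Structured: prune whole output filters, ranked by their L2 norm
prune.ln_structured(conv, name='weight', amount=0.25, n=2, dim=0)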

I haven’t done any work on pruning. I’d be happy to consider contributions in that area, although I’d suggest first doing it in a separate repo and we could see whether closer integration would be useful later.

I noticed that there isn’t a learning rate finder yet in v2. I would like to contribute one if no one else is already working on it. Also, is there a general todo list, or should people just start contributing whatever they see missing?

There are a bunch of TODOs.
Find them with ack "#TODO".

Here is the current output:

19:21 $ ack "#TODO" *.ipynb
03_data_pipeline.ipynb
597:    "#TODO: method examples"
1170:    "#TODO: do something here\n",

05_data_core.ipynb
1478:    "#TODO: make the above check a proper test"

07_vision_core.ipynb
45:    "#TODO: investigate"
248:    "#TODO function to resize_max all images in a path (optionally recursively) and save them somewhere (same relative dirs if recursive)"
637:    "#TODO explain and/or simplify this\n",
942:    "#TODO: Transform on a whole tuple lose types, see if we can simplify that?\n",
1069:    "#TODO tests\n",

09_vision_augment.ipynb
786:    "#TODO test"
869:    "#TODO: test"

19_callback_mixup.ipynb
115:    "#TODO: make less ugly\n",

30_text_core.ipynb
885:    "#TODO: test + rework\n",
940:    "class SentencePieceTokenizer():#TODO: pass the special tokens symbol to sp\n",

_42_tabular_rapids.ipynb
159:    "#TODO Categorical\n",

50_data_block.ipynb
243:    "#TODO: access vocab\n",
758:    "    #TODO: dupe code from DataBlock.databunch and DataSource.databunch to refactor\n",

92_notebook_showdoc.ipynb
711:    "    link = get_source_link(elt) #TODO: use get_source_link when it works\n",

_tabular_fast.ipynb
648:    "        df[n] = df[n].fillna(df[n].mean()) #TODO: request median\n",

The learning rate finder is defined in notebook 14 with the schedulers.
As for contributions, the most helpful for us now is trying to use the library and point us to bugs/things that are unclear or behave weirdly. The TODO notes are mostly for Jeremy and me and may not be very understandable.
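
Usage should look the same as in v1 (a sketch, assuming notebook 14 is exported in your build and learn is a Learner):

learn.lr_find()  # run the LR range test and pick a learning rate from the plot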

Thanks! I am trying to use the library and didn’t find the LRFinder, because it is not in the docs. Do the docs just need a rebuild?

Just so I understand this correctly: Jeremy and you are mainly looking for feedback, but not contributions (which is absolutely fine, just want to make sure I understand correctly)?