Fastai v2 chat

Thanks for the info. I don’t think I have seen the new notebook structure. Is there a link that explains it, or an example?

The entire v2 library is built by exporting notebooks. Right now I’m going through to find areas that need better explanation or updating as new developments happen within the library, but 08_pets_tutorial is a great example:

Note some bits do still need updating, but the library is being updated as we speak, so things are bound to change rapidly for the time being (though hopefully that will slow down soon enough).

You can then also see this notebook:

to see where the source code originated and how it’s tested

As well as the data core notebook, which goes into the processing for the data block

Heads up for v2: I think fastai v1 reverses shape when it reads run-length encoded (RLE) segmentation masks. I.e. (x,y) instead of (height,width).

Hmm, just realized this is probably intentional (vision.image.open_mask_rle)… hold on: does training a model on RLE-masks work in fastai v1? If the datapath is:

input RLE → convert to input tensor → Model→ output tensor → convert to RLE

then the model won’t get the correct tensors unless the dataloader knows to reverse the masks’ height & width. The most I could find was how to work with regular masks (course v3: lesson3-camvid). SegmentationItemList has a .label_from_func method → so is the correct way to run fastai v1 on RLE masks to either:

  1. decode all RLEs and save tensors to a folder?
  2. or write a function to correctly decode the RLE into its mask tensor?

Any thoughts then on how fastai_v2 should decode RLE masks? (width, height) vs (height, width). I’m happy to write it w/ an argument to do either, but maybe that’s getting ahead of things a bit. Guessing that’d be right next to Segmentation masks in 07_vision_core.


Here’s what I’m talking about:

This came up during debugging in an RLE-mask intensive kaggle competition. Learning about masks has been … a journey :sweat_smile:.
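For concreteness, here’s a minimal sketch of the kind of decoder I mean (`rle_decode` is a hypothetical helper, not a fastai function; it assumes Kaggle-style column-major, 1-indexed RLE, and the `shape` argument is where the (height, width) vs (width, height) choice would live):

```python
import numpy as np

def rle_decode(rle, shape):
    """Decode a run-length encoding string into a 2D uint8 mask.
    `shape` is (height, width); pass it reversed to reproduce the
    (width, height) reading discussed above."""
    h, w = shape
    mask = np.zeros(h * w, dtype=np.uint8)
    nums = [int(x) for x in rle.split()]
    starts, lengths = nums[0::2], nums[1::2]
    for s, l in zip(starts, lengths):
        mask[s - 1:s - 1 + l] = 1      # RLE positions are 1-indexed
    # Kaggle-style RLE runs are column-major, hence Fortran order
    return mask.reshape((h, w), order='F')

m = rle_decode('1 3', (4, 2))          # 3 pixels down the first column
assert m.shape == (4, 2)
assert m[:, 0].sum() == 3
```

With the same RLE string, `rle_decode('1 3', (2, 4))` gives a transposed reading, which is exactly the ambiguity above.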

One thing I understood about the GetAttr class is that it filters out the methods of the default class, except attributes starting with _ and any attributes specified in _xtra. I just wanted to know why someone would remove existing functionality that way. For example, in the notebook, the test case fails to call the upper method, which is not in _xtra. I’d like to understand the need for it. I believe removing some functionality of the base class could make it less hackable. Is that the intent?
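To make sure I’m reading it right, here’s a simplified stand-in for the behavior I mean (not fastai’s actual class, though `_xtra` and `default` mirror its names):

```python
class GetAttrSketch:
    """Delegates only whitelisted attribute lookups to self.default.
    __getattr__ fires only for names not found normally, and this
    version forwards only names listed in _xtra."""
    _xtra = ['lower']                      # only these are delegated
    def __init__(self, default): self.default = default
    def __getattr__(self, k):
        if k in self._xtra: return getattr(self.default, k)
        raise AttributeError(k)

o = GetAttrSketch('Hello')
assert o.lower() == 'hello'                # whitelisted, so delegated
try:
    o.upper()                              # not in _xtra, so it raises
    raised = False
except AttributeError:
    raised = True
assert raised
```

So `upper` exists on the wrapped str, but the filter deliberately hides it — that’s the design choice I’m asking about.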

I see _wif(worker_id) in 01a takes worker_id as an argument; however, it is not used. I believe it has to do with making the PyTorch results deterministic. Should we add the value of worker_id to info.seed and pass it to the set_seed method?

Has anyone checked out PyTorch Lightning? They are growing super fast, and they claim that “Lightning sets up all the boilerplate state-of-the-art training for you so you can focus on the research.”
Maybe we have something to get from them?

3 Likes

Inside 00_test.ipynb, can someone please tell me why we do

class T(pd.Series): pass
test_eq(df1.iloc[0], T(df2.iloc[0]))

And not,

test_eq(df1.iloc[0], pd.Series(df2.iloc[0]))

Thanks :slight_smile:

I am not saying that the way it’s written is wrong but just want to understand if this is the expected way to write tests. I am pretty new to the community so I learn a lot from fastai :slight_smile:

1 Like

In the latter, both are of the same type (Series), whereas the former creates a subclass of Series and compares. So that check might be there to ensure test_eq works on superclass/subclass objects as well.

2 Likes

That makes absolute sense! Thank you :slight_smile:

I am getting the following error when I try to convert the LettersDL() iterator to L in the 01a notebook:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input> in <module>
      2     def create_batches(self, samps): return (string.ascii_lowercase[i:i+4] for i in range(0,26,4))
      3
----> 4 test_eq(L(LettersDL()), 'abcd,efgh,ijkl,mnop,qrst,uvwx,yz'.split(','))

~/ML/Fast-AI/fastai_dev/dev/local/core.py in _init(self, *args, **kwargs)
     61     def _init(self,*args,**kwargs):
     62         if self._newchk: return
---> 63         old_init(self, *args, **kwargs)
     64
     65     x.__init__,x.__new__ = _init,_new

~/ML/Fast-AI/fastai_dev/dev/local/core.py in _init(self, *args, **kwargs)
     34     def _init(self,*args,**kwargs):
     35         self.pre_init()
---> 36         old_init(self, *args,**kwargs)
     37         self.post_init()
     38     setattr(x, '__init__', _init)

~/ML/Fast-AI/fastai_dev/dev/local/core.py in __init__(self, items, use_list, match, rest)
    242         if items is None: items = []
    243         if (use_list is not None) or not isinstance(items,(Tensor,ndarray,pd.DataFrame,pd.Series)):
--> 244             items = list(items) if use_list else _listify(items)
    245         if match is not None:
    246             if len(items)==1: items = items*len(match)

~/ML/Fast-AI/fastai_dev/dev/local/core.py in _listify(o)
    218     if isinstance(o, list): return o
    219     if isinstance(o, (str,np.ndarray,Tensor)): return [o]
--> 220     if is_iter(o): return list(o)
    221     return [o]
    222

<ipython-input> in __iter__(self)
     14         assert not kwargs and not (bs is None and drop_last)
     15
---> 16     def __iter__(self): return _loaders[self.fake_l.num_workers==0](self.fake_l)
     17
     18     def __len__(self):

~/anaconda3/envs/fastai/lib/python3.7/site-packages/torch/utils/data/dataloader.py in __init__(self, loader)
    335 class _SingleProcessDataLoaderIter(_BaseDataLoaderIter):
    336     def __init__(self, loader):
--> 337         super(_SingleProcessDataLoaderIter, self).__init__(loader)
    338         assert self._timeout == 0
    339         assert self._num_workers == 0

~/anaconda3/envs/fastai/lib/python3.7/site-packages/torch/utils/data/dataloader.py in __init__(self, loader)
    301     def __init__(self, loader):
    302         self._dataset = loader.dataset
--> 303         self._dataset_kind = loader._dataset_kind
    304         self._auto_collation = loader._auto_collation
    305         self._drop_last = loader.drop_last

~/ML/Fast-AI/fastai_dev/dev/local/core.py in __getattr__(self, k)
    193     def __getattr__(self,k):
    194         if k in self._xtra: return getattr(self.default, k)
--> 195         raise AttributeError(k)
    196     def __dir__(self): return custom_dir(self, self._xtra)
    197

AttributeError: _dataset_kind

Can someone help?

Guessing you don’t have PyTorch v1.2 installed; can you check your version?

My version is 1.3.0a0+60f6cc9. I built it from the PyTorch git master a couple of days ago.

I’ll be doing daily online code walk-thrus of fastai v2 each day from tomorrow. Details here:

8 Likes

A note for Colab users who want to dev: there’s really no good way to run it on Windows and have it upload. I was having git permission issues with Git Bash and the nb-strip script. If anyone familiar with git on Windows could help figure out why, here’s the output:

C:\Users\muell\Documents\fastai_dev>git add *
python tools\\fastai-nbstripout -d: /c/Users/muell/AppData/Local/Microsoft/WindowsApps/python: Permission denied
error: external filter 'python tools\\fastai-nbstripout -d' failed 126
error: external filter 'python tools\\fastai-nbstripout -d' failed
fatal: dev/04_data_external.ipynb: clean filter 'fastai-nbstripout-docs' failed

For now I’m quickly running it on Paperspace after finalizing my modifications on Colab…

This also occurred when I ran it as administrator.

1 Like

Looks like it might be related to the new support for Python in Windows, where it takes you to the Windows Store to install Python. Check this and maybe try disabling the App execution alias, or check that the correct Python folder comes first on your PATH.

@muellerzr there’s not many things that need GPU so far in the NBs, so another option would be to just run on WSL directly.

2 Likes

Thanks Jeremy! I will try that later today! :slight_smile:

1 Like

I had a couple questions after listening to the first fastai v2 walk-thru:

  1. Why do we have functions that return functions, like RandomSplitter? Wouldn’t it be more intuitive to have a class that can be called? Like in PyTorch, where nn.Modules can be called via the forward method.

  2. In terms of the special list behavior, you demonstrated interesting behavior like adding and multiplying the lists.

For example, you demonstrate that:

In [1]: 5*a
Out[1]: (#20) [1,2,5,7,1,2,5,7,1,2... ]

But wouldn’t elementwise multiplication be the expected result?:

In [1]: 5*a
Out[1]: (#4) [5,10,25,35]
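To illustrate what I mean (sketch only: `random_splitter` is my own illustrative name, not fastai’s API, and the second part just shows plain Python list semantics, which L appears to mirror):

```python
import random

# 1) The function-returning-function (closure) pattern, in the spirit
#    of RandomSplitter: configure once, then call on items later.
def random_splitter(valid_pct=0.2, seed=None):
    def _inner(items):
        if seed is not None: random.seed(seed)
        idxs = list(range(len(items)))
        random.shuffle(idxs)
        cut = int(valid_pct * len(items))
        return idxs[cut:], idxs[:cut]      # (train idxs, valid idxs)
    return _inner

splitter = random_splitter(valid_pct=0.25, seed=42)
train, valid = splitter(list(range(8)))
assert len(train) == 6 and len(valid) == 2

# 2) Plain Python lists repeat on multiplication rather than
#    multiplying elementwise:
a = [1, 2, 5, 7]
assert 5 * a == a * 5
assert len(5 * a) == 20                    # repeated, not scaled
```

A class with `__call__` would behave the same at the call site, so I guess the question is mostly about style and picklability.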