Fastai v2 chat

That’s exactly what that’s doing… that’s also what learn.export does… none of the original data is saved or used in the DataLoader…

If it’s not, what version of fastai are you using? (as this was an issue that got fixed)

1 Like

Ahh… This is awesome, love that patch, freaking amazing how flexible fastai can be. This worked like a charm, thank you, will continue developing with it. Amazing!!! Thank you @muellerzr

1 Like

I don’t know if I should ask this here.

But I’m getting the below error even when I’m not passing my y variable in the `cat_names` parameter.

    from fastai.tabular.all import *
    input_df = input_df.astype('category')
    columns = input_df.columns
    features = columns.drop('cohort_flag')
    dls = TabularDataLoaders.from_df(input_df, y_names="cohort_flag",
        cat_names=list(features), procs=[Normalize], bs=32)

    learn = tabular_learner(dls, metrics=accuracy)

And the full stack trace is below:

   AttributeError                            Traceback (most recent call last)

<ipython-input-60-4b53f3a1cac0> in <module>()
      6     cat_names = list(features), procs = [Normalize], bs=32)
      7 
----> 8 learn = tabular_learner(dls, metrics=accuracy)

/usr/local/lib/python3.7/dist-packages/fastai/tabular/learner.py in tabular_learner(dls, layers, emb_szs, config, n_out, y_range, **kwargs)
     27     if layers is None: layers = [200,100]
     28     to = dls.train_ds
---> 29     emb_szs = get_emb_sz(dls.train_ds, {} if emb_szs is None else emb_szs)
     30     if n_out is None: n_out = get_c(dls)
     31     assert n_out, "`n_out` is not defined, and could not be inferred from data, set `dls.c` or pass `n_out`"

/usr/local/lib/python3.7/dist-packages/fastai/tabular/model.py in get_emb_sz(to, sz_dict)
     23 def get_emb_sz(to, sz_dict=None):
     24     "Get default embedding size from `TabularPreprocessor` `proc` or the ones in `sz_dict`"
---> 25     return [_one_emb_sz(to.classes, n, sz_dict) for n in to.cat_names]
     26 
     27 # Cell

/usr/local/lib/python3.7/dist-packages/fastai/tabular/model.py in <listcomp>(.0)
     23 def get_emb_sz(to, sz_dict=None):
     24     "Get default embedding size from `TabularPreprocessor` `proc` or the ones in `sz_dict`"
---> 25     return [_one_emb_sz(to.classes, n, sz_dict) for n in to.cat_names]
     26 
     27 # Cell

/usr/local/lib/python3.7/dist-packages/fastcore/basics.py in __getattr__(self, k)
    386         if self._component_attr_filter(k):
    387             attr = getattr(self,self._default,None)
--> 388             if attr is not None: return getattr(attr,k)
    389         raise AttributeError(k)
    390     def __dir__(self): return custom_dir(self,self._dir())

/usr/local/lib/python3.7/dist-packages/fastcore/transform.py in __getattr__(self, k)
    202     def __getitem__(self,i): return self.fs[i]
    203     def __setstate__(self,data): self.__dict__.update(data)
--> 204     def __getattr__(self,k): return gather_attrs(self, k, 'fs')
    205     def __dir__(self): return super().__dir__() + gather_attr_names(self, 'fs')
    206 

/usr/local/lib/python3.7/dist-packages/fastcore/transform.py in gather_attrs(o, k, nm)
    163     att = getattr(o,nm)
    164     res = [t for t in att.attrgot(k) if t is not None]
--> 165     if not res: raise AttributeError(k)
    166     return res[0] if len(res)==1 else L(res)
    167 

AttributeError: classes

You need to pass Categorify in your procs; Normalize is only for continuous variables. You may also need FillMissing as well. E.g.:

procs = [Categorify, FillMissing, Normalize]

1 Like

@muellerzr Thanks for your help! It worked!

Hi! A year later I’ve come across the same issue :sweat_smile:
Did you ever find time for this project? Or are you aware of anyone else doing it? Thanks anyway!

I never wound up getting to it sadly, but if you make a post on it we can work through your issues :smiley:

1 Like

For the moment I’ve found this:

I’m still reading through the details, but at least it contains relevant stuff like SentencePiece tokenization, and the reported metrics are really good, so I’ll definitely give this a try.

Thanks for sharing that, @florianl !

1 Like

I am trying to train a segmentation algorithm with fastai. I have training and validation data in separate folders, so I was planning on using GrandparentSplitter(), but for some reason the validation set is empty.

My files are organised as below:

Path ---> train ---> images
                ---> masks
     ---> valid ---> images
                ---> masks

And this is how I set up my datablock and dataloader:

codes = np.array(['background', 'prostate'])

def label_func(x): return path/'train/masks'/f'{x.stem}_mask.png'

db = DataBlock(blocks=(ImageBlock(), MaskBlock(codes)),
               splitter=GrandparentSplitter(train_name='train', valid_name='valid'),
               get_items=get_image_files,
               get_y=label_func)

dls = db.dataloaders(path/'train/images', bs=1)
dls.show_batch()

I am assuming there is something wrong with how I organised the files.

I guess you didn’t feed the ‘valid’ data into the dataloaders… I don’t know the answer either, but from the code I can see that dls only has access to images from the train folder.

Greetings to all code warriors! I’d like to change the metric when training a binary image classification model from accuracy to false negative rate. I wonder if you could help me take into account the number of false negatives after each epoch as a metric. I read through the documentation on Metrics/Callbacks, but it wasn’t much help.
Cheers :slight_smile: