That’s exactly what that’s doing… that’s also what learn.export does… none of the original data is saved and used on the DataLoader…
If it’s not, what version of fastai are you using? (as this was an issue that got fixed)
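For reference, the workflow looks roughly like this (a minimal sketch; learn and new_df are placeholder names for your trained Learner and your incoming DataFrame):

learn.export('model.pkl')             # saves the model and transforms, but none of the data
learn_inf = load_learner('model.pkl')
dl = learn_inf.dls.test_dl(new_df)    # build a DataLoader from new data at inference time
preds, _ = learn_inf.get_preds(dl=dl)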
Ahh… this is awesome! I love that patch; it's freaking amazing how flexible fastai can be. This worked like a charm and I'll continue developing with it. Thank you @muellerzr!
I don't know if I should ask this here, but I'm getting the error below even though I'm not passing my y variable in the cat_names parameter.
from fastai.tabular.all import *

input_df = input_df.astype('category')
columns = input_df.columns
features = columns.drop('cohort_flag')

dls = TabularDataLoaders.from_df(input_df, y_names="cohort_flag",
                                 cat_names=list(features), procs=[Normalize], bs=32)
learn = tabular_learner(dls, metrics=accuracy)
And the full stack trace is below:
AttributeError Traceback (most recent call last)
<ipython-input-60-4b53f3a1cac0> in <module>()
6 cat_names = list(features), procs = [Normalize], bs=32)
7
----> 8 learn = tabular_learner(dls, metrics=accuracy)
/usr/local/lib/python3.7/dist-packages/fastai/tabular/learner.py in tabular_learner(dls, layers, emb_szs, config, n_out, y_range, **kwargs)
27 if layers is None: layers = [200,100]
28 to = dls.train_ds
---> 29 emb_szs = get_emb_sz(dls.train_ds, {} if emb_szs is None else emb_szs)
30 if n_out is None: n_out = get_c(dls)
31 assert n_out, "`n_out` is not defined, and could not be inferred from data, set `dls.c` or pass `n_out`"
/usr/local/lib/python3.7/dist-packages/fastai/tabular/model.py in get_emb_sz(to, sz_dict)
23 def get_emb_sz(to, sz_dict=None):
24 "Get default embedding size from `TabularPreprocessor` `proc` or the ones in `sz_dict`"
---> 25 return [_one_emb_sz(to.classes, n, sz_dict) for n in to.cat_names]
26
27 # Cell
/usr/local/lib/python3.7/dist-packages/fastai/tabular/model.py in <listcomp>(.0)
23 def get_emb_sz(to, sz_dict=None):
24 "Get default embedding size from `TabularPreprocessor` `proc` or the ones in `sz_dict`"
---> 25 return [_one_emb_sz(to.classes, n, sz_dict) for n in to.cat_names]
26
27 # Cell
/usr/local/lib/python3.7/dist-packages/fastcore/basics.py in __getattr__(self, k)
386 if self._component_attr_filter(k):
387 attr = getattr(self,self._default,None)
--> 388 if attr is not None: return getattr(attr,k)
389 raise AttributeError(k)
390 def __dir__(self): return custom_dir(self,self._dir())
/usr/local/lib/python3.7/dist-packages/fastcore/transform.py in __getattr__(self, k)
202 def __getitem__(self,i): return self.fs[i]
203 def __setstate__(self,data): self.__dict__.update(data)
--> 204 def __getattr__(self,k): return gather_attrs(self, k, 'fs')
205 def __dir__(self): return super().__dir__() + gather_attr_names(self, 'fs')
206
/usr/local/lib/python3.7/dist-packages/fastcore/transform.py in gather_attrs(o, k, nm)
163 att = getattr(o,nm)
164 res = [t for t in att.attrgot(k) if t is not None]
--> 165 if not res: raise AttributeError(k)
166 return res[0] if len(res)==1 else L(res)
167
AttributeError: classes
You need to pass Categorify in your procs; Normalize is only for continuous variables. You may need FillMissing as well, i.e.:
procs = [Categorify, FillMissing, Normalize]
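With that change, the original call would look something like this (a minimal sketch reusing the names from the post above):

from fastai.tabular.all import *

# Categorify encodes the categorical columns, which is what creates the
# `classes` attribute that get_emb_sz was failing to find
dls = TabularDataLoaders.from_df(input_df, y_names="cohort_flag",
                                 cat_names=list(features),
                                 procs=[Categorify, FillMissing, Normalize],
                                 bs=32)
learn = tabular_learner(dls, metrics=accuracy)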
@muellerzr Thanks for your help! It worked!
Hi! A year later, I've come across the same issue.
Did you ever find time for this project? Or are you aware of anyone else doing it? Thanks anyway!
I never wound up getting to it, sadly, but if you make a post on it we can work through your issues.
For the moment I’ve found this:
I’m still reading through the details, but at least it contains relevant stuff like SentencePiece tokenization, and the reported metrics are really good, so I’ll definitely give this a try.
Thanks for sharing that, @florianl !
I am trying to train a segmentation algorithm with fastai. I have training and validation data in separate folders, so I was planning on using GrandparentSplitter(), but for some reason the validation set is empty.
My files are organised as below:
path
├── train
│   ├── images
│   └── masks
└── valid
    ├── images
    └── masks
And this is how I set up my datablock and dataloader:
codes = np.array(['background', 'prostate'])

def label_func(x): return path/'train/masks'/f'{x.stem}_mask.png'

db = DataBlock(blocks=(ImageBlock(), MaskBlock(codes)),
               splitter=GrandparentSplitter(train_name='train', valid_name='valid'),
               get_items=get_image_files,
               get_y=label_func)
dls = db.dataloaders(path/'train/images', bs=1)
dls.show_batch()
I am assuming there is something wrong with how I organised the files.
I'm guessing you didn't pass the 'valid' data into the dataloaders. I don't know the full answer either, but from the code I can see that dls only has access to images from the train folder; a possible fix is sketched below.
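One way to fix it might look like this (just a sketch based on the folder layout above; the get_items helper below is mine, not a fastai function). The idea is to gather images from both splits so GrandparentSplitter has something to put in the validation set, and to derive each mask path from the image's own split instead of hard-coding 'train':

def get_items(path):
    # collect image files from both splits, skipping the mask folders
    return get_image_files(path/'train'/'images') + get_image_files(path/'valid'/'images')

def label_func(x):
    # x.parent.parent is path/train or path/valid, depending on the image's split
    return x.parent.parent/'masks'/f'{x.stem}_mask.png'

db = DataBlock(blocks=(ImageBlock(), MaskBlock(codes)),
               splitter=GrandparentSplitter(train_name='train', valid_name='valid'),
               get_items=get_items,
               get_y=label_func)
dls = db.dataloaders(path, bs=1)  # pass the root path so both splits are collected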
Greetings to all code warriors! I'd like to change the metric when training a binary image-classification model from accuracy to the false negative rate. Could you help me figure out how to count the false negatives after each epoch as a metric? I read through the documentation on Metrics/Callbacks, but it wasn't much help.
Cheers
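For anyone with the same question, one approach is fastai's AccumMetric, which accumulates predictions and targets over each validation epoch and then applies your function. A minimal sketch, assuming class index 1 is the positive class (adjust to your labels):

from fastai.vision.all import *

def false_negative_rate(preds, targs):
    "Fraction of actual positives (class 1) that were predicted as negative"
    positives = (targs == 1)
    if positives.sum() == 0: return tensor(0.)
    fn = ((preds != 1) & positives).sum().float()
    return fn / positives.sum().float()

# dim_argmax=-1 converts the raw model outputs into predicted class
# indices before they are accumulated each batch
fnr = AccumMetric(false_negative_rate, dim_argmax=-1)
# learn = cnn_learner(dls, resnet18, metrics=[accuracy, fnr])  # hypothetical usage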