I think you need to use np.less here, since you want to divide the learning rate when there is no improvement (this is also a bug that I will look into; your code should work).
For all of fastai2, fastcore, and fastprogress? Or just fastai2?
np.less still reduces the lr like np.greater did.
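For reference, here is a minimal, hypothetical sketch of the pattern being discussed (PlateauTracker and its names are illustrative, not fastai2's actual TrackerCallback): the comparison op decides what counts as an improvement, and the lr is divided only when the new value is not an improvement.

import numpy as np

# Hypothetical sketch: `comp` decides what counts as an improvement --
# np.less when lower is better (e.g. loss), np.greater when higher is
# better (e.g. accuracy).
class PlateauTracker:
    def __init__(self, comp=np.less):
        self.comp, self.best = comp, None
    def step(self, val, lr, factor=10.0):
        if self.best is None or self.comp(val, self.best):
            self.best = val  # improvement: keep the current lr
            return lr
        return lr / factor   # no improvement: divide the lr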
I am trying to run the code from the intro of the book, just to fiddle around. The following cell:
from fastai2.text.all import *
dls = TextDataLoaders.from_folder(untar_data(URLs.IMDB), valid='test')
learn = text_classifier_learner(dls, AWD_LSTM, drop_mult=0.5, metrics=accuracy)
learn.fine_tune(4, 1e-2)
is giving me this error, which is quite weird. I tried pulling and reinstalling fastai2, but to no avail:
Could not do one pass in your dataloader, there is something wrong in it
The same thing happens later with
dls = TabularDataLoaders.from_csv(path/'adult.csv', path, y_names="salary",
cat_names = ['workclass', 'education', 'marital-status', 'occupation',
'relationship', 'race'],
cont_names = ['age', 'fnlwgt', 'education-num'],
procs = [Categorify, FillMissing, Normalize])
We need more info to debug what's going on. Try calling one_batch() or show_batch().
one_batch throws an IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number
The stack trace is quite big; not sure if you want me to post it here.
We can't do much without seeing it either. Wrap it in a code block like so:
"```python"
"```"
(remove the quotation marks) and it won't take up the whole screen.
I have a very simple question that I still haven't managed to figure out. I want to create a dataset when both images and labels are already in memory. For example:
imgs = ['img1', 'img2', 'img3'] # These are the actual arrays containing the imgs
lbls = ['dog', 'dog', 'cat']
I just want to created a Datasets
object from this.
I tried doing something like:
items = list(zip(imgs, lbls))
dset = Datasets(items, tfms=[[lambda x: x[0]], [lambda x: x[1]]])
but this just ends up taking the first element from imgs and lbls, and both are considered inputs.
I also tried changing n_inp, without success.
Generally what I've been doing is saving the images and labels to disk and using Datasets the more "traditional" way, but I feel like it's time to learn how to properly do this lol
EDIT: Possible solution:
dset = Datasets(range(len(imgs)), tfms=[[imgs.__getitem__], [lbls.__getitem__]])
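Expanding that one-liner into a minimal sketch (assuming the in-memory imgs and lbls lists from above; n_inp=1 marks the first pipeline as the input):

from fastai2.data.core import Datasets

imgs = ['img1', 'img2', 'img3']  # stand-ins for the actual image arrays
lbls = ['dog', 'dog', 'cat']
# One transform pipeline per tuple element: the first indexes into imgs
# (the input), the second into lbls (the target).
dset = Datasets(range(len(imgs)), tfms=[[imgs.__getitem__], [lbls.__getitem__]], n_inp=1)
x, y = dset[0]  # ('img1', 'dog')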
---------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-2-90634fcc3c9e> in <module>
----> 1 dls.show_batch()
~/Repos/fastai2/fastai2/data/core.py in show_batch(self, b, max_n, ctxs, show, **kwargs)
88
89 def show_batch(self, b=None, max_n=9, ctxs=None, show=True, **kwargs):
---> 90 if b is None: b = self.one_batch()
91 if not show: return self._pre_show_batch(b, max_n=max_n)
92 show_batch(*self._pre_show_batch(b, max_n=max_n), ctxs=ctxs, max_n=max_n, **kwargs)
~/Repos/fastai2/fastai2/data/load.py in one_batch(self)
128 def one_batch(self):
129 if self.n is not None and len(self)==0: raise ValueError(f'This DataLoader does not contain any batches')
--> 130 with self.fake_l.no_multiproc(): res = first(self)
131 if hasattr(self, 'it'): delattr(self, 'it')
132 return res
~/anaconda3/envs/fastai2/lib/python3.7/site-packages/fastcore/utils.py in first(x)
172 def first(x):
173 "First element of `x`, or None if missing"
--> 174 try: return next(iter(x))
175 except StopIteration: return None
176
~/Repos/fastai2/fastai2/data/load.py in __iter__(self)
95 self.randomize()
96 self.before_iter()
---> 97 for b in _loaders[self.fake_l.num_workers==0](self.fake_l):
98 if self.device is not None: b = to_device(b, self.device)
99 yield self.after_batch(b)
~/anaconda3/envs/fastai2/lib/python3.7/site-packages/torch/utils/data/dataloader.py in __next__(self)
343
344 def __next__(self):
--> 345 data = self._next_data()
346 self._num_yielded += 1
347 if self._dataset_kind == _DatasetKind.Iterable and \
~/anaconda3/envs/fastai2/lib/python3.7/site-packages/torch/utils/data/dataloader.py in _next_data(self)
383 def _next_data(self):
384 index = self._next_index() # may raise StopIteration
--> 385 data = self._dataset_fetcher.fetch(index) # may raise StopIteration
386 if self._pin_memory:
387 data = _utils.pin_memory.pin_memory(data)
~/anaconda3/envs/fastai2/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py in fetch(self, possibly_batched_index)
32 raise StopIteration
33 else:
---> 34 data = next(self.dataset_iter)
35 return self.collate_fn(data)
36
~/Repos/fastai2/fastai2/data/load.py in create_batches(self, samps)
104 self.it = iter(self.dataset) if self.dataset is not None else None
105 res = filter(lambda o:o is not None, map(self.do_item, samps))
--> 106 yield from map(self.do_batch, self.chunkify(res))
107
108 def new(self, dataset=None, cls=None, **kwargs):
~/Repos/fastai2/fastai2/data/load.py in do_batch(self, b)
125 def create_item(self, s): return next(self.it) if s is None else self.dataset[s]
126 def create_batch(self, b): return (fa_collate,fa_convert)[self.prebatched](b)
--> 127 def do_batch(self, b): return self.retain(self.create_batch(self.before_batch(b)), b)
128 def one_batch(self):
129 if self.n is not None and len(self)==0: raise ValueError(f'This DataLoader does not contain any batches')
~/anaconda3/envs/fastai2/lib/python3.7/site-packages/fastcore/transform.py in __call__(self, o)
177 self.fs.append(t)
178
--> 179 def __call__(self, o): return compose_tfms(o, tfms=self.fs, split_idx=self.split_idx)
180 def __repr__(self): return f"Pipeline: {self.fs}"
181 def __getitem__(self,i): return self.fs[i]
~/anaconda3/envs/fastai2/lib/python3.7/site-packages/fastcore/transform.py in compose_tfms(x, tfms, is_enc, reverse, **kwargs)
125 for f in tfms:
126 if not is_enc: f = f.decode
--> 127 x = f(x, **kwargs)
128 return x
129
~/anaconda3/envs/fastai2/lib/python3.7/site-packages/fastcore/transform.py in __call__(self, x, **kwargs)
60 @property
61 def use_as_item(self): return ifnone(self.as_item_force, self.as_item)
---> 62 def __call__(self, x, **kwargs): return self._call('encodes', x, **kwargs)
63 def decode (self, x, **kwargs): return self._call('decodes', x, **kwargs)
64 def __repr__(self): return f'{self.__class__.__name__}: {self.use_as_item} {self.encodes} {self.decodes}'
~/anaconda3/envs/fastai2/lib/python3.7/site-packages/fastcore/transform.py in _call(self, fn, x, split_idx, **kwargs)
72 f = getattr(self, fn)
73 if self.use_as_item or not is_listy(x): return self._do_call(f, x, **kwargs)
---> 74 res = tuple(self._do_call(f, x_, **kwargs) for x_ in x)
75 return retain_type(res, x)
76
~/anaconda3/envs/fastai2/lib/python3.7/site-packages/fastcore/transform.py in <genexpr>(.0)
72 f = getattr(self, fn)
73 if self.use_as_item or not is_listy(x): return self._do_call(f, x, **kwargs)
---> 74 res = tuple(self._do_call(f, x_, **kwargs) for x_ in x)
75 return retain_type(res, x)
76
~/anaconda3/envs/fastai2/lib/python3.7/site-packages/fastcore/transform.py in _do_call(self, f, x, **kwargs)
76
77 def _do_call(self, f, x, **kwargs):
---> 78 return x if f is None else retain_type(f(x, **kwargs), x, f.returns_none(x))
79
80 add_docs(Transform, decode="Delegate to `decodes` to undo transform", setup="Delegate to `setups` to set up transform")
~/anaconda3/envs/fastai2/lib/python3.7/site-packages/fastcore/dispatch.py in __call__(self, *args, **kwargs)
96 if not f: return args[0]
97 if self.inst is not None: f = MethodType(f, self.inst)
---> 98 return f(*args, **kwargs)
99
100 def __get__(self, inst, owner):
~/Repos/fastai2/fastai2/text/data.py in pad_input_chunk(samples, pad_idx, pad_first, seq_len)
132 # Cell
133 def pad_input_chunk(samples, pad_idx=1, pad_first=True, seq_len=72):
--> 134 max_len = max([len(s[0]) for s in samples])
135 def _f(x):
136 l = max_len - x.shape[0]
~/Repos/fastai2/fastai2/text/data.py in <listcomp>(.0)
132 # Cell
133 def pad_input_chunk(samples, pad_idx=1, pad_first=True, seq_len=72):
--> 134 max_len = max([len(s[0]) for s in samples])
135 def _f(x):
136 l = max_len - x.shape[0]
IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number
Not convinced it's particularly useful. Seems like the problem is that the data is not there.
I have a hunch. The symptom is similar to what was happening here
Judging from the fix that was merged, it looks like something bad is happening in _one_pass and the exception is being thrown before
res._n_inp,res._types = self._n_inp,self._types
is executed (cf. https://github.com/fastai/fastai2/blob/bde69dd1bac5ba23403174dcce93a34a64a2a8f5/fastai2/data/core.py#L59-L63)
Hard to tell what the problem is, since the exception is being swallowed, but it might be that moving
res._n_inp,res._types = self._n_inp,self._types
into a finally: block might work.
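Roughly like this (an untested sketch of the method from the linked core.py; the getattr defaults are my addition, to avoid an AttributeError if _one_pass failed before setting anything):

def new(self, dataset=None, cls=None, **kwargs):
    res = super().new(dataset, cls, do_setup=False, **kwargs)
    if not hasattr(self, '_n_inp') or not hasattr(self, '_types'):
        try: self._one_pass()
        except: print("Could not do one pass in your dataloader, there is something wrong in it")
        finally:
            # copy whatever _one_pass managed to set, even if it then failed
            res._n_inp,res._types = getattr(self,'_n_inp',None),getattr(self,'_types',None)
    else: res._n_inp,res._types = self._n_inp,self._types
    return res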
By the way, a useful bit of knowledge for a test_dl. If it's labeled (for anything other than, say, tabular) and we have our ground-truth labels set up (similar to @lgvaz's problem above), we can do the following:
imgs = [im1, im2, im3] # These are PILImages
lbls = [0, 0, 1]
dl = dls.test_dl(zip(imgs, lbls), with_labels=True, rm_type_tfms=1)
We can simply pass them in, but they should be one-hot encoded first (or however your y's are expected).
This can be as simple as:
o2i = dls.vocab.o2i
lbls = [o2i[lbl] for lbl in lbls]
Edit: or, as @lgvaz found, it's literally as simple as:
dset = Datasets(range(len(imgs)), tfms=[[imgs.__getitem__], [lbls.__getitem__]])
(to which you can then just do: dl = dls.valid.new(dset))
Given that the code works for all our reviewers, I think this is more of a setup problem. Make sure you have the latest version of fastai2 and fastcore, and try again. If it still fails, remove the imdb directory and the imdb_tok directory and try again.
Removing and rebuilding the conda environment did the trick. I must have messed up the dependencies somehow when I pulled/updated the latest fastai2.
Thanks
Installing fastai2 for the first time today, I also got this error.
I require a particular label NA to always be the last one. I'm doing it as follows:
dls = ImageDataLoaders.from_df(data_df, path='/',
seed=1234,
item_tfms=item_tfms,
batch_tfms=batch_tfms)
dls.dataset.vocab # = (#5) ['NA', 'aerial','eyelevel','high','low']
new_vocab = L(*dls.dataset.vocab[1:], dls.dataset.vocab[0])
# = (#5) ['aerial','eyelevel','high','low','NA']
dls.dataset.vocab = new_vocab
Is it safe to change the order of the dataset.vocab labels after creating it?
I noticed dls.categorize() could be an option, but that doesn't change the order. Curiously, setting add_na=True for dls.categorize throws the following error:
TypeError: encodes() got an unexpected keyword argument 'add_na'
Thanks!
Hi all,
I am working with a multi-label classifier and have defined the learner as follows:
opt_func = partial(Adam, wd=0.1)
learner = text_classifier_learner(data,
AWD_LSTM,
model_dir=model_dir,
loss_func=BCEWithLogitsLossFlat(),
metrics=[F1ScoreMulti()],
path=path,
opt_func=opt_func,
cbs=metrics_callback).to_fp16()
Adding to_fp16 to train the model in mixed precision.
Then, I train the learner for a few cycles and save the weights.
learner.fit_one_cycle(cycles, lr, moms)
learner.save(save_name)
When doing fit_one_cycle, I want to save the weights of each cycle too. Therefore, I save the weights at the end of each epoch by defining the following method in my Callback:
class MyCallback(Callback):
    def after_epoch(self):
        # save a checkpoint at the end of every epoch; `cycle_save_name` is defined elsewhere
        self.learn.save(cycle_save_name)
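(As an aside, a variant that avoids overwriting the same file each epoch: the Learner exposes the current epoch counter as self.epoch, which a Callback can read via attribute delegation. SaveEveryEpoch is an illustrative name, not a fastai2 class.)

class SaveEveryEpoch(Callback):
    def after_epoch(self):
        # suffix with the epoch index so each checkpoint keeps its own file
        self.learn.save(f'{cycle_save_name}_{self.epoch}')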
Here is my question. When I save the weights of each cycle, those files weigh 40MB less than the final weights of the learner. I have checked both files (cycle weights and final learner weights), and I have noticed that the per-cycle weights are saved in torch.float16, while the final weights are saved in torch.float32.
- Why does this happen if I am initializing the Learner with MixedPrecision?
- Will I have problems loading either set of weights?
Thanks!
@rsomani95 It's probably safer to use the data block API and specify your vocab by passing it to the CategoryBlock.
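For instance, something along these lines (a sketch only; the blocks mirror your setup, but the column names 'image' and 'label' are assumptions for illustration):

from fastai2.vision.all import *

# Fix the class order up front via CategoryBlock(vocab=...) instead of
# mutating dls.dataset.vocab after the fact.
dblock = DataBlock(
    blocks=(ImageBlock, CategoryBlock(vocab=['aerial','eyelevel','high','low','NA'])),
    get_x=ColReader('image', pref='/'),
    get_y=ColReader('label'),
    item_tfms=item_tfms, batch_tfms=batch_tfms)
dls = dblock.dataloaders(data_df)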
@Saioa In mixed precision, the learner is set to FP16 at the beginning of training and back to FP32 at the end of training (so that you can save your model in full precision). You should load the learner you want while in the same environment, then save it again in full precision, because you can't do inference on the CPU in FP16.
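In other words, something like this (a sketch of the advice above; the '_fp32' suffix is just an illustrative name, and to_fp32 drops the mixed-precision setup so the model is back in float32 before saving):

# Load the fp16 checkpoint back into the same learner setup, drop mixed
# precision, and re-save in full precision.
learner.load(cycle_save_name)
learner.to_fp32()
learner.save(f'{cycle_save_name}_fp32')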
Ahh, many months of waiting.
Is there a way to attend your in-person course online? I am not in the Bay Area.
Invites were sent out about a month ago based on certain criteria. If you were not part of it, I highly recommend Jeremy's code walkthroughs or my study group until the new course comes out to the public around July.
Using the show_training_loop function, I get the following error:
File "/home/ubuntu/fastai-nlp/fastai_nlp/multi_label_classifier.py", line 114, in _get_multi_label_learner
print(learner.show_training_loop())
File "/home/ubuntu/clones/fastai2/fastai2/learner.py", line 233, in show_training_loop
if dl is None: dl = self.dls[ds_idx].new(shuffle=shuffle)
NameError: name '_loop' is not defined
I have investigated a bit and seen that in previous commits there was a definition of _loop in fastai2/fastai2/learner.py that is not in the most recent code:
# Cell
_loop = ['Start Fit', 'begin_fit', 'Start Epoch Loop', 'begin_epoch', 'Start Train', 'begin_train',
'Start Batch Loop', 'begin_batch', 'after_pred', 'after_loss', 'after_backward',
'after_step', 'after_cancel_batch', 'after_batch','End Batch Loop','End Train',
'after_cancel_train', 'after_train', 'Start Valid', 'begin_validate','Start Batch Loop',
'**CBs same as train batch**', 'End Batch Loop', 'End Valid', 'after_cancel_validate',
'after_validate', 'End Epoch Loop', 'after_cancel_epoch', 'after_epoch', 'End Fit',
'after_cancel_fit', 'after_fit']
If I add this piece of code to fastai2/fastai2/learner.py, the error is fixed. Is this the right way to get show_training_loop to work?
Thanks!