Yes. It will also use max_len so that nothing beyond that point is backpropagated through, like in v1. The main thing that is new compared to v1 is that it completely ignores the padding at the beginning and starts with a clean hidden state for each sequence in the batch.
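Roughly, the idea is something like the sketch below (this is not fastai's actual SentenceEncoder code; the helper name encode, the LSTM sizes and the loop are purely illustrative): the batch is fed to the RNN in fixed-size chunks starting from a fresh hidden state, and only (roughly) the last max_len tokens are run with gradients enabled.
import torch, torch.nn as nn

bptt, max_len = 72, 72 * 20                       # illustrative sizes only
rnn = nn.LSTM(input_size=400, hidden_size=400, batch_first=True)

def encode(emb, rnn, bptt=bptt, max_len=max_len):
    "emb: (bs, sl, emb_dim) already-embedded, pre-padded batch"
    bs, sl, _ = emb.shape
    hidden, outs = None, []                       # hidden=None -> clean state for each batch
    for i in range(0, sl, bptt):
        chunk = emb[:, i:i+bptt]
        if sl - i > max_len:                      # early chunks: context only, no gradients
            with torch.no_grad():
                _, hidden = rnn(chunk, hidden)
        else:                                     # tail chunks: keep outputs and gradients
            out, hidden = rnn(chunk, hidden)
            outs.append(out)
    return torch.cat(outs, dim=1)                 # only the tail is used for pooling/backprop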
Thanks!
Glad to be wrong
Hi!
Did you find a solution? I agree that regression seems to be omitted. In theory it could be handled by applying a simple ToTensor() transformation, but that did not work for me, so I am curious if there is an easy fix.
There is something different. Maybe it is the new way of dealing with sequences. I tested 3 models I had running with fastai v1 text, and all of them, using the same parameters, score 1.5 to 2 percentage points lower in accuracy with fastai2 version 0.0.6 (even on IMDB). Inspecting the dataloader, I notice some texts end up with padding not only at the beginning of the sequence but also with some pad tokens at the end of the sequence. Is that expected?
No, I haven't figured it out. I went through the TransformBlock API and tried an actual TransformBlock too.
@sgugger Please confirm whether multi-label regression is available with the current fastai2 API (low- or high-level).
TransformBlock should work, as it leaves the targets as floats.
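For reference, here is a minimal sketch of what that could look like (unverified against this exact fastai2 version; labels is assumed to be a list of numeric columns in df_tok): the key point is that get_y should hand back a single float tensor rather than a tuple of values.
from fastai2.text.all import *

labels = ['score_a', 'score_b']   # hypothetical float target columns

dbch = DataBlock(blocks=(TextBlock.from_df(vocab=lm_vocab, text_cols='text'), TransformBlock),
                 get_x=ColReader('text'),
                 get_y=lambda row: tensor(row[labels].values.astype('float32')),  # one tensor, not a tuple
                 splitter=RandomSplitter(0.2, seed=42),
                 dl_type=SortedDL).dataloaders(df_tok, bs=128)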
@sgugger Inspecting the dataloader (iter), I notice that pad_input_chunk adds some pad tokens not only at the beginning of the sequence but also at the end of the sequence. Is that expected?
Yes. A sequence needs to begin at a round multiple of seq_len (otherwise the RNN is going to see some pad tokens that make no sense to it), so there is a little bit of padding at the end (which is then ignored in the masked concat pool).
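To make the arithmetic concrete, here is a small hedged sketch of that padding layout (chunked_pad is just an illustration, not the pad_input_chunk source): the front padding is rounded down to whole chunks of seq_len and the leftover padding goes at the end, so the real tokens start on a multiple of seq_len.
def chunked_pad(tokens, batch_max_len, seq_len=72, pad_id=1):
    pad_total = batch_max_len - len(tokens)
    pad_front = (pad_total // seq_len) * seq_len   # whole chunks of padding up front
    pad_back  = pad_total % seq_len                # remainder goes at the end
    return [pad_id] * pad_front + tokens + [pad_id] * pad_back

# e.g. a 100-token text in a batch whose longest text has 300 tokens:
padded = chunked_pad(list(range(2, 102)), 300, seq_len=72)
assert len(padded) == 300
assert padded.index(2) == 144 and 144 % 72 == 0    # real tokens start at a multiple of seq_len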
Thanks!
But how do I initialize it properly? When I do the following, targ is a tuple:
dbch = DataBlock(blocks=(TextBlock.from_df(vocab=lm_vocab, text_cols="text"), TransformBlock),
                 get_x=ColReader('text'),
                 get_y=ColReader(labels),
                 splitter=RandomSplitter(0.2, seed=42),
                 dl_type=SortedDL).databunch(df_tok, home, bs=128)
241 def __call__(self, inp, targ, **kwargs):
242 inp = inp .transpose(self.axis,-1).contiguous()
--> 243 targ = targ.transpose(self.axis,-1).contiguous()
244 if self.floatify and targ.dtype!=torch.float16: targ = targ.float()
245 if targ.dtype in [torch.int8, torch.int16, torch.int32]: targ = targ.long()
AttributeError: 'tuple' object has no attribute 'transpose'
Thanks for the information!
I loaded up a databunch:
bs = 64
imdb_lm = DataBlock(blocks=(TextBlock.from_df('text', is_lm=True),),
                    get_x=attrgetter('text'),
                    splitter=RandomSplitter())
dbunch = imdb_lm.databunch(df, bs=bs, seq_len=72)
Which consumed about 45GB of RAM.
Showing the batch worked as expected:
dbunch.show_batch(max_n=6)
Then I tried torch.save:
torch.save(dbunch, 'dbunch.pkl')
RAM usage jumped up another 30GB, then it failed with this:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-12-02e721dd8bbd> in <module>
----> 1 torch.save(dbunch, 'dbunch.pkl')
~/environments/fastai2/lib/python3.6/site-packages/torch/serialization.py in save(obj, f, pickle_module, pickle_protocol)
258 >>> torch.save(x, buffer)
259 """
--> 260 return _with_file_like(f, "wb", lambda f: _save(obj, f, pickle_module, pickle_protocol))
261
262
~/environments/fastai2/lib/python3.6/site-packages/torch/serialization.py in _with_file_like(f, mode, body)
183 f = open(f, mode)
184 try:
--> 185 return body(f)
186 finally:
187 if new_fd:
~/environments/fastai2/lib/python3.6/site-packages/torch/serialization.py in <lambda>(f)
258 >>> torch.save(x, buffer)
259 """
--> 260 return _with_file_like(f, "wb", lambda f: _save(obj, f, pickle_module, pickle_protocol))
261
262
~/environments/fastai2/lib/python3.6/site-packages/torch/serialization.py in _save(obj, f, pickle_module, pickle_protocol)
330 pickler = pickle_module.Pickler(f, protocol=pickle_protocol)
331 pickler.persistent_id = persistent_id
--> 332 pickler.dump(obj)
333
334 serialized_storage_keys = sorted(serialized_storages.keys())
AttributeError: Can't pickle local object 'ReindexCollection.__init__.<locals>._get'
This is with a 2.5GB csv file. I get the same error when I truncate that file to only 64 rows, but I wanted to flag the RAM usage as well. Not sure if it's expected to use 75GB of RAM to process a 2.5GB csv.
Thanks!
Update: It doesn't actually take the full hour to load the databunch. After it runs for a few minutes, the estimate goes down significantly, then it finishes well ahead of schedule. I didn't time it, but it probably took 5-10 minutes or so.
Will look at that. The aim is to have those objects pickle so this is a bug.
Thanks! Regarding memory usage, in case it helps, here's a comparison with the same data on fastai and fastai2:
Creating Databunch:
Fastai 1.0.59: 30GB
Fastai2 0.0.6: 45GB
Additional memory used to save Databunch:
Fastai 1.0.59: 21GB
Fastai2 0.0.6: 30GB
Yes, got the same error.
@sgugger Another note: the classification learner hard-codes cross entropy loss. So far I was overcoming it by modifying the method. Would it make sense to adjust it? I do not think it is worth a pull request to the repo…
@Cl_78_v you should use TextLearner instead I believe.
Yes, it has the loss function in its signature, thanks. The initialization did not work for me; I may look into this later (I am getting a side error).
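For what it's worth, since the loss goes through the learner's signature, a multi-label regression setup would presumably look something like the sketch below (MSELossFlat and rmse swapped in for the classification defaults; dbch, home and the encoder name are just the ones from earlier in this thread):
learn = text_classifier_learner(dbch, AWD_LSTM, path=home,
                                loss_func=MSELossFlat(),   # regression loss instead of cross entropy
                                metrics=[rmse]).to_fp16()
learn.load_encoder("ft_enc_v2")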
Ok, the bug for pickling has been fixed. LMDataLoader should pickle now (use fastcore master for the fix).
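Assuming fastcore master and a recent fastai2, the earlier save should now go through; a quick round-trip check would be something like:
torch.save(dbunch, 'dbunch.pkl')    # should no longer hit the ReindexCollection pickling error
dbunch2 = torch.load('dbunch.pkl')
dbunch2.show_batch(max_n=2)         # sanity check after reloading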
I am still trying to understand how to set up the dataloader for multi-label regression, and I am stuck. Here's what I do, but it seems the API doesn't handle each target properly. Do I need to add a block which handles all float targets?
dbch = DataBlock(blocks=(TextBlock.from_df(vocab=lm_vocab, text_cols="text"), TransformBlock),
                 get_x=ColReader('text'),
                 get_y=ColReader(labels),
                 splitter=RandomSplitter(0.2, seed=42),
                 dl_type=SortedDL).dataloaders(df_tok, home, bs=128)
learn = text_classifier_learner(dbch, AWD_LSTM, metrics=[accuracy], path=home,
                                loss_func=CrossEntropyLossFlat(),
                                cbs=[EarlyStoppingCallback(monitor='accuracy', min_delta=0.01, comp=np.greater, patience=5),
                                     SaveModelCallback(monitor="accuracy", comp=np.greater, fname="best_model")]
                                ).to_fp16()
learn.load_encoder("ft_enc_v2");