I am getting the following error while trying to train the model. I am running the code on the GPU. I am also facing another problem during fine-tuning: it is taking a long time to fine-tune the pre-trained model. Any suggestions to make the process faster? P.S. I loaded CUDA before running the process and checked the GPU with nvidia-smi.
: /localtmp/fs5ve/fastai/courses/dl2/imdb_scripts ; python train_clas.py data/wiki/en 0 --lm-id pretrain_wt103 --clas-id pretrain_wt103 --cl 5
dir_path data/wiki/en; cuda_id 0; lm_id pretrain_wt103; clas_id pretrain_wt103; bs 64; cl 5; backwards False; dropmult 1.0 unfreeze True startat 0; bpe False; use_clr True;use_regular_schedule False; use_discriminative True; last False;chain_thaw False; from_scratch False; train_file_id
Traceback (most recent call last):
File "train_clas.py", line 148, in <module>
if __name__ == '__main__': fire.Fire(train_clas)
File "/zf18/fs5ve/.conda/envs/fastai/lib/python3.6/site-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/zf18/fs5ve/.conda/envs/fastai/lib/python3.6/site-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/zf18/fs5ve/.conda/envs/fastai/lib/python3.6/site-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "train_clas.py", line 51, in train_clas
assert trn_lbls.shape[1] == 1 and val_lbls.shape[1] == 1, 'This classifier uses cross entropy loss and only support single label samples'
IndexError: tuple index out of range
Could you include some code and what the data looks like?
I believe the problem is with your data. What are you using to create your data loader? Is it ModelData(...), ImageClassifierData(...), or something else?
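For what it's worth, the IndexError itself is a hint: trn_lbls.shape[1] only exists if the labels array is 2-D, so your labels were probably saved as a flat 1-D array. A quick way to check (a sketch; the .npy path follows the imdb_scripts convention and may differ in your setup):
import numpy as np

# assumed path following the imdb_scripts layout; adjust to your setup
trn_lbls = np.load('data/wiki/en/tmp/lbl_trn.npy')
print(trn_lbls.shape)            # a 1-D shape like (25000,) makes trn_lbls.shape[1] raise IndexError
if trn_lbls.ndim == 1:
    trn_lbls = trn_lbls[:, None] # add a second axis -> shape (25000, 1)
print(trn_lbls.shape)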
To find the error, check your code with the Python debugger pdb, which lets you step through the code:
• pdb.set_trace() to set a breakpoint
• the %debug magic to trace an error
Commands you need to know:
• s - step: execute and step into function
• n - next: execute current line
• c - continue: continue execution until next breakpoint
• u - up: move one level up in the stack trace
• d - down: move one level down in the stack trace
• p - print: print variable, example: 'p x' prints variable x
• l - list: lists 11 lines of code around the current line
Jeremy shows how to use it in lesson 8. Learning pdb will pay off big time in the long run.
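For example, a minimal way to use it here (assuming you edit train_clas.py directly):
import pdb

# place this just above the failing assert in train_clas.py;
# when the script reaches it you get an interactive (Pdb) prompt
pdb.set_trace()
At the prompt, try for example p trn_lbls.shape and p val_lbls.shape to see why indexing shape[1] fails.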
If you run into problems with code that has been moved to the GPU, run it on the CPU instead, or use the snippet below to get more meaningful information when debugging CUDA errors (from Lernapparat - Machine Learning). CUDA kernel launches are asynchronous by default, so an error can be reported far from the line that actually caused it; this setting makes launches synchronous, so the traceback points at the real failing line:
import os
os.environ['CUDA_LAUNCH_BLOCKING'] = '1'  # set this before the first CUDA call
Is there any way the code uploaded on GitHub is not using the GPU?
What does the following code do? As far as I know, it helps by providing more details during debugging.
import os
os.environ['CUDA_LAUNCH_BLOCKING'] = '1'
Is there any way to make my fine-tuning process faster?
Better to use the pdb.set_trace() command, which is much more flexible.
Then you can easily print all the variables with 'p var', including their shape with 'p var.shape', etc.
Trust me, a few print statements are usually not enough, and in the end you have more lines of print than of actual code.
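A hypothetical session for this error might look like this (the values shown are made up):
(Pdb) p trn_lbls.shape
(25000,)
(Pdb) p trn_lbls[:5]
array([0, 1, 1, 0, 1])
A one-element shape tuple like that immediately explains the 'tuple index out of range' on shape[1].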
I think from the shape we should be able to tell what is wrong… and for @faysalhossain2007 it is easier to edit the file than to learn to navigate there with the debugger.
But it is indeed better to use the debugger most of the time, I agree.
You should have opened a new thread… that way people would have answered faster. The problem is that your model is not on the GPU. When you create the model, you should call .cuda() on it.
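A minimal sketch of checking and fixing this (nn.Linear here is just a stand-in for the RNN classifier the script actually builds):
import torch
import torch.nn as nn

model = nn.Linear(10, 2)                  # stand-in for the real classifier
print(torch.cuda.is_available())          # True means PyTorch can see the GPU
print(next(model.parameters()).is_cuda)   # False -> the model is still on the CPU
model.cuda()                              # move all parameters to the GPU
print(next(model.parameters()).is_cuda)   # now True
With the model actually on the GPU, fine-tuning should also run much faster, which addresses the speed question as well.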