text_classifier_learner gives error: Could not infer dtype of NoneType

alibaltschun · November 26, 2018, 6:37am

Hello, I have a problem with fastai.text.
Creating the model for the LM works fine, but with the classifier I get an error like this:

~/.local/lib/python3.6/site-packages/fastai/torch_core.py in tensor(x, *rest)
     68     # XXX: Pytorch bug in dataloader using num_workers>0; TODO: create repro and report
     69     if is_listy(x) and len(x)==0: return tensor(0)
---> 70     return torch.tensor(x) if is_listy(x) else as_tensor(x)
     71 
     72 def np_address(x:np.ndarray)->int:

RuntimeError: Could not infer dtype of NoneType

I tried setting num_workers=0 in TextClasDataBunch, but that did not solve the problem; I get the same error.

This is my code.
fastai version: 1.0.28

data_lm = TextLMDataBunch.from_csv(PATH_DATASET, 'datatrain-en.csv')
data_clas = TextClasDataBunch.from_csv(PATH_DATASET, 'datatrain-en.csv',  vocab=data_lm.train_ds.vocab, bs=32)

MODEL = "https://s3.amazonaws.com/fast-ai-modelzoo/wt103"
learn = language_model_learner(data_lm, pretrained_model=MODEL, drop_mult=0.5)
learn.fit_one_cycle(1, 1e-2)
learn.unfreeze()
learn.fit_one_cycle(1, 1e-3)
learn.save_encoder('ft_enc')

learn = text_classifier_learner(data_clas, drop_mult=0.5)
learn.load_encoder('ft_enc')
learn.fit_one_cycle(1, 1e-2)

seb0 · November 26, 2018, 9:59am

I think your data is not in the correct format. Did you put the label in the expected position?
I can’t really see, because I don’t have your dataset (and I don’t want it, please just have a look by creating a pandas df and checking df.head()).

On another note: If you want people to help you, I suggest putting more work into formatting your code. Looks like you were pretty lazy and just copy-pasted stuff in here, which makes it hard to follow what happened.

alibaltschun · November 26, 2018, 10:24am

Thanks for helping.

My dataset is in CSV format with label and text columns.
I tried the CSV both with and without a header, but I still get the same error.

My CSV:

df = pd.DataFrame.from_csv(PATH_DATASET/'datatrain-en.csv',header=-1,index_col=False)
df.columns=['label','text']
df.label.value_counts()

3    18442
2    18346
5    10294
1     1479
0      923
Name: label, dtype: int64

df.head()
| |label|text|
|0|1|belfile upload tuga revisi fg|
|1|1|file upload tuga pre fg|
|2|1|file upload sisfo tuga|
|3|1|secur alert link googl account|
|4|1|secur alert link googl account|

seb0 · November 26, 2018, 10:28am

Yeah I’m not sure, but it looks like x == None at some point. Just clone the repo, install from local and debug using breakpoints. That’s what I would do.
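
Before stepping through the library, one cheap check is whether any row of the CSV has a missing label or text, since that is one possible source of a None value. The sketch below is only an illustration, not something from fastai: the helper name find_incomplete_rows is made up, and it assumes the headerless two-column label,text layout described above.

```python
import csv

def find_incomplete_rows(csv_path, n_cols=2):
    """Return (row_number, row) pairs whose label or text field is missing or blank.

    Hypothetical helper: assumes a headerless CSV with the label in the
    first column and the text in the second.
    """
    bad = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for i, row in enumerate(csv.reader(f)):
            # A short row or a blank cell means the label or the text is absent.
            if len(row) < n_cols or any(not cell.strip() for cell in row[:n_cols]):
                bad.append((i, row))
    return bad
```

Calling find_incomplete_rows on the path of datatrain-en.csv and getting an empty list back would rule out blank fields, though it says nothing about other ways the databunch could end up with a None.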

alibaltschun · November 26, 2018, 10:34am

During fitting, the training phase (first progress bar) works fine, but it is interrupted during the validation phase (second progress bar) of the same epoch.

alibaltschun · November 29, 2018, 5:25am

The problem was caused by an issue with my databunch. I fixed it by converting my dataset from CSV to an ImageNet-style folder layout and loading the data from folders, which solved it.
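
The conversion script itself was not posted. A minimal sketch of what such a conversion could look like, assuming the headerless label,text CSV from earlier in the thread and an out_dir/train/<label>/ and out_dir/valid/<label>/ layout; the function name csv_to_folders, the valid_pct split, and the file naming are all assumptions made for illustration:

```python
import csv
import random
from pathlib import Path

def csv_to_folders(csv_path, out_dir, valid_pct=0.2, seed=42):
    """Write each (label, text) row to out_dir/<train|valid>/<label>/<row_number>.txt.

    Hypothetical helper; returns the number of files written.
    """
    rng = random.Random(seed)  # fixed seed so the train/valid split is repeatable
    out_dir = Path(out_dir)
    written = 0
    with open(csv_path, newline="", encoding="utf-8") as f:
        for i, row in enumerate(csv.reader(f)):
            # Skip rows with a missing label or text instead of writing empty samples.
            if len(row) < 2 or not row[0].strip() or not row[1].strip():
                continue
            split = "valid" if rng.random() < valid_pct else "train"
            target = out_dir / split / row[0].strip()
            target.mkdir(parents=True, exist_ok=True)
            (target / f"{i}.txt").write_text(row[1], encoding="utf-8")
            written += 1
    return written

# Assumption: in fastai 1.0 the resulting folder can then be loaded with
# something like TextClasDataBunch.from_folder(out_dir, vocab=data_lm.train_ds.vocab, bs=32);
# check the signature against the installed version before relying on it.
```

Skipping incomplete rows is a design choice of this sketch: it keeps empty samples out of both splits, at the cost of silently dropping data, so it is worth comparing the returned count with the number of rows in the CSV.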