Error with TextClasDataBunch.from_df() in Google Colab

Hi all,

Since roughly yesterday, the error below has started to occur in Google Colab with the following code:

data_clas = TextClasDataBunch.from_df(path,
                                      train_df=df[train_bool],
                                      valid_df=df[~train_bool],
                                      tokenizer=tokenizer,
                                      text_cols=0,
                                      bs=24,
                                      vocab=data_lm.vocab,
                                      max_vocab=35000,
                                      label_cols=1)

BrokenProcessPool Traceback (most recent call last)
in ()
7 vocab=data_lm.vocab,
8 max_vocab=35000,
----> 9 label_cols=1)

10 frames
/usr/local/lib/python3.6/dist-packages/fastai/text/data.py in from_df(cls, path, train_df, valid_df, test_df, tokenizer, vocab, classes, text_cols, label_cols, label_delim, chunksize, max_vocab, min_freq, mark_fields, include_bos, include_eos, **kwargs)
202 else:
203 if label_delim is not None: src = src.label_from_df(cols=label_cols, classes=classes, label_delim=label_delim)
--> 204 else: src = src.label_from_df(cols=label_cols, classes=classes)
205 if test_df is not None: src.add_test(TextList.from_df(test_df, path, cols=text_cols))
206 return src.databunch(**kwargs)

/usr/local/lib/python3.6/dist-packages/fastai/data_block.py in _inner(*args, **kwargs)
475 self.valid = fv(*args, from_item_lists=True, **kwargs)
476 self.__class__ = LabelLists
--> 477 self.process()
478 return self
479 return _inner

/usr/local/lib/python3.6/dist-packages/fastai/data_block.py in process(self)
529 "Process the inner datasets."
530 xp,yp = self.get_processors()
--> 531 for ds,n in zip(self.lists, ['train','valid','test']): ds.process(xp, yp, name=n)
532 #progress_bar clear the outputs so in some case warnings issued during processing disappear.
533 for ds in self.lists:

/usr/local/lib/python3.6/dist-packages/fastai/data_block.py in process(self, xp, yp, name)
709 p.warns = []
710 self.x,self.y = self.x[~filt],self.y[~filt]
--> 711 self.x.process(xp)
712 return self
713

/usr/local/lib/python3.6/dist-packages/fastai/data_block.py in process(self, processor)
81 if processor is not None: self.processor = processor
82 self.processor = listify(self.processor)
---> 83 for p in self.processor: p.process(self)
84 return self
85

/usr/local/lib/python3.6/dist-packages/fastai/text/data.py in process(self, ds)
294 tokens = []
295 for i in progress_bar(range(0,len(ds),self.chunksize), leave=False):
--> 296 tokens += self.tokenizer.process_all(ds.items[i:i+self.chunksize])
297 ds.items = tokens
298

/usr/local/lib/python3.6/dist-packages/fastai/text/transform.py in process_all(self, texts)
118 if self.n_cpus <= 1: return self._process_all_1(texts)
119 with ProcessPoolExecutor(self.n_cpus) as e:
--> 120 return sum(e.map(self._process_all_1, partition_by_cores(texts, self.n_cpus)), [])
121
122 class Vocab():

/usr/lib/python3.6/concurrent/futures/process.py in _chain_from_iterable_of_lists(iterable)
364 careful not to keep references to yielded objects.
365 """
--> 366 for element in iterable:
367 element.reverse()
368 while element:

/usr/lib/python3.6/concurrent/futures/_base.py in result_iterator()
584 # Careful not to keep a reference to the popped future
585 if timeout is None:
--> 586 yield fs.pop().result()
587 else:
588 yield fs.pop().result(end_time - time.monotonic())

/usr/lib/python3.6/concurrent/futures/_base.py in result(self, timeout)
430 raise CancelledError()
431 elif self._state == FINISHED:
--> 432 return self.__get_result()
433 else:
434 raise TimeoutError()

/usr/lib/python3.6/concurrent/futures/_base.py in __get_result(self)
382 def __get_result(self):
383 if self._exception:
--> 384 raise self._exception
385 else:
386 return self._result

BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.

The first cell in the notebook contains:

!curl -s https://course.fast.ai/setup/colab | bash

Is it possible any update could have caused this issue?
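For reference, the versions pulled in by that setup cell can be checked with something like this (a minimal sketch):

import fastai, torch
# Compare these against a previously working session to see whether the
# setup script has started installing newer packages.
print(fastai.__version__)
print(torch.__version__)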

Thanks in advance,
Monique

Any update? How can I file this issue on GitHub?

How large of a dataset were you using? Were you getting RAM warnings? I moved my text work over to Paperspace because fastai uses a lot of RAM when tokenizing (not sure whether that has been adjusted recently).

How large of a dataset were you using?

It’s a 78MB Excel file.

Were you getting RAM warnings?
No.

I moved my text work over to Paperspace because fastai uses a lot of RAM when tokenizing (not sure whether that has been adjusted recently).

BTW, did you test GCP? (I had CUDA memory errors in GCP with an even smaller dataset…)

I have a problem with text in Colab too. In Kaggle it works well, and it looks like both use a P100 GPU.

TypeError: can't convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
full error
nb

I tried lowering the batch size and setting CUDA_LAUNCH_BLOCKING to 1 to see if there was another error. Kaggle uses fastai 1.0.59.
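Since Kaggle is still on fastai 1.0.59 while Colab now installs 1.0.60, one thing worth testing is pinning the older version in the first cell and restarting the runtime, just to see whether the version difference matters (a sketch, assuming pip manages fastai on Colab):

!pip install "fastai==1.0.59"  # then Runtime -> Restart runtime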

=== Software === 
python        : 3.6.9
fastai        : 1.0.60
fastprogress  : 0.2.2
torch         : 1.3.1
nvidia driver : 418.67
torch cuda    : 10.1.243 / is available
torch cudnn   : 7603 / is enabled

=== Hardware === 
nvidia gpus   : 1
torch devices : 1
  - gpu0      : 16280MB | Tesla P100-PCIE-16GB

=== Environment === 
platform      : Linux-4.14.137+-x86_64-with-Ubuntu-18.04-bionic
distro        : #1 SMP Thu Aug 8 02:47:02 PDT 2019
conda env     : Unknown
python        : /usr/bin/python3
sys.path      : 
/env/python
/usr/lib/python36.zip
/usr/lib/python3.6
/usr/lib/python3.6/lib-dynload
/usr/local/lib/python3.6/dist-packages
/usr/lib/python3/dist-packages
/usr/local/lib/python3.6/dist-packages/IPython/extensions
/root/.ipython

Tue Jan 14 19:02:09 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.44       Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   48C    P0    35W / 250W |  13527MiB / 16280MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

Update
It seems the problem is in SaveModelCallback, defined as:
callback_fns=[partial(SaveModelCallback, monitor="perplexity", mode="min", name="best_model"),
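For context, the callback is attached via callback_fns roughly like this (a sketch; the learner type, data_lm, and the "perplexity" metric reported by the Recorder are assumptions, not the exact notebook code):

from functools import partial
from fastai.text import *              # language_model_learner, AWD_LSTM
from fastai.callbacks import SaveModelCallback

# callback_fns re-instantiates the callback on every fit() call; monitor must
# match the name of a metric the Recorder already reports.
learn = language_model_learner(
    data_lm, AWD_LSTM,
    callback_fns=[partial(SaveModelCallback,
                          monitor="perplexity",
                          mode="min",
                          name="best_model")])
learn.fit_one_cycle(1, 1e-2)            # best weights are saved to models/best_model.pth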

Hi, I am also facing the same issue with TextClasDataBunch on Google Colab. What is the solution? I tried setting defaults.cpus=1, as suggested in some posts, to make the operation single-threaded (roughly as sketched below), but it did not help.
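For reference, the workaround looks roughly like this (a sketch; the tokenizer is then passed to TextClasDataBunch.from_df as in the original post):

from fastai.text import *

# Attempted workaround: force single-process tokenization so that
# Tokenizer.process_all never goes through ProcessPoolExecutor.
defaults.cpus = 1                 # global default number of worker processes
tokenizer = Tokenizer(n_cpus=1)   # or pin it explicitly on the tokenizer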

Thanks and Regards,
-Hari