'bool' object is not callable for TextList data block

I’m having issues with the new data block api.

Here is the call I’m trying to use:

data_clas = (TextList.from_folder(path, vocab=data_lm.vocab)
            .split_by_folder(valid='valid')
            .label_from_folder(classes=['neg','pos'])
            .filter_missing_y()
            .databunch(bs=bs))

and this is my error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-138-cffcd242252b> in <module>
      1 data_clas = (TextList.from_folder(path, vocab=data_lm.vocab)
      2             .split_by_folder(valid='valid')
----> 3             .label_from_folder(classes=['neg','pos'])
      4             .filter_missing_y()
      5             .databunch(bs=bs))

TypeError: 'bool' object is not callable

This is what my directory structure looks like:

[image: screenshot of the directory structure]

and inside of the neg and pos directory are .txt files.

I’m not really sure how to debug this further, since the error is raised inside the data block. Should I go back to the older way of loading data?

Here is my version information:

=== Software === 
python version  : 3.6.6
fastai version  : 1.0.27
torch version   : 1.0.0.dev20181029
nvidia driver   : 396.37
torch cuda ver  : 9.2.148
torch cuda is   : available
torch cudnn ver : 7104
torch cudnn is  : enabled

=== Hardware === 
nvidia gpus     : 1
torch available : 1
  - gpu0        : 16270MB | Quadro P5000

=== Environment === 
platform        : Linux-3.10.0-862.11.6.el7.x86_64-x86_64-with-centos-7.5.1804-Core
distro          : #1 SMP Tue Aug 14 21:49:04 UTC 2018
conda env       : kbird
python          : /home/kbird/.conda/envs/kbird/bin/python
sys.path        : 
/home/kbird/.conda/envs/kbird/lib/python36.zip
/home/kbird/.conda/envs/kbird/lib/python3.6
/home/kbird/.conda/envs/kbird/lib/python3.6/lib-dynload
/home/kbird/.local/lib/python3.6/site-packages
/home/kbird/.conda/envs/kbird/lib/python3.6/site-packages
/home/kbird/.conda/envs/kbird/lib/python3.6/site-packages/IPython/extensions
/home/kbird/.ipython

Hello, I’m getting the same error even with the updated fastai version (1.0.28). Did you find any solution?
BTW, how do you obtain all the versions of the libraries installed in your VM instance?

First, import the utils library from fastai:

from fastai.utils import *

Then the command to see the versions is:

show_install()

This also has an optional show_nvidia_smi parameter that will show your nvidia-smi output as well.

I haven’t found a solution to this yet. Are you running your own dataset or the imdb one?


Thanks for the info. About the issue, I’m running on the IMDB dataset, with no modifications on my side. I think this post should be moved to the fastai course v3 section to get more visibility.


You’re probably correct. I was going to leave it here since I was just having issues on my own dataset, but if you are seeing the same thing with the imdb data, it probably is best to put it over there to see if other people are seeing the same issue.

hi Kevin,

is this resolved? thank you! :slight_smile:

Not yet. I might try to dig into it, but I don’t really know how to step through the issue.

I used an earlier version of the notebook that does not have the filter_missing_y call, and it works. I’m not sure what that call does; classes=['neg', 'pos'] already seems to select only neg/pos and not unsup.

what do you think?

data_clas = (TextList.from_folder(path, vocab=data_lm.vocab)
             #grab all the text files in path
             .split_by_folder(valid='test')
             #split by train and valid folder (that only keeps 'train' and 'test' so no need to filter)
             .label_from_folder(classes=['neg', 'pos'])
             #remove docs with labels not in above list (i.e. 'unsup')
             .filter_missing_y()
             #label them all with their folders
             .databunch(bs=bs))

I think you’re onto something. When I comment out .filter_missing_y(), it doesn’t error.

I was searching the library’s GitHub repo and found that filter_missing_y is an attribute of the TextList class (set to True in __init__), so it can’t be called, right?

class TextList(ItemList):
    _bunch = TextClasDataBunch
    _processor = [TokenizeProcessor, NumericalizeProcessor]

    def __init__(self, items:Iterator, vocab:Vocab=None, **kwargs):
        self.filter_missing_y = True
        super().__init__(items, **kwargs)
        self.vocab = vocab

GitHub for data.py
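To illustrate the general Python mechanism (this is only a sketch, not fastai's real code), here is a minimal reproduction. The class names ItemListSketch and TextListSketch and the helper try_call are hypothetical stand-ins. It assumes the error comes from an instance attribute named filter_missing_y holding True where the chained call expects a method.

```python
class ItemListSketch:
    """Hypothetical stand-in for a base list class (not fastai's code)."""

    def filter_missing_y(self):
        # A normal method: calling it works as expected.
        return "filtered"


class TextListSketch(ItemListSketch):
    """Hypothetical subclass that stores a flag under the same name."""

    def __init__(self):
        # The instance attribute shadows the inherited method,
        # so obj.filter_missing_y now evaluates to the bool True.
        self.filter_missing_y = True


def try_call(obj):
    """Call obj.filter_missing_y() and return the result or the error text."""
    try:
        return obj.filter_missing_y()
    except TypeError as err:
        return str(err)


base_result = try_call(ItemListSketch())
shadowed_result = try_call(TextListSketch())

print(base_result)      # filtered
print(shadowed_result)  # 'bool' object is not callable
```

Attribute lookup finds the instance attribute before the class method, which would explain why calling .filter_missing_y() fails while assigning to it does not.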


I just tested this and it worked:

text_list = TextList.from_folder(path, vocab=data_lm.vocab)
text_list_split_by_folder = text_list.split_by_folder(valid='test')
text_list_labeled_by_folder = text_list_split_by_folder.label_from_folder(classes=['neg', 'pos'])

text_list_labeled_by_folder.filter_missing_y = True

data_clas = text_list_labeled_by_folder.databunch(bs=bs)
data_clas.save('tmp_clas')

I’ve been looking for the updated pretrained wt103 weights as .pth files. I found them here:

http://files.fast.ai/models/wt103_v1/

That’s correct. That’s why when you separate out each operation into separate lines, it works correctly (as suggested by @fkautz).
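For anyone wondering how to walk through a chained call like this, one option is to apply the steps one at a time and check each attribute with callable() before calling it. The sketch below uses a hypothetical FakeList class and a hypothetical run_chain helper, neither of which is part of fastai. It only demonstrates the debugging idea on a stand-in object.

```python
def run_chain(obj, steps):
    """Apply (method_name, kwargs) steps in order.

    Stops at the first attribute that is not callable and returns
    (object_so_far, problem_description). problem is None on success.
    """
    for name, kwargs in steps:
        attr = getattr(obj, name)
        if not callable(attr):
            kind = type(attr).__name__
            return obj, f"{name} is a {kind} attribute, not a method"
        obj = attr(**kwargs)
    return obj, None


class FakeList:
    """Hypothetical stand-in for a data block object (not fastai's code)."""

    def __init__(self):
        self.filter_missing_y = True  # a flag, not a method
        self.calls = []               # records which steps actually ran

    def split_by_folder(self, valid):
        self.calls.append("split_by_folder")
        return self

    def label_from_folder(self, classes):
        self.calls.append("label_from_folder")
        return self


steps = [
    ("split_by_folder", {"valid": "test"}),
    ("label_from_folder", {"classes": ["neg", "pos"]}),
    ("filter_missing_y", {}),
]

result, problem = run_chain(FakeList(), steps)
print(result.calls)  # ['split_by_folder', 'label_from_folder']
print(problem)       # filter_missing_y is a bool attribute, not a method
```

On a real object, the same check can be done by hand in a notebook cell: assign each intermediate result to a variable and inspect type(obj.filter_missing_y) before calling it.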