TextLMDataBunch.show_batch errors with batch size smaller than rows

KevinB · November 21, 2018, 9:40pm

It looks like TextLMDataBunch.show_batch is actually being taken out of the code in the most recent versions, so none of this may be helpful.

I have been working through an issue where calling show_batch() on a TextLMDataBunch raised the following error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-48-f346687833ba> in <module>
----> 1 data_lm.show_batch()

~/.conda/envs/kbird/lib/python3.6/site-packages/fastai/text/data.py in show_batch(self, sep, ds_type, rows, max_len)
    227         items = [['idx','text']]
    228         for i in range(rows):
--> 229             inp = self.x[:,i] if max_len is None else x[:,i][:max_len]
    230             items.append([str(i), self.train_ds.vocab.textify(inp.cpu(), sep=sep)])
    231         display(HTML(_text2html_table(items, [5,95])))

IndexError: index 9 is out of bounds for dimension 1 with size 9

So after digging into it a bit, I found this to be the culprit (You would think I could just look at the arrow, but for me, it required digging):

inp = self.x[:,i] if max_len is None else x[:,i][:max_len]

A few things I want to check on here:
#1. Is there a reason that the first part uses self.x while the second uses just x, which is generated above?
#2. Would it make sense to instead transpose x at an earlier point?
#3. If there isn’t enough text to display {rows} rows, do we want to just display as many as one iteration provides, or would it be better to keep iterating through the dataloader until reaching the end?

The simplest way to deal with #3 would be to load just one batch and, if it isn’t big enough, use its size as rows instead. Here are my total proposed changes:

def show_batch(self, sep=' ', ds_type:DatasetType=DatasetType.Train, rows:int=10, max_len:int=100):
    "Show `rows` texts from a batch of `ds_type`, tokens are joined with `sep`, truncated at `max_len`."
    from IPython.display import display, HTML
    dl = self.dl(ds_type)
    x,y = next(iter(dl))
    # Transpose so that the batch dimension comes first and x[i] is one text
    x = x.transpose(0,1)
    items = [['idx','text']]
    # Never try to show more rows than the batch contains
    rows = min(rows, x.shape[0])
    for i in range(rows):
        inp = x[i] if max_len is None else x[i][:max_len]
        items.append([str(i), self.train_ds.vocab.textify(inp.cpu(), sep=sep)])
    display(HTML(_text2html_table(items, [5,95])))

This probably isn’t useful anymore, but it may at least help somebody who sees that error on the current version.
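To show the clamping idea in isolation, here is a minimal sketch that uses plain Python lists in place of the batch tensor. The helper `rows_to_show` and the toy `batch` are hypothetical names made up for illustration; they are not fastai code.

```python
def rows_to_show(batch, rows=10, max_len=None):
    """Return up to `rows` sequences from `batch`, each cut to `max_len` tokens.

    `batch` is a list of sequences (one per batch item), standing in for the
    transposed tensor `x` above. Hypothetical helper, not part of fastai.
    """
    rows = min(rows, len(batch))  # never ask for more rows than the batch holds
    return [seq if max_len is None else seq[:max_len] for seq in batch[:rows]]


# A batch of 9 sequences, as in the traceback (size 9), with the default rows=10.
batch = [[i, i + 1, i + 2, i + 3] for i in range(9)]
shown = rows_to_show(batch, rows=10, max_len=2)
print(len(shown))   # 9
print(shown[0])     # [0, 1]
```

Without the `min`, asking for row index 9 of a 9-item batch is what produces the IndexError shown earlier.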

Here is my current install:

=== Software === 
python version  : 3.6.6
fastai version  : 1.0.28
torch version   : 1.0.0.dev20181029
nvidia driver   : 396.37
torch cuda ver  : 9.2.148
torch cuda is   : available
torch cudnn ver : 7104
torch cudnn is  : enabled

=== Hardware === 
nvidia gpus     : 1
torch available : 1
  - gpu0        : 16270MB | Quadro P5000

=== Environment === 
platform        : Linux-3.10.0-862.11.6.el7.x86_64-x86_64-with-centos-7.5.1804-Core
distro          : #1 SMP Tue Aug 14 21:49:04 UTC 2018
conda env       : kbird
python          : /home/kbird/.conda/envs/kbird/bin/python
sys.path        : 
/home/kbird/.conda/envs/kbird/lib/python36.zip
/home/kbird/.conda/envs/kbird/lib/python3.6
/home/kbird/.conda/envs/kbird/lib/python3.6/lib-dynload
/home/kbird/.local/lib/python3.6/site-packages
/home/kbird/.conda/envs/kbird/lib/python3.6/site-packages
/home/kbird/.conda/envs/kbird/lib/python3.6/site-packages/IPython/extensions
/home/kbird/.ipython

quodatlas · November 29, 2018, 9:00pm

Thanks for this!

I’ve got a show_batch error as well, in conjunction with CollabDataBunch and the standard lesson4-collab notebook.

While revisiting lesson4-collab, I encountered the following error when running data.show_batch():

IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number

This is with the latest GitHub repository and fastai library version (1.0.30).

Before going down the rabbit hole of debugging, I just wanted to check whether show_batch is being deprecated, as seems to be the case for text, or whether there is an easy fix I am missing.

Thanks for any help!

P.S. Full error below:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-64-08d0dd5c9f7c> in <module>()
----> 1 data.show_batch()

/anaconda/envs/py36/lib/python3.6/site-packages/fastai/basic_data.py in show_batch(self, rows, ds_type, **kwargs)
    155         x,y = self.one_batch(ds_type, True, True)
    156         if self.train_ds.x._square_show: rows = rows ** 2
--> 157         xs = [self.train_ds.x.reconstruct(grab_idx(x, i, self._batch_first)) for i in range(rows)]
    158         #TODO: get rid of has_arg if possible
    159         if has_arg(self.train_ds.y.reconstruct, 'x'):

/anaconda/envs/py36/lib/python3.6/site-packages/fastai/basic_data.py in <listcomp>(.0)
    155         x,y = self.one_batch(ds_type, True, True)
    156         if self.train_ds.x._square_show: rows = rows ** 2
--> 157         xs = [self.train_ds.x.reconstruct(grab_idx(x, i, self._batch_first)) for i in range(rows)]
    158         #TODO: get rid of has_arg if possible
    159         if has_arg(self.train_ds.y.reconstruct, 'x'):

/anaconda/envs/py36/lib/python3.6/site-packages/fastai/tabular/data.py in reconstruct(self, t)
    122 
    123     def reconstruct(self, t:Tensor):
--> 124         return self._item_cls(t[0], t[1], self.classes, self.col_names)
    125 
    126     def show_xys(self, xs, ys)->None:

/anaconda/envs/py36/lib/python3.6/site-packages/fastai/collab.py in __init__(self, cats, conts, classes, names)
     12     def __init__(self, cats, conts, classes, names):
     13         super().__init__(cats, conts, classes, names)
---> 14         self.data = [self.data[0][0],self.data[0][1]]
     15 
     16 class CollabList(TabularList): _item_cls,_label_cls = CollabLine,FloatList

IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number

sgugger · November 29, 2018, 9:59pm

Nope, that was a bug, thanks for flagging! It’s now fixed in master.

quodatlas · November 29, 2018, 10:01pm

Thanks so much!

quodatlas · November 29, 2018, 10:47pm

Forgive one more newbie question: I am using conda to install fastai. I notice that the bug has been fixed in the main repo, but the conda channel does not have the fix yet. How often are changes pushed to the channel?

Or rather, should I follow the developer directions in the README and install a local version of the fastai library?

Thanks so much for any time you can spare, and apologies if I’ve missed something.

sgugger · November 29, 2018, 11:10pm

I don’t know when the next release will be. Until then, you should use a developer install from the repo if you want the fix.

KevinB · November 30, 2018, 12:02am

Is the release automatic or do you guys determine when releases happen? Just kind of curious what that process looks like.

sgugger · November 30, 2018, 12:03am

Jeremy decides it.

gob · December 5, 2018, 6:38pm

How do I set the batch size? It sounds easy, of course, but it has not been for me. My GPU runs out of memory with the default bs.

Please bear with me, I may just be missing an easy solution. Here’s what I have:

The tricky thing: there is no bs param in from_csv() that I can find. No bs param in language_model_learner() that I can find. No bs param in the Also, I get an attribute error when setting the property directly on the already-instantiated data_lm or the learner.

I could fit more in memory by lowering bptt, but I’d rather keep bptt nice and big and reduce the bs instead.

Thanks for any tips.

Clarification: I am currently using TextLMDataBunch.from_csv() and language_model_learner().

AbuFadl · December 5, 2018, 7:48pm

You can pass bs=n to the from_csv method. On Colab, I use 32 (higher values seem to eat all the CUDA memory - usually less than 3 GB max).

gob · December 6, 2018, 1:44pm

I wish I could. The bs parameter is actually missing from the pop-up (Shift-Tab) docs. Are you saying it’s there anyway? Maybe the docs are just mistaken? Could be, and I’ll try that. I’m not sure whether they were auto-generated by reflection on the actual code or written by hand and thus liable to get out of sync. Here are the actual pop-up docs:

Signature: TextLMDataBunch.from_csv(path:Union[pathlib.Path, str], csv_name, valid_pct:float=0.2, test:Union[str, NoneType]=None, tokenizer:fastai.text.transform.Tokenizer=None, vocab:fastai.text.transform.Vocab=None, classes:Collection[str]=None, header='infer', text_cols:Union[int, Collection[int], str, Collection[str]]=1, label_cols:Union[int, Collection[int], str, Collection[str]]=0, label_delim:str=None, **kwargs) -> fastai.basic_data.DataBunch
Docstring: Create a TextDataBunch from texts in csv files.

That last param, **kwargs, is interesting now that I look at it all again. This could explain it. It takes literally anything as a parameter, right? I could pass my dog to it. Then, if dogs aren’t allowed, I’d get a runtime error. So that’s my latest hypothesis, and I’m off to do some more experimentation to see if it works that way with bs.

AbuFadl · December 6, 2018, 4:24pm

Yes, it’s in **kwargs and won’t pop up in autocompletion.
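For anyone else puzzled by this, here is a small sketch of how a keyword argument can be accepted without appearing in the signature. Both functions are toy stand-ins invented for illustration (they are not fastai internals), and the assumption is simply that the outer function forwards its `**kwargs` to an inner one that declares `bs`.

```python
import inspect


def create_databunch(train, valid, bs=64):
    # Hypothetical inner factory: this is where `bs` is actually declared.
    return {"train": train, "valid": valid, "bs": bs}


def from_csv(path, csv_name, valid_pct=0.2, **kwargs):
    # Hypothetical outer factory: it does not name `bs`, it just forwards
    # every extra keyword argument to the inner function.
    return create_databunch(train=f"{path}/{csv_name}", valid=valid_pct, **kwargs)


# The signature only shows **kwargs, so `bs` is invisible to tooltips...
print("bs" in inspect.signature(from_csv).parameters)  # False

# ...but it is still accepted and reaches the inner function.
print(from_csv("data", "texts.csv", bs=32)["bs"])  # 32

# An unsupported keyword only fails at call time, inside the inner function.
try:
    from_csv("data", "texts.csv", dog="Rex")
except TypeError as err:
    print("TypeError:", err)
```

So the dog hypothesis holds in this toy version: anything can be passed, and a keyword the inner function does not know is rejected only when the call runs.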

cbaumgartner · January 7, 2019, 11:00pm

@sgugger I think I’m having a similar problem:

I’m working with image data (pictures of 1st, 2nd, and 3rd degree burns) split into three folders with folder-names: 1, 2, and 3. This works just fine:

data = (ImageDataBunch.from_folder(PATH, ds_tfms = get_transforms(flip_vert = True), 
                              valid_pct = 0.15, size = sz, bs = bs))

but the classes are in the wrong order (which messes with the confusion matrix):

data.classes
['1', '3', '2']

So I decided to use the data block API so I could make use of .label_from_list() as follows:

# using the datablock API
data = (ImageItemList.from_folder(PATH)
       .random_split_by_pct(valid_pct = 0.15)
       .label_from_list(['1', '2', '3'])
       .transform(get_transforms(flip_vert = True))
       .databunch())

The classes are now in the correct order:

data.classes
['1', '2', '3']

But when I run:

data.show_batch(3, figsize=(12,6), hide_axis=True)

I now get the following error (which wasn’t happening without the data block API):

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-177-f70d393d3ddc> in <module>
  1 
----> 2 data.show_batch(3, figsize=(12,6), hide_axis=True)

/opt/conda/envs/fastai/lib/python3.6/site-packages/fastai/basic_data.py in show_batch(self, rows, ds_type, **kwargs)
119         if rows is None: rows = int(math.sqrt(len(b_idx)))
120         ds = dl.dataset
--> 121         ds[0][0].show_batch(b_idx, rows, ds, **kwargs)
122 
123     @property

/opt/conda/envs/fastai/lib/python3.6/site-packages/fastai/vision/image.py in show_batch(self, idxs, rows, ds, figsize, **kwargs)
242         fig, axs = plt.subplots(rows,rows,figsize=figsize)
243         for i, ax in zip(idxs[:rows*rows], axs.flatten()):
--> 244             x,y = ds[i]
245             x.show(ax=ax, y=y, **kwargs)
246         plt.tight_layout()

/opt/conda/envs/fastai/lib/python3.6/site-packages/fastai/data_block.py in __getitem__(self, idxs)
352     def __getitem__(self,idxs:Union[int,np.ndarray])->'LabelList':
353         if isinstance(try_int(idxs), int):
--> 354             if self.item is None: x,y = self.x[idxs],self.y[idxs]
355             else:                 x,y = self.item   ,0
356             if self.tfms:

/opt/conda/envs/fastai/lib/python3.6/site-packages/fastai/data_block.py in __getitem__(self, idxs)
 77 
 78     def __getitem__(self,idxs:int)->Any:
---> 79         if isinstance(try_int(idxs), int): return self.get(idxs)
 80         else: return self.new(self.items[idxs], xtra=index_row(self.xtra, idxs))
 81 

/opt/conda/envs/fastai/lib/python3.6/site-packages/fastai/data_block.py in get(self, i)
226 
227     def get(self, i):
--> 228         o = super().get(i)
229         return self._item_cls.create(o, self.class2idx)
230 

/opt/conda/envs/fastai/lib/python3.6/site-packages/fastai/data_block.py in get(self, i)
 53     def __repr__(self)->str: return f'{self.__class__.__name__} ({len(self)} items)\n{self.items}\nPath: {self.path}'
 54     def get(self, i)->Any:
---> 55         item = self.items[i]
 56         return self.create_func(item) if self.create_func else item
 57 

IndexError: index 57 is out of bounds for axis 0 with size 3

I think ‘Index 57’ is just the index of the random image that the ‘show_batch’ method wants to grab from the data (the number ‘57’ is different every time I run the command, but always between 0 and the number of images in my dataset), and ‘axis 0 with size 3’ refers to the number of classes in my dataset: 3.

I think something is just going wrong with axis mismatching. Any idea?

sgugger · January 8, 2019, 12:57pm

Did you update your fastai to the latest version? Classes are now sorted so even with ImageDataBunch.bla you should get them in order.
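As a quick illustration of what sorted classes look like for folder names such as these, here is a sketch using Python's built-in `sorted`. Whether fastai sorts class names exactly this way is an assumption; the point is only that names stored as strings sort lexicographically rather than numerically.

```python
classes = ['1', '3', '2']
print(sorted(classes))                     # ['1', '2', '3']

# Folder names are strings, so the order is lexicographic, not numeric.
print(sorted(['1', '2', '10']))            # ['1', '10', '2']

# Sorting by integer value gives the numeric order instead.
print(sorted(['1', '2', '10'], key=int))   # ['1', '2', '10']
```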

cbaumgartner · January 8, 2019, 3:42pm

I upgraded. Classes are fixed, but now I’m running into a number of other issues. Is there a stable version of the library that you would recommend?

sgugger · January 8, 2019, 4:39pm

The latest release (1.0.39) is the most stable I know. There have been a few breaking changes in the APIs (see the changelog for more details).

cbaumgartner · January 8, 2019, 4:42pm

Update:

I’m using a paperspace instance and upgraded to the latest version this morning.

The non-datablock API method works just fine, and the classes are now in the correct order.

I’m still running into the index out of bounds error when using the data block API and looking at a batch with data.show_batch. The model trains just fine on the data created via the data block method, so I don’t think it’s a problem with the underlying data. Is there anything else I might be doing wrong, or is there a bug in the code?

cbaumgartner · January 8, 2019, 5:07pm

I figured out the source of the problem.

The problem occurs when I use .label_from_list(['1','2','3']); when I use .label_from_folder(), everything works. I probably need to dive deeper into the .label_from_list() code to see what’s actually going on.

Thanks for the help!

sgugger · January 8, 2019, 5:40pm

label_from_list requires a generator or an ItemList with the same length as your items, so I’m guessing there should be more than 3 in your case.
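To make the length requirement concrete, here is a sketch in plain Python of building one label per item from each file's parent folder name. The file paths are hypothetical stand-ins for whatever is found under PATH, and nothing here calls fastai; how such a list has to line up with the train/validation split is not covered.

```python
from pathlib import PurePosixPath

# Hypothetical file list standing in for the items found under PATH.
items = [
    PurePosixPath(f"burns/{cls}/img_{n}.jpg")
    for cls in ("1", "2", "3")
    for n in range(20)
]

# One label per item, taken from the parent folder name.
labels = [p.parent.name for p in items]
print(len(items), len(labels))   # 60 60
print(labels[57])                # 3

# A list holding only the class names is too short to index by item.
class_names = ["1", "2", "3"]
try:
    class_names[57]
except IndexError as err:
    print("IndexError:", err)
```

Indexing the 3-element list with an item index such as 57 fails in the same way as the traceback above, which reports index 57 out of bounds for an axis of size 3.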

cbaumgartner · January 8, 2019, 6:14pm

Gotcha, I should have looked at the label_from_list docs more closely. Thanks for clarifying.

Two questions:

1. When using label_from_list, if the data directory contains a bunch of data (such as image files), does label_from_list assign labels sequentially in the order the data appears in the folder? (Assuming the list passed to label_from_list is of the proper length.)
2. If there are subdirectories in the data directory (i.e. data/folder_1, data/folder_2, data/folder_3), can we still use label_from_list, and if yes, are the labels assigned recursively and sequentially according to the order of the subdirectories in the data directory and the order of the data in the subdirectories? (Again assuming the list passed to label_from_list is of the proper length.)

Thanks.