Bounding box dataset issue

ramon · December 8, 2018, 7:25am

I’m still struggling with getting an image bounding box dataset working.

I suspect my input is wrong, because data.show_batch() dies at these lines of code:

/usr/local/lib/python3.6/site-packages/fastai/torch_core.py in tensor(x, *rest)
68 # XXX: Pytorch bug in dataloader using num_workers>0; TODO: create repro and report
69 if is_listy(x) and len(x)==0: return tensor(0)
---> 70 return torch.tensor(x) if is_listy(x) else as_tensor(x)
71
72 def np_address(x:np.ndarray)->int:

TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: double, float, float16, int64, int32, and uint8.

At that point x is "[list([0, 0, 100, 100]) list([200, 200, 100, 100])]", i.e. a NumPy array of dtype object whose elements are Python lists, which indeed raises this error when passed to torch.tensor(x).
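For reference, here is a minimal sketch of how such an array can arise and how it differs from a regular numeric array. It uses only NumPy; the variable names and the way the array is built are my own reconstruction, not what fastai does internally.

```python
import numpy as np

# Hypothetical reconstruction of x: a 1-D array of dtype object
# whose two elements are plain Python lists.
x = np.empty(2, dtype=object)
x[0] = [0, 0, 100, 100]
x[1] = [200, 200, 100, 100]

print(x.dtype, x.shape)

# Rebuilding the array from the inner lists gives a regular
# 2-D numeric array, which is a dtype torch can convert.
boxes = np.array(list(x), dtype=np.float32)
print(boxes.dtype, boxes.shape)
```

The object array has shape (2,), while the rebuilt one has shape (2, 4), so the problem seems to be the container type rather than the box values themselves.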

I created the dataset like this:

data = (ObjectItemList.from_df(img_df, path, folder='')
        .random_split_by_pct()
        .label_from_func(get_y_func)
        .transform(get_transforms(), size=300, tfm_y=True)
        .databunch(bs=20, collate_fn=bb_pad_collate))

img_df is a dataframe with images. img_df.head() gives:

I have simplified get_y_func to debug more easily by returning a fixed list of bounding boxes and labels for each image:

get_y_func = lambda o: [[[0, 0, 100, 100], [200, 200, 100, 100]], ['text', 'text']]
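To rule out a malformed label, the same function can be written out in full and checked in isolation. This is only a sketch in plain Python: check_label is a hypothetical helper of mine, not part of fastai, and it only verifies the [boxes, labels] structure shown above.

```python
def get_y_func(o):
    # Fixed debug output: two boxes and one label per box,
    # regardless of which image `o` refers to.
    boxes = [[0, 0, 100, 100], [200, 200, 100, 100]]
    labels = ['text', 'text']
    return [boxes, labels]


def check_label(y):
    # Hypothetical sanity check on the [boxes, labels] structure.
    boxes, labels = y
    assert len(boxes) == len(labels), "need one label per box"
    for box in boxes:
        assert len(box) == 4, "each box needs four coordinates"
        assert all(isinstance(v, (int, float)) for v in box)
    return True


print(check_label(get_y_func(None)))
```

If this check passes, the structure returned by the label function is at least consistent, and the object array is presumably created later in the pipeline.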

Any suggestions/tips/help is welcome, thanks so much in advance!