Multiple Object Detection with pascal notebook

I’m following along the pascal notebook with my own data and am running into an issue during this step:

def get_data(bs, size):
    # build the ObjectItemList from the training folder
    src = ObjectItemList.from_folder('C:/me/Desktop/Bbox/Train')
    src = src.split_by_files(val_images)       # validation = files listed in val_images
    src = src.label_from_func(get_y_func)      # look up bboxes/classes per image
    src = src.transform(get_transforms(), size=size, tfm_y=True)
    return src.databunch(path=path, bs=bs, collate_fn=bb_pad_collate)

Running ‘data = get_data(8, 2048)’ throws a warning that my training set is empty, and I’m not sure why.
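One thing I plan to check is whether the names in val_images actually line up with the files fastai finds, since split_by_files matches items by their file name. A rough sketch (assuming val_images is a plain list of file names):

from fastai.vision import *

# Count how many of the files on disk are claimed by val_images;
# if this equals the total, the training split ends up empty.
items = ObjectItemList.from_folder('C:/me/Desktop/Bbox/Train')
names = set(o.name for o in items.items)
print(len(items), 'images found')
print(len(names & set(val_images)), 'of them appear in val_images')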

Running data.show_batch(rows=3) yields this error:


StopIteration                             Traceback (most recent call last)
<ipython-input-36-5d56d267491c> in <module>
  1 #show pictures
----> 2 data.show_batch(rows=3)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\fastai\basic_data.py in show_batch(self, rows, ds_type, reverse, **kwargs)
183     def show_batch(self, rows:int=5, ds_type:DatasetType=DatasetType.Train, reverse:bool=False, **kwargs)->None:
184         "Show a batch of data in `ds_type` on a few `rows`."
--> 185         x,y = self.one_batch(ds_type, True, True)
186         if reverse: x,y = x.flip(0),y.flip(0)
187         n_items = rows **2 if self.train_ds.x._square_show else rows

~\AppData\Local\Continuum\anaconda3\lib\site-packages\fastai\basic_data.py in one_batch(self, ds_type, detach, denorm, cpu)
166         w = self.num_workers
167         self.num_workers = 0
--> 168         try:     x,y = next(iter(dl))
169         finally: self.num_workers = w
170         if detach: x,y = to_detach(x,cpu=cpu),to_detach(y,cpu=cpu)

StopIteration: 

So, I tried src = src.split_by_rand_pct() instead of src = src.split_by_files(val_images) inside the get_data function. This avoids the empty-training-set warning and the StopIteration error, and even shows the pictures with bounding boxes. The issue now is that some of them aren’t showing the correct locations of the boxes; it looks as though they have been shuffled around, and I’m not sure why. Possibly from the random split?
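To see whether the split is really to blame, I can look at what get_y_func returns for a single file before any split or transform. A rough sketch (items is the un-split ObjectItemList from the check above, and I’m assuming get_y_func returns a ([bboxes], [classes]) pair per image):

from fastai.vision import open_image

fn = items.items[0]                  # one image Path
bboxes, classes = get_y_func(fn)     # raw label lookup, no split or shuffle involved
img = open_image(fn)
print(fn.name, 'image (h, w) =', tuple(img.size))
print('boxes   =', bboxes)
print('classes =', classes)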

Anyways, I am stuck here and any help would be much appreciated. Thanks!

Update: I ran some tests on the incorrect bbox labels. The boxes are drawn in the right place only about 60% of the time, both with split_by_rand_pct() and with split_none() (keep in mind this has nothing to do with machine learning; it should be 100%, since it is just loading the data). So the splitting does not seem to be the cause. I suspected my list of bbox coordinates was in a bad order and was getting shuffled during the splitting stage, but these tests suggest that is not the case.

However, this does suggest there is an error either in my list of images or in my list of bbox label coordinates. If anyone could point me in the right direction as to which of the two it is, that would be great. Thanks.
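In case it helps, this is the kind of alignment check I have in mind (a sketch with hypothetical names: img_names and bbox_entries stand in for the two parallel lists I zip together to build the lookup behind get_y_func, and items is the ObjectItemList from the earlier check):

# Hypothetical names: img_names / bbox_entries are the parallel annotation lists
# used to build the name -> (bboxes, classes) lookup for get_y_func.
print(len(img_names), 'annotated names vs', len(bbox_entries), 'bbox entries')

disk_names = set(o.name for o in items.items)   # files fastai actually loads
ann_names  = set(img_names)
print('on disk but not annotated:', sorted(disk_names - ann_names)[:10])
print('annotated but not on disk:', sorted(ann_names - disk_names)[:10])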

UPDATE: some of my bbox tags had been deleted, which shifted the rest of my data out of alignment. Fixing this solved my problem.