Confused about ImageClassifierData

Hey guys, I’m really confused about how ImageClassifierData works.

I’ve tried implementing my own version of the pascal-multi notebook using my own labeled dataset.

I generated the MC_CSV correctly and compared it to the one originally generated from the notebook.

md = ImageClassifierData.from_csv(PATH, JPEGS, MC_CSV, tfms=tfms, bs=bs)

I get as far as concatenating the two data sets together using

trn_ds2 = ConcatLblDataset(md.trn_ds, trn_mcs)
val_ds2 = ConcatLblDataset(md.val_ds, val_mcs)
md.trn_dl.dataset = trn_ds2
md.val_dl.dataset = val_ds2

All of this runs without errors. The trouble starts in the next part of the notebook, where I use my newly created data in new directories.

When I try to use

x,y=to_np(next(iter(md.trn_dl)))
x=md.trn_ds.ds.denorm(x)

Something goes wrong when iterating over md.trn_dl. For some reason, y is not generated correctly. I’m supposed to get a list with two numpy arrays of shape:

(64, 48)
(64, 12)

Instead, I get

(64, 16)
(64, 1)
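To make the comparison above easy to repeat, here is a minimal sketch of a helper that lists the shape of every array inside a nested batch. The name batch_shapes and the FakeArray stand-in are hypothetical, not part of fastai; the helper only assumes each array exposes a .shape attribute, as numpy arrays do.

```python
def batch_shapes(batch):
    """Recursively collect .shape from array-like objects nested in lists/tuples."""
    if isinstance(batch, (list, tuple)):
        return [batch_shapes(item) for item in batch]
    # Anything without a .shape attribute is reported as an empty tuple.
    return tuple(getattr(batch, "shape", ()))


class FakeArray:
    """Hypothetical stand-in for a numpy array, so the sketch runs without numpy."""
    def __init__(self, *shape):
        self.shape = shape


# Demo data mirroring the shapes I expected for y.
y_demo = [FakeArray(64, 48), FakeArray(64, 12)]
print(batch_shapes(y_demo))
```

In the notebook this would be called on the real batch, e.g. print(batch_shapes(y)) right after x,y=to_np(next(iter(md.trn_dl))).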

Now I’m starting to think something is hard-coded in the from_csv method, such as a directory with an index of files. If so, the fastai code would not be robust to new directories. Has anyone gotten this notebook working with their own set of labeled images in their own format?

I’ve had it working previously… Could you share a gist? What debugging have you tried?
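One quick check that might help, as a rough sketch: count how many labels each row of the CSV carries and compare that summary between your file and the one the notebook generates. This assumes the CSV has a header row and two columns, a file name followed by space-separated labels; if your layout differs, adjust accordingly. The function name labels_per_row and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def labels_per_row(csv_file):
    """Map 'number of labels in a row' to 'number of rows with that many labels'."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row (assumed to be present)
    counts = Counter()
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed rows
        counts[len(row[1].split())] += 1
    return counts


# Hypothetical sample in the assumed layout: file name, then space-separated labels.
sample = io.StringIO(
    "fn,clas\n"
    "000012.jpg,car\n"
    "000017.jpg,person horse\n"
    "000023.jpg,bicycle person person\n"
)
summary = labels_per_row(sample)
print(summary)
```

For a real file you would pass open(path, newline="") instead of the StringIO sample. If the two summaries differ a lot, that is a place to look before suspecting the library.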

Have you checked the source code for this?