Difference in the sequence of ImageList and databunch

champs.jaideep · August 9, 2019, 3:32am

Hi,
I ran an experiment comparing the order of items fetched by the databunch with the order in which open is called with the file paths (fn). The results are confusing me.
Below is the output I get when I build the databunch. I put a print(fn) in the open function of ImageList.
src = (
    CustomImageList.from_df(df_train, PATH, cols='id_code', folder='train_images', suffix='.png')
    .split_by_idx(val_idx)
    .label_from_df(cols=1, label_cls=FloatList)
)

data = (
    src.transform(tfms, size=sz, resize_method=ResizeMethod.SQUISH, padding_mode='zeros')
    .databunch(bs=bs)
    .normalize(imagenet_stats)
)
fn ../input/aptos2019-blindness-detection/train_images/000c1434d8d7.png
fn ../input/aptos2019-blindness-detection/train_images/001639a390f0.png
fn ../input/aptos2019-blindness-detection/train_images/000c1434d8d7.png
fn ../input/aptos2019-blindness-detection/train_images/2ecbc2e3f239.png
fn ../input/retinopathy-train-2015/rescaled_train_896/rescaled_train_896/5695_right.png
df_train[df_train.id_code.isin(['000c1434d8d7','001639a390f0','000c1434d8d7','2ecbc2e3f239','5695_right'])]

0	000c1434d8d7	2.0
1	001639a390f0	4.0
647	2ecbc2e3f239	1.0
4996	5695_right	1.0
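
One caveat about the lookup above: `isin` returns rows in DataFrame order and ignores repeats in the list, so the table is not in the order the files were printed. Here is a minimal sketch of an order-preserving lookup, assuming pandas and a toy DataFrame holding only the four rows shown. The names `df_toy` and `diagnosis` are hypothetical stand-ins (the post only refers to the label column as column 1).

```python
import pandas as pd

# Toy stand-in for df_train: only the four rows shown above.
# "diagnosis" is an assumed column name; the post only calls it column 1.
df_toy = pd.DataFrame(
    {
        "id_code": ["000c1434d8d7", "001639a390f0", "2ecbc2e3f239", "5695_right"],
        "diagnosis": [2.0, 4.0, 1.0, 1.0],
    },
    index=[0, 1, 647, 4996],
)

# File stems in the order open() printed them (the first id appears twice).
printed = ["000c1434d8d7", "001639a390f0", "000c1434d8d7", "2ecbc2e3f239", "5695_right"]

# isin() keeps DataFrame row order and collapses the repeated id.
isin_labels = df_toy[df_toy.id_code.isin(printed)].diagnosis.tolist()

# Indexing by id_code keeps the printed order, including the repeat.
ordered_labels = df_toy.set_index("id_code").loc[printed, "diagnosis"].tolist()

print(isin_labels)     # labels in DataFrame order
print(ordered_labels)  # labels in the order the files were opened
```

This only lines the labels up with the print order. It does not say whether the files opened while building the databunch correspond to the first items of train_ds.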

Label sequence from data.train_ds.y.items[0:5]:
array([2., 1., 0., 0., 4.], dtype=float32)

The sequence of labels in the first case is 2, 4, 1, 1, which is different from what y.items returns. Shouldn't both be the same?