I have a DataFrame df. When I run df.head(), I get the following:
         name                             labels
0  000001.jpg          short-sleeve-top;trousers
1  000002.jpg  short-sleeve-top;short-sleeve-top
2  000003.jpg                  long-sleeve-dress
3  000004.jpg                  long-sleeve-dress
4  000005.jpg                  long-sleeve-dress
But when I do:
src = (ImageList.from_df(df, path, folder='train')
       .split_by_rand_pct(0.2)
       .label_from_df(cols='labels', label_delim=';')
       .databunch().normalize(imagenet_stats)
      )
I get the following error: UserWarning: There seems to be something wrong with your dataset, for example, in the first batch can't access any element of self.train_ds. Tried: 25411,63210,83328,49358,42785...
That is because those elements do not exist; the real names would all have leading zeros, i.e. 025411, 063210, 083328, 049358, 042785.
How do I stop label_from_df from stripping leading zeros? I’ve tried adding converters={'name': lambda x: str(x)} to df = pd.read_csv(path/"labels.csv"), and the df still shows the image names with their leading zeros, but I get the same error. So the zeros seem to be stripped somewhere between the df and label_from_df, or by from_df.
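To rule out the CSV-reading step, a minimal check like the sketch below may help. It assumes labels.csv has the two columns shown in df.head(); the inline csv_text is a hypothetical stand-in for the real file, and dtype={"name": str} is used here in place of the converters argument, as another way to ask pandas to keep the column as text.

```python
import io

import pandas as pd

# Hypothetical stand-in for labels.csv, built from the rows shown in df.head().
csv_text = """name,labels
000001.jpg,short-sleeve-top;trousers
000002.jpg,short-sleeve-top;short-sleeve-top
000003.jpg,long-sleeve-dress
"""

# dtype={"name": str} asks pandas to read the column as text,
# so the values are never parsed as numbers.
df = pd.read_csv(io.StringIO(csv_text), dtype={"name": str})

# Confirm every name is a Python string and still has its leading zeros.
names = df["name"].tolist()
all_text = all(isinstance(n, str) for n in names)
print(names, all_text)
```

If the names come through intact here, as they appear to in the df above, the CSV step is probably not where anything gets changed.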