How to use label_from_df in the data block API

Hi,

I have a dataframe that has a list of filenames and a target column but I can’t seem to figure out how to use label_from_df.

Is the filename in the df supposed to be the full path, or just the part that comes after “path”?
Is the filename in the df supposed to have the filename extension or just the part without the extension (“suffix”)?
Can label_from_df be used for regression or only for classification?

This is my df:
[screenshot of the dataframe]

This is how I am trying to call label_from_df:

path = Path('/home/matt/Dropbox (Centosette)/transfer/regression/test1/images')
data = (ImageFileList.from_folder(path)
        .label_from_df(df, 'filename', 'y_coordinate')
       )

It’s a good thing no one answered this yesterday, because however it’s supposed to be used, it’s changed since then!

I’m making some progress with this code:

data = (ImageItemList.from_df(df, path, col='filename', suffix=".png")
        .random_split_by_pct()
        .label_from_df('y_coordinate')
        .databunch()
       )

My dataframe has 2 columns: one named filename and one named y_coordinate, which holds my regression target.
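
On the path and extension questions: the from_df call above passes path and suffix separately, which suggests the filename column holds names relative to path and without the extension. Here is a minimal sketch of that reading in plain Python. resolve_image_path is a hypothetical helper, not a fastai function, and the folder and file names are invented for illustration.

```python
from pathlib import Path

def resolve_image_path(root, filename, suffix=""):
    """Hypothetical helper (not part of fastai): build the file path from
    a root folder, a filename taken from the dataframe, and an optional suffix."""
    # Assumption: the suffix is simply appended to the filename,
    # and the result is looked up under the root folder.
    return Path(root) / f"{filename}{suffix}"

root = Path("images")      # stand-in for the real `path`
filename = "img_001"       # invented value from the 'filename' column
full_path = resolve_image_path(root, filename, suffix=".png")
print(full_path)
```

If the filenames in the dataframe already include the extension, the same helper works with the default empty suffix.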


@source99, AFAIK label_from_df is meant for classification.

I was able to get it working for regression because my target column is of type float.
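
For anyone whose target column was read in as integers, here is a small sketch of casting it to float with pandas before building the data. The dataframe contents are invented, and whether the library needs float targets to treat the labels as regression is an assumption based on the behaviour reported in this thread.

```python
import pandas as pd

# Invented example data with the same two columns as in this thread.
df = pd.DataFrame({
    "filename": ["img_001", "img_002", "img_003"],
    "y_coordinate": [12, 40, 27],   # integers as loaded, e.g. from a CSV
})

# Cast the target column to float so it is not mistaken for class labels
# (assumption: the labelling step looks at the target's type).
df["y_coordinate"] = df["y_coordinate"].astype("float64")

print(df.dtypes)
```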

@source, ah, good to know! Thanks for sharing.
I hope they clean up the terminology in docs.fast.ai: label/class/category, and now target.