Hi,
My dataset's labels have 5 distinct classes, but data.c returns only 1. What might be the issue?
Could you be a bit more specific? I presume that `data` here is a `DataBunch`. Could you show the steps you used to create it from your data? And in what form is your data? Also, what kind of problem are you working on?
src = (
    CustomImageList.from_df(image_boost_df, PATH, cols='id_code', folder='train_images', suffix='.png')
    .split_by_idx(val_idx)
    .label_from_df(cols='diagnosis')
)
The labels are 0, 1, 2, 3 and 4.
The DataFrame has two columns: id_code and diagnosis, which is the label column.
LabelLists;
Train: LabelList (4256 items)
x: CustomImageList
Image (3, 1024, 1024),Image (3, 1024, 1024),Image (3, 1024, 1024),Image (3, 1024, 1024),Image (3, 1024, 1024)
y: FloatList
4.0,2.0,3.0,2.0,0.0
Path: …/input;
Valid: LabelList (550 items)
x: CustomImageList
Image (3, 1024, 1024),Image (3, 1024, 1024),Image (3, 1024, 1024),Image (3, 1024, 1024),Image (3, 1024, 1024)
y: FloatList
0.0,2.0,2.0,0.0,4.0
I resolved it.
The diagnosis column was being read as float. I changed it to long (integer), and now the labels are a CategoryList.
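For anyone hitting the same thing, here is a minimal sketch of that cast, assuming the labels live in a pandas DataFrame. The small `df` below is a hypothetical stand-in for `image_boost_df`, with made-up `id_code` values.

```python
import pandas as pd

# Hypothetical stand-in for image_boost_df: labels stored as floats.
df = pd.DataFrame({
    "id_code": ["img_a", "img_b", "img_c"],
    "diagnosis": [4.0, 2.0, 0.0],
})
print(df["diagnosis"].dtype)  # float dtype before the cast

# Cast the label column to a 64-bit integer so the labels are no longer floats.
df["diagnosis"] = df["diagnosis"].astype("int64")
print(df["diagnosis"].dtype)  # integer dtype after the cast
```

The cast has to happen before the DataFrame is passed to `from_df`, since the label type is read when the labels are created.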
Cool!
Not sure it was going to help, but one thought was that I didn't see in your code how fastai knows that `diagnosis` is a categorical variable. If it is a float, then it would make sense to consider the variable continuous, and categorical otherwise. From the documentation, it looks like fastai infers whether a variable is continuous or categorical from its type.
In particular I was going to mention the following paragraph:
`_label_cls` is the first to be used in the data block API, in the labelling function. If this variable is set to `None`, the label class will be set to `CategoryList`, `MultiCategoryList` or `FloatList` depending on the type of the first item. The default can be overridden by passing a `label_cls` in the kwargs of the labelling function.
Perhaps more to the point, from the documentation:
Behind the scenes, `ItemList.get_label_cls` basically selects a label class according to the item type of `labels`, where `labels` can be any of `Collection`, `pandas.core.frame.DataFrame`, `pandas.core.series.Series`. If the list elements are of type string or integer, `get_label_cls` will output `CategoryList`; if they are of type float, then it will output `FloatList`; if they are of type Collection, then it will output `MultiCategoryList`.
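To make that rule concrete, here is a rough pure-Python sketch of the type check described in the quote. It is not fastai's actual code: `guess_label_cls` is a hypothetical helper, it returns class names as plain strings, and the final fallback is my own assumption.

```python
from collections.abc import Collection


def guess_label_cls(labels):
    """Return the name of the label class suggested by the first label.

    Simplified illustration of the documented rule, not fastai's implementation.
    """
    first = labels[0]
    # Strings and integers are treated as categories. This check comes
    # before the Collection check because a str is itself a Collection.
    if isinstance(first, (str, int)):
        return "CategoryList"
    # Floats are treated as a continuous (regression) target.
    if isinstance(first, float):
        return "FloatList"
    # A collection of labels per item means multi-label classification.
    if isinstance(first, Collection):
        return "MultiCategoryList"
    return "ItemList"  # assumed fallback for this sketch


print(guess_label_cls([4.0, 2.0, 3.0]))      # float labels, as in the original post
print(guess_label_cls([4, 2, 3]))            # the same labels after casting to int
print(guess_label_cls([["a", "b"], ["c"]]))  # several labels per item
```

With float labels the sketch picks `FloatList`, which matches the `y: FloatList` shown in the output above; after the integer cast it picks `CategoryList`.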