data.classes does not match the number of label categories

This is the Kaggle image classification exercise I’m doing:
kaggle

This is my code:

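from fastai.vision import *  # fastai v1: ImageList, get_transforms, imagenet_stats
import pandas as pd

# data_dir is a pathlib.Path to the competition data (defined elsewhere)
train_df = pd.read_csv(data_dir/'train.csv')
test_df = pd.read_csv(data_dir/'sample_submission.csv')

# Number of distinct labels in the full training CSV
cls = train_df['Id'].unique()
print(len(cls))

test_images = ImageList.from_df(test_df, path=data_dir/'train', folder='test')
data = (ImageList.from_df(train_df, path=data_dir/'train', folder='train')
        .split_by_rand_pct()           # random train/validation split
        .label_from_df()
        .add_test(test_images)
        .transform(get_transforms(), size=299)
        .databunch()
        .normalize(imagenet_stats))

# Number of classes the DataBunch actually ended up with
print(data.c)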

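According to the DataFrame's unique method, there should be 4251 classes, but the number of classes in the data I created is different. This seems to be why the number of output channels in the final preds does not match the number of classes.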

Can someone explain why the number of categories differs from data.c?
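
One thing that might be worth checking (this is only a guess, not something confirmed above): split_by_rand_pct() holds out a random part of the rows for validation (20% by default, as far as I know), so if many classes have only one image, some of them may never appear in the training part at all. The sketch below is plain Python with made-up labels, not the real train.csv and not fastai itself; the names `labels`, `n_total`, and `n_train` are hypothetical. It only shows how the number of distinct labels can drop after a random split.

```python
import random

random.seed(0)  # fixed seed so the split is reproducible

# Hypothetical labels: 50 common classes with 20 images each,
# plus 200 rare classes with a single image each.
labels = [f"common_{i}" for i in range(50) for _ in range(20)]
labels += [f"rare_{i}" for i in range(200)]

# Distinct labels in the full table (the analogue of train_df['Id'].unique()).
n_total = len(set(labels))

# Random 80/20 split, similar in spirit to a random-percentage split.
indices = list(range(len(labels)))
random.shuffle(indices)
n_valid = int(0.2 * len(indices))
train_labels = [labels[i] for i in indices[n_valid:]]

# Distinct labels that survive in the training part only.
n_train = len(set(train_labels))

print("classes in full table:   ", n_total)
print("classes in training part:", n_train)
```

If the same thing is happening here, comparing len(cls) with the number of distinct Id values among the training rows after the split should show a similar gap.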