Weighted dataloaders with TabularPandas

bilalUWE · November 15, 2022, 9:12pm

Hi,

I am working on a healthcare problem that involves training an ML model on a tabular dataset. The data has a severe class imbalance, as shown below:

The y-axis represents the level of cardiac rejection in transplant patients. There are far more records in class 0 than in the other classes. The learner struggles to predict the rare classes accurately, as shown in the confusion matrix below:

Can we use weighted dataloaders with TabularPandas, the way we can for vision problems? I have tried it, but I am getting the following error:

Any ideas about what I am missing here?

Many thanks and
Kind regards,
Bilal

Archaeologist · November 17, 2022, 1:28pm

Not sure if I understood your code correctly, or if this is related to the error at all, but I think the weighted dataloader expects weights for the training dataset only. Is your wgts list perhaps too long, since it seems to be computed from the entire dataframe?
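To illustrate what I mean, here is a minimal sketch in plain Python, not your actual code. The toy labels, the train_idx/valid_idx split, and the helper inverse_frequency_weights are all made up for the example, and inverse class frequency is only one possible weighting scheme. The point is just that the weights are built from the training rows, so the list has one entry per training row rather than one per dataframe row.

```python
from collections import Counter


def inverse_frequency_weights(train_labels):
    """Return one sampling weight per training row (hypothetical helper).

    Each weight is 1 / (number of training rows with the same label),
    so every class ends up with the same total weight.
    """
    counts = Counter(train_labels)
    return [1.0 / counts[y] for y in train_labels]


# Made-up labels for the whole dataframe, with class 0 dominating.
all_labels = [0, 0, 0, 0, 0, 0, 1, 1, 2, 0]

# Made-up split: row indices used for training and validation.
train_idx = [0, 1, 2, 3, 6, 7, 8]
valid_idx = [4, 5, 9]

# Compute the weights from the training rows only.
train_labels = [all_labels[i] for i in train_idx]
wgts = inverse_frequency_weights(train_labels)

# One weight per training row, not one per row of the full dataframe.
print(len(wgts), len(train_idx), len(all_labels))
```

If the weights were computed from all_labels instead, the list would have 10 entries for 7 training rows, which is the kind of length mismatch I suspect here.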

A notebook that helped me to understand the topic better is here:

bilalUWE · November 18, 2022, 7:22am

Hi Jurgen,

Thanks for the tip.

Surprisingly, I used the same approach with vision and it worked.

I will look into this and report back.

Many thanks and

Best regards,
Bilal