Accuracy_multi = 0.000000

Hello,

I’m new to deep learning and am trying to write some code to better understand how it all works. I have a tabular file of about 190,000 lines that looks like this:

Name AA SS A C D E
Name1 G C
Name2 N E
Name3 K C
Name4 H H
Name5 D H 0.6127
Name6 D H 0.451613
Name7 E E 0.393627
Name8 E C 0.617496
Name9 H E
Name10 R C
Name11 Q C
Name12 S H
Name13 H C
Name14 N H
Name15 Q E
Name16 K H
Name17 Q E
Name18 T C
Name19 T C
Name20 D H 0.392787

As you can see, I have a lot of missing data. I would like to make a model to predict the values of columns A, C, D and E based on columns AA and SS.

Here is my snippet of code:

from fastai.tabular.all import *

df = pd.read_table('Results.tsv')

splits = RandomSplitter(valid_pct=0.2)(range_of(df))

to = TabularPandas(df, procs=[Categorify, FillMissing, Normalize],
                   cat_names=['AA', 'SS'],
                   y_names=['A', 'C', 'D', 'E'],
                   splits=splits)

dls = to.dataloaders(bs=64)

learn = tabular_learner(dls, metrics=[accuracy_multi], loss_func = BCEWithLogitsLossFlat())

learn.fit_one_cycle(1)

Here is what I get:

epoch     train_loss  valid_loss  accuracy_multi  time    
0         nan         nan         0.000000        00:01  

Do you have any idea what the problem is?

Thanks

Hi!

Just a little bump, please :slight_smile:

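The Results.tsv training data needs to have the target columns A, C, D and E populated.
The learner can only learn from rows that provide both the inputs and the outputs. As far as I can tell, FillMissing only fills continuous input columns, not the y_names columns, so the missing targets reach the loss function as NaN, which is why train_loss and valid_loss come out as nan.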
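As a rough illustration, here is a minimal pandas sketch of how you could check for missing targets and keep only the fully populated rows before building TabularPandas. The sample data is made up to mimic the layout of Results.tsv, and the names sample_tsv, target_cols and df_train are just placeholders, not anything from your file.

```python
import io
import pandas as pd

# Hypothetical sample mimicking the layout of Results.tsv; the values are made up.
sample_tsv = (
    "Name\tAA\tSS\tA\tC\tD\tE\n"
    "Name1\tG\tC\t\t\t\t\n"
    "Name2\tD\tH\t0.61\t0.12\t0.40\t0.33\n"
    "Name3\tE\tE\t0.39\t\t0.25\t0.10\n"
    "Name4\tE\tC\t0.62\t0.45\t0.05\t0.71\n"
)
df = pd.read_table(io.StringIO(sample_tsv))

target_cols = ["A", "C", "D", "E"]

# Count how many values are missing in each target column.
missing_per_target = df[target_cols].isna().sum()

# Keep only the rows where every target column is populated.
df_train = df.dropna(subset=target_cols).reset_index(drop=True)

print(missing_per_target)
print(len(df), "rows before,", len(df_train), "rows after")
```

With the real file you would replace the sample with pd.read_table('Results.tsv') and pass df_train to TabularPandas instead of df. If very few rows survive the filter, that is a sign the data itself does not contain enough labelled examples for all four targets.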

Once the model is trained, you can get predictions by providing only the AA and SS columns.