Hello,
my name is Sascha and I am a newbie.
I first want to say thanks for this great software and the great course videos.
I'm trying to use the tabular learner for the first time, without success.
My problem is that the accuracy does not improve beyond 59%.
Data: I use the predictions from 5 different models as input.
I have 5 columns with values between 0 and 1,
and a 6th column containing the target (Y).
The dataset has 220k rows.
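Since the inputs are already model predictions, a simple reference point is to average the 5 scores per row and threshold the mean. If that lands near the same accuracy, the limit is probably in the inputs rather than in the network. A minimal sketch, assuming the rows are available as plain lists; the function name and the toy values are made up for illustration and are not taken from the real dataset:

```python
from typing import Sequence


def mean_ensemble_accuracy(
    rows: Sequence[Sequence[float]],
    targets: Sequence[int],
    threshold: float = 0.5,
) -> float:
    """Average the model scores in each row, threshold the mean, and
    return the fraction of rows where that prediction matches the target."""
    if not rows or len(rows) != len(targets):
        raise ValueError("rows and targets must be non-empty and equally long")
    correct = 0
    for scores, y in zip(rows, targets):
        pred = 1 if sum(scores) / len(scores) >= threshold else 0
        correct += int(pred == y)
    return correct / len(targets)


# Toy data (hypothetical): 4 rows with 5 model scores each.
toy_rows = [
    [0.9, 0.8, 0.7, 0.6, 0.9],
    [0.1, 0.2, 0.1, 0.3, 0.2],
    [0.6, 0.4, 0.7, 0.5, 0.8],
    [0.2, 0.1, 0.4, 0.3, 0.1],
]
toy_targets = [1, 0, 0, 0]

toy_accuracy = mean_ensemble_accuracy(toy_rows, toy_targets)
print(toy_accuracy)
```

On the real data the rows would come from the 5 prediction columns and the targets from the label column.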
Code used:
from fastai.tabular import *
dep_var = 'label'
cont_names = ['label_1', 'label_2', 'label_3', 'label_4', 'label_5']
data = (TabularList.from_df(ptrain, cont_names=cont_names)
        .split_by_rand_pct(seed=78)
        .label_from_df(cols=dep_var)
        .add_test(TabularList.from_df(ptest, cont_names=cont_names))
        .databunch())
learntab = tabular_learner(data, layers=[100, 200, 300], emb_drop=0., metrics=accuracy)
from fastai.callbacks import *
learntab.fit_one_cycle(3, 1e-2, callbacks=[
    SaveModelCallback(learntab, monitor='accuracy', mode='max'),
    CSVLogger(learntab, filename='ensemble'),
])
|epoch|train_loss|valid_loss|accuracy|time|
|---|---|---|---|---|
|0|0.679971|0.681155|0.593958|00:39|
|1|0.678074|116.269554|0.595208|00:46|
|2|0.672019|1.463650|0.596004|00:46|
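In the log above, valid_loss jumps to 116.27 in epoch 1 while accuracy barely changes. One possible reading (not a diagnosis) is that a few validation rows get extremely confident wrong predictions: cross-entropy grows without bound with the size of the wrong logit, whereas accuracy only counts which side of the threshold a prediction falls on. A minimal sketch of that effect, assuming a two-class problem expressed as a single logit per row; all names and numbers are hypothetical:

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def bce_from_logit(logit: float, y: int) -> float:
    """Binary cross-entropy for one row, given the logit for class 1."""
    return softplus(-logit) if y == 1 else softplus(logit)


def mean_loss_and_accuracy(logits, targets):
    """Return (mean cross-entropy, accuracy) for a batch of logits."""
    losses = [bce_from_logit(z, y) for z, y in zip(logits, targets)]
    preds = [1 if z > 0 else 0 for z in logits]
    acc = sum(p == y for p, y in zip(preds, targets)) / len(targets)
    return sum(losses) / len(losses), acc


targets = [1, 0, 1, 0]
mild = [2.0, -2.0, -1.0, -2.0]       # third row is wrong, mildly
extreme = [2.0, -2.0, -500.0, -2.0]  # same mistake, extreme confidence

mild_loss, mild_acc = mean_loss_and_accuracy(mild, targets)
extreme_loss, extreme_acc = mean_loss_and_accuracy(extreme, targets)
print(mild_loss, mild_acc)
print(extreme_loss, extreme_acc)
```

Both batches make the same single mistake, so the accuracy is identical, but the mean loss differs by orders of magnitude.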
data.show_batch()
|label_1|label_2|label_3|label_4|label_5|target|
|---|---|---|---|---|---|
|0.0000|0.1037|0.7720|0.1834|0.0008|0|
|0.9990|0.4025|0.4366|0.0093|0.0012|1|
|0.4017|0.0004|0.1756|0.2037|0.0066|0|
|0.9954|0.0009|0.2378|0.0168|0.0049|0|
|0.0068|0.0108|0.1024|0.1048|0.0053|0|
(cat_x,cont_x),y = next(iter(data.train_dl))
for o in (cat_x, cont_x, y): print(to_np(o[:5]))
[0 0 0 0 0]
[[2.129060e-02 2.522550e-06 4.058339e-02 1.559591e-01 1.870525e-03]
 [9.996858e-01 2.867426e-03 8.365842e-01 3.037727e-02 6.861029e-03]
 [9.870045e-01 9.996849e-01 3.015203e-02 3.265378e-02 4.049588e-03]
 [2.546746e-02 9.881952e-01 3.170503e-02 9.409481e-02 5.366698e-03]
 [9.984887e-01 1.000000e+00 5.436885e-01 1.209527e-02 6.735846e-03]]
[0 0 0 1 0]
learntab.data.valid_ds.items
array([0, 1, 2, 3, ..., 43989, 43990, 43991, 43992], dtype=object)
learntab.data.train_ds.items
array([0, 1, 2, 3, ..., 175971, 175972, 175973, 175974], dtype=object)
I use Kaggle kernels
fastai version 1.0.50.post1
I tried different layer structures, from [10, 20] to [100, 200, 300],
different dropout values, and with/without batch normalisation,
and I always got the same accuracy (59%).
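Because the accuracy stays the same regardless of the architecture, one sanity check is whether 59% is simply the share of the most frequent class in the target column. A model that always predicts that class would score exactly that. A minimal sketch, assuming the labels can be pulled out as a plain list (for example from ptrain[dep_var]); the helper name and the toy labels are hypothetical:

```python
from collections import Counter


def majority_class_baseline(labels):
    """Return (most frequent label, accuracy of always predicting it)."""
    if not labels:
        raise ValueError("labels must not be empty")
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


# Toy labels (hypothetical): 3 of 5 rows belong to class 0.
toy_labels = [0, 0, 0, 1, 1]
baseline_label, baseline_acc = majority_class_baseline(toy_labels)
print(baseline_label, baseline_acc)
```

If the value for the real labels is close to 0.59, the network may have learned little beyond the class prior.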
Thanks for your help,
Sascha