# Lesson 5: using a simple Sequential model in Tabular Learner | take 2

Hi all,

to practice and get a deeper understanding, i tried to use a simple Sequential model in a tabular learner
this time i used the Titanic dataset with data cleaning and minimal feature engineering

my code is available in a colab here: Google Colab

from pathlib import Path
from fastai.tabular.all import *

modes = df.mode().iloc[0]
df.fillna(modes, inplace=True)

df['LogFare'] = np.log(df['Fare']+1)

df = pd.get_dummies(df, columns=["Sex","Pclass","Embarked"])

splits = RandomSplitter(seed=42)(df)

dls = TabularPandas(
    df, splits=splits, procs=[Normalize],
    cat_names=[],
    cont_names=['Age', 'SibSp', 'Parch', 'LogFare', 'Sex_female', 'Sex_male', 'Pclass_1', 'Pclass_2', 'Pclass_3', 'Embarked_C', 'Embarked_Q', 'Embarked_S'],
    y_names='Survived', y_block=CategoryBlock()
).dataloaders()  # turns the TabularPandas object into the DataLoaders that Learner expects

import torch.nn as nn

class NNet(nn.Module):
    def __init__(self):
        super(NNet, self).__init__()
        self.nnet = nn.Sequential(
            nn.Linear(12,10),
            nn.ReLU(),
            nn.Linear(10,10),
            nn.ReLU(),
            nn.Linear(10,1),
            nn.Sigmoid()
        )
    # the tabular DataLoaders pass (x_cat, x_cont); the categorical part is unused here
    def forward(self, _, x):
        return self.nnet(x.view(-1,12))

model = NNet()

learn = Learner(dls, model=model, metrics=accuracy, loss_func=BCELossFlat(), cbs=ShowGraphCallback())

learn.fit(10, lr=0.03)

# i got this
epoch 	train_loss 	valid_loss 	accuracy 	time
0 	0.403331 	0.412062 	0.595506 	00:00
1 	0.397318 	0.387288 	0.595506 	00:00
2 	0.392440 	0.387936 	0.595506 	00:00
3 	0.389725 	0.391229 	0.595506 	00:00
4 	0.387327 	0.384920 	0.595506 	00:00
5 	0.386091 	0.381721 	0.595506 	00:00
6 	0.384088 	0.387873 	0.595506 	00:00
7 	0.382418 	0.382825 	0.595506 	00:00
8 	0.378637 	0.387598 	0.595506 	00:00
9 	0.377786 	0.384555 	0.595506 	00:00

the code runs without errors, but there must be an issue somewhere because the loss doesn't improve much and the accuracy doesn't change at all

what am i doing wrong? what is the issue with my experiment?

thank you!

@muellerzr @benkarr any hints for me please?

Your metric does not fit the output your model produces. If you have a look at accuracy?? you'll see that it takes the argmax of the predictions. Since your last layer has only one neuron, the output's shape is batch_size x 1 and argmax will always return 0 (hence you observe the same accuracy after each epoch). You can try to fix this by adjusting

• the metric:
def my_accuracy(inp, targ, axis=-1):
    # threshold the sigmoid output at 0.5 instead of taking the argmax
    pred,targ = flatten_check(inp > 0.5, targ)
    return (pred == targ).float().mean()

learn = Learner(dls, model=model, metrics=my_accuracy, loss_func=BCELossFlat(), cbs=ShowGraphCallback())
• or the model:
self.nnet = nn.Sequential(
    nn.Linear(12,10),
    nn.ReLU(),
    nn.Linear(10,10),
    nn.ReLU(),
    #nn.Linear(10,1),
    #nn.Sigmoid()
    nn.Linear(10,2), ## one node for each category
    ## no nn.Softmax() here: CrossEntropyLossFlat applies log-softmax to the raw outputs itself
)
## use appropriate `loss_func`
learn = Learner(dls, model=model, metrics=accuracy, loss_func=CrossEntropyLossFlat(), cbs=ShowGraphCallback())
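
To make the argmax problem concrete, here is a minimal sketch in plain PyTorch. The prediction and target values are made up for illustration, and the argmax line only mirrors what `accuracy` does in essence; it is not a copy of the fastai source.

```python
import torch

# Hypothetical sigmoid outputs for a batch of 4 passengers, shape (4, 1),
# like those of a model ending in nn.Linear(10, 1) + nn.Sigmoid().
preds = torch.tensor([[0.9], [0.2], [0.7], [0.1]])
targs = torch.tensor([1, 0, 1, 0])

# argmax over the last axis: that axis has size 1, so the index of the
# maximum is 0 for every row, whatever the probability is.
argmax_preds = preds.argmax(dim=-1)
print(argmax_preds)  # tensor([0, 0, 0, 0])

# The resulting "accuracy" is just the share of targets equal to 0.
argmax_acc = (argmax_preds == targs).float().mean().item()

# Thresholding the probability at 0.5 gives the intended prediction.
thresh_preds = (preds > 0.5).view(-1).long()
thresh_acc = (thresh_preds == targs).float().mean().item()

print(argmax_acc, thresh_acc)  # 0.5 1.0
```

If that reading is right, the constant 0.595506 in the training log is presumably just the fraction of validation rows with Survived == 0.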

Anyways: please read up on the Forum etiquette regarding @ mentioning random forum members.

thank you for your help! i am sorry for mentioning you; now i am aware that this is against forum etiquette; i just felt lost, stuck and desperate because after a week no one had answered my help request
it won't happen again!

your custom accuracy function makes my accuracy value feedback work but the situation is not clear and i feel confused
my model with a single output neuron tries to solve a simple binary classification problem (Titanic), so why do i need to create a custom function for accuracy? why doesn't fast.ai do this out of the box as usual?
why does the default accuracy use argmax, which is used for multi-class classification?
also why should i add 2 neurons and softmax to my model for solving binary classification?
what am i thinking wrong?

Yeah, I kind of get that, so no worries. Just try to use that superpower of summoning people responsibly.

I'm actually not sure, but I would guess that it is a design choice of the library. It seems that single-label classification is assumed to use cross entropy, while binary cross entropy is used for multi-label classification, so the metrics for these tasks come with particular assumptions.
No library can be prepared to solve every problem in every possible way, so you sometimes have to either:

• reformulate your problem (use two output neurons instead of one) or
• adjust the solution (change the metric).

Well, binary means two, so you actually have two labels: "survived" and "not survived" (binary classification is just a special case of multi-class classification).

Let's take the one-output-neuron network:

...
nn.Linear(10,1),
nn.Sigmoid()

The output of the linear layer is some number, and sigmoid squashes that number into the range between 0 and 1. Values in that range can be interpreted as probabilities: if the output of the whole network is p, we can read it as "the probability that this instance has the label 'survived' is p". But this implicitly gives a second value, namely the probability of the instance having the label "not survived", which is 1-p.

Now let's have a look at the two-output-neuron network:

...
nn.Linear(10,2),
nn.Softmax()

The outputs of the linear layer are two values, and softmax squashes them into the range between 0 and 1 such that they sum to one. (With a cross-entropy loss the softmax is applied inside the loss, so in practice the layer is usually left out of the model itself.) These values can again be interpreted as probabilities, where the first value p_0 gives the probability that the instance has the label "not survived" (0) and the second value p_1 gives the probability that the instance has the label "survived" (1). Since softmax makes sure that they sum to 1, we have:

$$1 = p_0 + p_1 \iff p_0 = 1 - p_1$$

and you might see that both networks predict exactly the same things, only that one predicts the second value explicitly rather than implicitly: it is just a reformulation of the problem and a choice of implementation.
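
The equivalence can also be checked numerically. The sketch below uses plain Python; the two raw scores are made-up values, and the helper functions `sigmoid` and `softmax` are written out here for illustration rather than taken from any library. It relies on the identity that a softmax over two scores z_0 and z_1 gives the same p_1 as a sigmoid applied to the difference z_1 - z_0.

```python
import math

def sigmoid(z):
    """Squash a single raw score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores of a two-output network for one passenger:
# z0 for "not survived", z1 for "survived".
z0, z1 = -0.3, 1.2
p0, p1 = softmax([z0, z1])

# A one-output network that produces the raw score z1 - z0
# yields the same probability of "survived" after the sigmoid.
p = sigmoid(z1 - z0)

print(p0 + p1)                    # sums to 1 (up to rounding)
print(math.isclose(p1, p))        # True
print(math.isclose(p0, 1.0 - p))  # True
```

So the second output neuron adds no new information for a binary problem; it only changes which loss function and which metric fit the output shape.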

this all sounds reasonable, but... the devil is in the details

in lesson 5 Jeremy demonstrates how easy and quick it is to solve the Titanic problem compared to a from-scratch solution (Lesson 5: Practical Deep Learning for Coders 2022 - YouTube)
in the dataloader he uses y_names="Survived", y_block=CategoryBlock() just as i did, which means a single neuron output (i guess)
in the learner he just adds metrics=accuracy (no custom metrics)
and tadaa... everything works perfectly and really simply for him

so what is the difference? i know i used a custom Sequential model and no categorical embeddings, but the output seems to be the same

The model actually has two output neurons. As I mentioned, the default way of fastai seems to be that for single-label classification (so a single label per instance) there is one output neuron for each kind of label, and the Survived column provides two different labels (0/1 or "not survived"/"survived").

You can have a look at the model with:

learn.summary()

and see that

tabular_learner(dls, metrics=accuracy, layers=[10,10])

produces a model with 2 outputs. The learner also uses Cross Entropy rather than BCE, which you can check with:

learn.loss_func
FlattenedLoss of CrossEntropyLoss()