Lesson 3 technique on Boolean values

Hi,

I am trying to apply the Lesson 3 technique to the Kaggle Titanic dataset and am wondering whether I can use it for Boolean (0 or 1) target values, and if so, how? I noticed that the target in Lesson 3 (sales) is a float.

0 and 1 are perfectly reasonable float values :slight_smile:

Ah, thanks. A quick question on the following part:
df, y, nas, mapper = proc_df(new_train_data, 'Survived', do_scale=True)
yl = np.log(y)
I got this warning: RuntimeWarning: divide by zero encountered in log

I am not sure how I should get around that. Should I just use y, which has values 0 and 1, in place of yl?

You don’t need to take the log. We only did that because in that particular competition the eval metric used log.
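
A minimal sketch of why the warning appears and what to use instead. The array below is a made-up 0/1 target standing in for the Survived column, not the real Titanic data:

```python
import warnings
import numpy as np

# Hypothetical 0/1 target standing in for the 'Survived' column.
y = np.array([0, 1, 1, 0, 1], dtype=np.float32)

# np.log(0) evaluates to -inf, which is what triggers
# "RuntimeWarning: divide by zero encountered in log".
with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)
    yl = np.log(y)

# Every entry where y == 0 is now -inf; entries where y == 1 are 0.
n_bad = int(np.isinf(yl).sum())

# For a binary target, skip the log and train on y directly.
y_target = y
```

Here the log turns every 0 into -inf and every 1 into 0, so the transformed target is unusable; y itself is what the model should see.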

Another quick question: what does the following code do?
cat_sz = {c: len(joined_samp[c].cat.categories)+1 for c in cat_vars}
emb_szs = [(c, min(50, (c+1)//2)) for _,c in cat_sz.items()]

Especially emb_szs?

We’ll be covering that tonight! :slight_smile:
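
In the meantime, a rough sketch of what those two lines compute, run on a tiny made-up DataFrame. The contents of joined_samp and cat_vars here are assumptions for illustration only, and the reading of the +1 as an extra slot for missing or unknown values is likewise an assumption:

```python
import pandas as pd

# Hypothetical stand-in for joined_samp with two categorical columns.
joined_samp = pd.DataFrame({
    "Sex": pd.Categorical(["male", "female", "female", "male"]),
    "Embarked": pd.Categorical(["S", "C", "Q", "S"]),
})
cat_vars = ["Sex", "Embarked"]

# Number of distinct categories per column, plus 1
# (assumed to reserve a slot for missing/unknown values).
cat_sz = {c: len(joined_samp[c].cat.categories) + 1 for c in cat_vars}

# One (cardinality, embedding width) pair per column.
# The width is about half the cardinality, capped at 50.
emb_szs = [(c, min(50, (c + 1) // 2)) for _, c in cat_sz.items()]

print(cat_sz)   # {'Sex': 3, 'Embarked': 4}
print(emb_szs)  # [(3, 2), (4, 2)]
```

So emb_szs is a list of pairs, one per categorical variable: how many category codes it has, and how wide its embedding vector should be. With this rule, a column with 1,000 categories would get a width of min(50, 500) = 50.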

Ah, can’t wait! Thanks.