Structured Learner


#25

Try changing the loss function to F.cross_entropy in ‘StructuredLearner’ class in column_data.py.


(Jeremy Howard (Admin)) #26

Or I think you can just move the crit= bit into the get_learner call.


(CVG) #27

Thanks. I tried changing the following functions.

1 . Changed F.mse_loss to F.cross_entropy and it threw some error
2. I tried to change the API of get_learner to pass crit = F.cross_entropy and it also threw error that crit was invalid argument. I had made change in relevant functions.

Not sure what all the mistakes I am doing

@jeremy : Is there a sample python notebook for classification task for structured data. I tried to go through the functions but I didn’t get too far in terms of getting the code working for classification at my end for structured data.


(Jeremy Howard (Admin)) #28

No, I’ve not tried it. You’re in unchartered waters!


(Kerem Turgutlu) #29

Hi what is the intuition behind this embedding weight initialization:

def emb_init(x):
    x = x.weight.data
    sc = 2/(x.size(1)+1)
    x.uniform_(-sc,sc)

Thanks


(Kerem Turgutlu) #30

I did the following, this is for binary classification only for multiclass F.cross entropy expects multi dimensional input and int target, so much more need to be changed for that:

class StructuredLearner(Learner):
    def __init__(self, data, models, **kwargs):
        super().__init__(data, models, **kwargs)
        if self.models.model.classify:
            self.crit = F.binary_cross_entropy
        else: self.crit = F.mse_loss


class MixedInputModel(nn.Module):
    def __init__(self, emb_szs, n_cont, emb_drop, out_sz, szs, drops, y_range=None, use_bn=False, classify=None):
        super().__init__() ## inherit from nn.Module parent class
        self.embs = nn.ModuleList([nn.Embedding(m, d) for m, d in emb_szs]) ## construct embeddings
        for emb in self.embs: emb_init(emb) ## initialize embedding weights
        n_emb = sum(e.embedding_dim for e in self.embs) ## get embedding dimension needed for 1st layer
        szs = [n_emb+n_cont] + szs ## add input layer to szs
        self.lins = nn.ModuleList([
            nn.Linear(szs[i], szs[i+1]) for i in range(len(szs)-1)]) ## create linear layers input, l1 -> l1, l2 ...
        self.bns = nn.ModuleList([
            nn.BatchNorm1d(sz) for sz in szs[1:]]) ## batchnormalization for hidden layers activations
        for o in self.lins: kaiming_normal(o.weight.data) ## init weights with kaiming normalization
        self.outp = nn.Linear(szs[-1], out_sz) ## create linear from last hidden layer to output
        kaiming_normal(self.outp.weight.data) ## do kaiming initialization
        
        self.emb_drop = nn.Dropout(emb_drop) ## embedding dropout, will zero out weights of embeddings
        self.drops = nn.ModuleList([nn.Dropout(drop) for drop in drops]) ## fc layer dropout
        self.bn = nn.BatchNorm1d(n_cont) # bacthnorm for continous data
        self.use_bn,self.y_range = use_bn,y_range 
        self.classify = classify
        
    def forward(self, x_cat, x_cont):
        x = [emb(x_cat[:, i]) for i, emb in enumerate(self.embs)] # takes necessary emb vectors 
        x = torch.cat(x, 1) ## concatenate along axis = 1 (columns - side by side) # this is our input from cats
        x = self.emb_drop(x) ## apply dropout to elements of embedding tensor
        x2 = self.bn(x_cont) ## apply batchnorm to continous variables
        x = torch.cat([x, x2], 1) ## concatenate cats and conts for final input
        for l, d, b in zip(self.lins, self.drops, self.bns):
            x = F.relu(l(x)) ## dotprod + non-linearity
            if self.use_bn: x = b(x) ## apply batchnorm activations
            x = d(x) ## apply dropout to activations
        x = self.outp(x) # we defined this externally just not to apply dropout to output
        if self.classify:
            x = F.sigmoid(x) # for classification
        else:
            x = F.sigmoid(x) ## scales the output between 0,1
            x = x*(self.y_range[1] - self.y_range[0]) ## scale output
            x = x + self.y_range[0] ## shift output
        return x


Learning Rate Pattern for Structured Data
(CVG) #31

Thanks. I will make changes at my end and try them.


(Jeremy Howard (Admin)) #32

IIRC that’s Kaiming He Initialization, although now I look at it I think I’m missing a sqrt there…


#33

Thanks for this code - trying to use this for a classification problem. How did you modify y, the dependent variable, for classification? I see binary_y, not sure what it does.


(Kerem Turgutlu) #34

So, in this code binary_y is 0s and 1s


(Kerem Turgutlu) #35

Hi,

When I am making validation predictions with learn.predict() every time I am getting different and odd results. While when I explicitly do the predictions still using learn.model I am getting the expected results. What is going on here :slight_smile:

Thanks


(Jeremy Howard (Admin)) #36

Sounds like you might be applying data augmentation to your validation set. You should only apply it to the training set.


(Kerem Turgutlu) #37

I followed these steps then fit the model;


(Jeremy Howard (Admin)) #38

You’re passing shuffle=True to your DataLoader :open_mouth:


(Kerem Turgutlu) #39

ooops you are right thanks !


(Vincenzo Lavorini) #40

Hello! Thank you for this snippet.
I would like to implement it, and so I modified the column_data.py file.
But where can I find the EmbeddingModel* classes that you call in the Jupyter screenshot? I don’t find them in the fast.ai library…

Thank you


(Kerem Turgutlu) #41

check sturctured.py and column_data.py. What better is that if you use PyCharm you can easily do this and search:

Class: Ctrl+N
File (directory): Ctrl+Shift+N
Symbol: Ctrl+Shift+Alt+N


(Richard Hall) #42

I don’t see EmbeddingModel in the current code for fast.ai on github, was it renamed to MixedInputModel?


(Kerem Turgutlu) #43

Yes that is correct. Embedding model’s is what I named it for this purpose. Original model name was always MixedInputModel (conts + cats)


(Rohit Singh) #44

@kcturgutlu can you share your notebook of which you’ve taken the screenshot? I tried following along but get an error in lr_find():

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-50-d81c6bd29d71> in <module>()
----> 1 learn.lr_find()

~/anaconda3/envs/fastai/lib/python3.6/site-packages/fastai/learner.py in lr_find(self, start_lr, end_lr, wds)
    135         layer_opt = self.get_layer_opt(start_lr, wds)
    136         self.sched = LR_Finder(layer_opt, len(self.data.trn_dl), end_lr)
--> 137         self.fit_gen(self.model, self.data, layer_opt, 1)
    138         self.load('tmp')
    139 

~/anaconda3/envs/fastai/lib/python3.6/site-packages/fastai/learner.py in fit_gen(self, model, data, layer_opt, n_cycle, cycle_len, cycle_mult, cycle_save_name, metrics, callbacks, **kwargs)
     87         n_epoch = sum_geom(cycle_len if cycle_len else 1, cycle_mult, n_cycle)
     88         fit(model, data, n_epoch, layer_opt.opt, self.crit,
---> 89             metrics=metrics, callbacks=callbacks, reg_fn=self.reg_fn, clip=self.clip, **kwargs)
     90 
     91     def get_layer_groups(self): return self.models.get_layer_groups()

~/anaconda3/envs/fastai/lib/python3.6/site-packages/fastai/model.py in fit(model, data, epochs, opt, crit, metrics, callbacks, **kwargs)
     82         for (*x,y) in t:
     83             batch_num += 1
---> 84             loss = stepper.step(V(x),V(y))
     85             avg_loss = avg_loss * avg_mom + loss * (1-avg_mom)
     86             debias_loss = avg_loss / (1 - avg_mom**batch_num)

~/anaconda3/envs/fastai/lib/python3.6/site-packages/fastai/model.py in step(self, xs, y)
     38     def step(self, xs, y):
     39         xtra = []
---> 40         output = self.m(*xs)
     41         if isinstance(output,(tuple,list)): output,*xtra = output
     42         self.opt.zero_grad()

~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    323         for hook in self._forward_pre_hooks.values():
    324             hook(self, input)
--> 325         result = self.forward(*input, **kwargs)
    326         for hook in self._forward_hooks.values():
    327             hook_result = hook(self, input, result)

<ipython-input-40-d4690f427dbe> in forward(self, x_cat, x_cont)
    134 
    135     def forward(self, x_cat, x_cont):
--> 136         x = [emb(x_cat[:, i]) for i, emb in enumerate(self.embs)]  # takes necessary emb vectors
    137         x = torch.cat(x, 1)  ## concatenate along axis = 1 (columns - side by side) # this is our input from cats
    138         x = self.emb_drop(x)  ## apply dropout to elements of embedding tensor

<ipython-input-40-d4690f427dbe> in <listcomp>(.0)
    134 
    135     def forward(self, x_cat, x_cont):
--> 136         x = [emb(x_cat[:, i]) for i, emb in enumerate(self.embs)]  # takes necessary emb vectors
    137         x = torch.cat(x, 1)  ## concatenate along axis = 1 (columns - side by side) # this is our input from cats
    138         x = self.emb_drop(x)  ## apply dropout to elements of embedding tensor

~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    323         for hook in self._forward_pre_hooks.values():
    324             hook(self, input)
--> 325         result = self.forward(*input, **kwargs)
    326         for hook in self._forward_hooks.values():
    327             hook_result = hook(self, input, result)

~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/sparse.py in forward(self, input)
    101             input, self.weight,
    102             padding_idx, self.max_norm, self.norm_type,
--> 103             self.scale_grad_by_freq, self.sparse
    104         )
    105 

~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/_functions/thnn/sparse.py in forward(cls, ctx, indices, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
     55 
     56         if indices.dim() == 1:
---> 57             output = torch.index_select(weight, 0, indices)
     58         else:
     59             output = torch.index_select(weight, 0, indices.view(-1))

TypeError: torch.index_select received an invalid combination of arguments - got (torch.FloatTensor, int, torch.cuda.LongTensor), but expected (torch.FloatTensor source, int dim, torch.LongTensor index)