Build mixed databunch and train end-to-end model for Tabular (categorical + continuous data) and Text data

(Quan Tran) #7

That sounds great! I’d love to collaborate on this. There are still a few things to improve, though: the export and predict functions, the fact that my code still runs much slower than the other model, and a small part of the ULMFiT model that I left out (the SortishSampler). I will come back to this soon.

This is the other (faster) model implementation I was talking about: https://github.com/anhquan0412/fastai-tabular-text-demo/blob/master/mercari-tabular-text-version-2-all.ipynb
It does not require writing a new ItemList and seems to be better in general. Maybe I will rewrite mine using this approach.


(Andreas Daiminger) #8

@quan.tran
Wow. Version 2 runs much faster. Before, it took around 28 minutes to train one epoch of the fully unfrozen model; now it takes only around 10 minutes! I used a slightly different model with more fully connected layers on top, and since my problem is classification I used cross-entropy loss and softmax.
Here is my model head before the softmax layer:

(layers): Sequential(
    (0): BatchNorm1d(800, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (1): Dropout(p=0.5)
    (2): Linear(in_features=800, out_features=400, bias=True)
    (3): ReLU(inplace)
    (4): BatchNorm1d(400, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): Dropout(p=0.4)
    (6): Linear(in_features=400, out_features=200, bias=True)
    (7): ReLU(inplace)
    (8): BatchNorm1d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (9): Dropout(p=0.1)
    (10): Linear(in_features=200, out_features=23, bias=True)
  )
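
For reference, a head like this can be built with plain PyTorch. The sketch below is only illustrative; the layer sizes and dropout rates are read off the printout above (800 is assumed to be the concatenated tabular + text features, 23 the number of classes):

import torch.nn as nn

def concat_head(n_in=800, n_classes=23, sizes=(400, 200), drops=(0.5, 0.4, 0.1)):
    # Build the BatchNorm -> Dropout -> Linear -> ReLU stack shown above.
    # All sizes/rates are taken from the printed repr, not from shared code.
    layers, dims = [], [n_in, *sizes]
    for i, (d_in, d_out) in enumerate(zip(dims[:-1], dims[1:])):
        layers += [nn.BatchNorm1d(d_in), nn.Dropout(drops[i]),
                   nn.Linear(d_in, d_out), nn.ReLU(inplace=True)]
    layers += [nn.BatchNorm1d(dims[-1]), nn.Dropout(drops[-1]),
               nn.Linear(dims[-1], n_classes)]
    return nn.Sequential(*layers)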

I noticed that some of the default fastai DataBunch functionality gets lost. But it shouldn’t be too hard to add code to print out data summaries and add a model.data.classes property.

I hope I have time this week to look into extending your code and making model.predict and model.export work.


(Andreas Daiminger) #9

Hey @quan.tran! It’s me again!
I made some minor tweaks to your code to get TabularTextProcessor.process_one working.
Do you want me to submit a pull request so you can review the code?


(Quan Tran) #10

Thanks for the help! I have added it to the repo.


(Andreas Daiminger) #11

I can confirm [this approach](https://github.com/anhquan0412/fastai-tabular-text-demo/blob/master/mercari-tabular-text-version-2-all.ipynb) works better.
It backprops through an entire tabular model before concatenating the outputs of the text and tabular models.

Where does this implementation come from? Did you do this as well?


(Quan Tran) #12

The implementation came from this notebook: https://nbviewer.jupyter.org/gist/joshfp/b62b76eae95e6863cb511997b5a63118/5.full-deep-learning.ipynb which is from the ‘share your work’ thread in v1, I believe. Somehow combining two different learners (a text learner and a tabular learner) works better than writing one new learner. I will probably look into it a bit more as I have more free time next week.
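
The core idea, as I understand it from that notebook, is roughly the sketch below (class and argument names are made up for illustration, not the notebook’s actual code):

import torch
import torch.nn as nn

class ConcatModel(nn.Module):
    # Run the text encoder and the tabular model separately, concatenate their
    # outputs, and pass the result through a shared head.
    def __init__(self, text_model, tab_model, head):
        super().__init__()
        self.text_model, self.tab_model, self.head = text_model, tab_model, head

    def forward(self, x_text, x_cat, x_cont):
        t = self.text_model(x_text)          # e.g. pooled output of an AWD-LSTM encoder
        tab = self.tab_model(x_cat, x_cont)  # e.g. output of a fastai TabularModel
        return self.head(torch.cat([t, tab], dim=1))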


#13

Hi, silly question, I’m a bit new to this API. How do I train a classification model instead of a regression model using your code? I can’t seem to figure it out; I’ve tried changing the label_cls to CategoryList and using CrossEntropyFlat, but to no avail.


(Quan Tran) #14

Hi @Ayuei

I just tried classification with the PetFinder dataset and it still works. Make sure the label column has an integer type: train['target'] = train['target'].astype(np.int8). If you do this, you don’t even need to worry about label_cls or the loss function, because the fastai library will auto-detect them. For reference, this is how I set up the learner for a tabular-text classification task:

learn = tabtext_learner(data, AWD_LSTM, metrics=[accuracy],
                        callback_fns=[partial(SaveModelCallback, monitor='accuracy', mode='max',
                                              every='improvement', name='best_nn')],
                        **params)

Let me know if it helps!


(Andreas Daiminger) #15

Hey @quan.tran
How is it going? Are you still interested in working on this?
Currently it is not possible to make a prediction on a single data point with version 2, which means it is impossible to put the model into production. I am very interested in making this work!


(Kj) #16

So, I’ve got a bunch of pricing lists that I constantly have to put into a specific format. Would it be possible to use this to create a training dataset from the finalized formatting, to classify dep_var without matching label columns and predict continuous and categorical variables based on the trained dataset?


(Quan Tran) #17

Hey @Andreas_Daiminger, I was busy with other things and haven’t had a chance to look back at it. Do you have a list of functionality you’d like to have (besides single data point prediction)? I have some time this weekend to play around with version 2 a bit, though I am not sure whether I’d be able to give v2 all the functionality of v1, because v2 is fundamentally different.


(Quan Tran) #18

I am not sure what you mean by ‘to classify dep_var without matching label columns …’.
Can you give me an example of what your dataset looks like?


(Kj) #19

So essentially, I’ve got a database of 48,000 e-commerce items. Vendors send me price sheets with updated pricing information, and none of the columns are ever labeled consistently. The only thing that is generally consistent is the SKU or model number, which I have trained as the dep_var from the database. These SKUs share the same row as the pricing information, but since the column labels can change, it gets a bit tricky. Any insights would be greatly appreciated!


(Andreas Daiminger) #20

Hey @quan.tran!
I would like to use v2 in production, so everything related to that would be a top priority:
first single data point prediction, and then model.export (difficult … I know!!).
I had a look myself but could not come up with a simple way to make single data point prediction work. If you point me in the right direction, I can help you develop a solution.
Thanks for keeping interest!


(Quan Tran) #22

So the column names are not consistent? Or the values within each column? I am still not sure what your dataset looks like. Can you provide the first 10 records of the dataset?


(Quan Tran) #23

Hey! I finally had time to work on this a bit. The code for version 2 is now factored into modules, there is a new, cleaner notebook for it (https://github.com/anhquan0412/fastai-tabular-text-demo/blob/master/mercari-tabular-text-version-2-complete.ipynb), and the repo is updated. I have also added a predict_one_item function for version 2: all you need to do is provide a series with column names (like the output of df.loc[some_index]) and it will spit out the prediction and the raw prediction. I haven’t tried it on a classification task yet, so give it a go and let me know if it works!
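
A quick usage sketch based on that description (df and learn are assumed to already exist; the exact call may differ from the repo, and it is shown here as a learner method):

# 'df' is the original dataframe and 'learn' the trained tabular-text learner,
# both assumed to exist already. predict_one_item may instead be a standalone
# function in the repo's modules.
row = df.loc[42]                          # a pandas Series carrying the original column names
pred, raw_pred = learn.predict_one_item(row)
print(pred, raw_pred)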

As for the export function, I am not sure what its purpose is: do you want to save the model so that you can load it somewhere else? Is it somewhat similar to ‘model.save()’?


(Andreas Daiminger) #24

@quan.tran Wow that was fast! Thanks a lot for the quick response!

The purpose of the export function is to prepare the model for inference. It’s like a lighter version of the learner: it can forget about learner.data and only needs to remember the model and its weights, plus the transforms and normalization statistics from the training data.
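
In other words, a minimal export for this setup might persist only something like the following (a rough, hypothetical sketch, not fastai’s actual export(); all parameter names are placeholders):

import torch

def export_tabtext(model, norm_stats, category_maps, vocab_itos, path='tabtext_export.pth'):
    # Persist only what inference needs: the weights plus the preprocessing
    # state fitted on the training data.
    torch.save({'model_state': model.state_dict(),
                'norm_stats': norm_stats,          # per-column means/stds for continuous features
                'categorify_maps': category_maps,  # category -> index mappings
                'vocab': vocab_itos},              # text vocabulary (index -> token)
               path)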


(Quan Tran) #25

So I just took a look at the export function in the fastai docs and source code, and I have an approach, though I’m not sure it will work: since v2 is basically just a combination of a tabular learner and a text learner with a concat head, you can export these two learners using the existing export() function and (now the hard part) write a function to join them back together with the concat head. All the data transformation will be taken care of by those two learners, and the concat head is just an nn.Sequential.
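
Roughly, the idea would look something like the sketch below (untested; tab_learn, text_learn and concat_head are assumed names for the two sub-learners and the head):

from fastai.basic_train import load_learner
import torch

def export_v2(tab_learn, text_learn, concat_head):
    # Export each sub-learner with fastai's existing export(); the concat head
    # is plain PyTorch, so just save its weights.
    tab_learn.export('tab_export.pkl')
    text_learn.export('text_export.pkl')
    torch.save(concat_head.state_dict(), 'concat_head.pth')

def load_v2(tab_path, text_path, concat_head):
    # Recreate the two learners (each brings back its own data transforms),
    # then reload the concat head and join the pieces again at inference time.
    tab_learn = load_learner(tab_path, 'tab_export.pkl')
    text_learn = load_learner(text_path, 'text_export.pkl')
    concat_head.load_state_dict(torch.load('concat_head.pth'))
    return tab_learn, text_learn, concat_head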


(Andreas Daiminger) #26

@quan.tran I understand. I can give it a try, but I have not done a lot of low-level PyTorch programming, so this is hard for me.
I visualised the model architecture of v2. This might be helpful for new collaborators who want a quick high-level overview.


(Kj) #27

Okay, sorry it took so long to get back to you. So… I can’t get you the exact items requested, as there are privacy standards that we need to adhere to. But I did create a mock training set to show you what the 48,000 rows look like, as well as a general example of the way we receive pricing changes and new items.

https://drive.google.com/drive/folders/1PAjj0l2n0AH_VukMMjLo6HK0u0oRCIcE?usp=sharing
