Awesome!!! Thank you @mikaelh (I’ll go list those as potential work on the tabular baseline post!)
Hm, I’ve got a worse validation loss with that fix. What is your experience?
Hey @grankin, yes, I think you are right: we broke something in 1.0.2. The fix addressed the fact that our implementation was not sharing any layers, so it was not doing what it was supposed to do. Now we are sharing blocks, but sharing the same batch norm layer across different inputs makes things more difficult, so we need to take the batch norm out of the shared blocks. We are working on it and I’ll let you know when it’s done. Hopefully this will improve the algorithm: the research paper reports 96.76 with 4 independent layers, and what I got with 1.0.1 was 96.39.
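To make the batch norm point concrete, here is a minimal PyTorch sketch of the general idea: one linear layer whose weights are shared across steps, with a separate batch norm per step. The class name `SharedFCStepBN` and all sizes are made up for illustration; this is not the package's actual code, and it uses plain `nn.BatchNorm1d` rather than whatever normalization the real implementation uses.

```python
import torch
from torch import nn


class SharedFCStepBN(nn.Module):
    """Hypothetical block: one shared linear layer, one BatchNorm per step."""

    def __init__(self, in_features, out_features, n_steps):
        super().__init__()
        # These weights are reused by every step.
        self.fc = nn.Linear(in_features, out_features, bias=False)
        # Each step keeps its own normalization statistics and affine params.
        self.bns = nn.ModuleList(
            [nn.BatchNorm1d(out_features) for _ in range(n_steps)]
        )

    def forward(self, x, step):
        return self.bns[step](self.fc(x))


torch.manual_seed(0)
block = SharedFCStepBN(in_features=10, out_features=8, n_steps=3)
x = torch.randn(16, 10)
outs = [block(x, step) for step in range(3)]
print([tuple(o.shape) for o in outs])
```

The linear weights are counted once no matter how many steps there are, while each step adds only its own small set of batch norm parameters.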
Hi @muellerzr. I am experimenting with the poker hand induction dataset and managed to achieve good results even with fastai v1.
Does the input have to be exactly the same for comparison, or can we try some preprocessing and model tweaking?
I ask because I have made a small tweak to the tabular model from fastai v1 and am achieving > 75% (the model is still running right now).
Poker hand results sometimes depend on the suits of different cards being the same, or on different ranks being the same. So, first, I copied the ranks and used them not only as conts but also as cats. Second, an Ace at position 1 has the same rank as an Ace at position 2, so I created only two embeddings (suits and ranks): every suit variable goes through the suits embedding, and all categorical rank variables go through the same ranks embedding.
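Until the notebook is posted, the idea can be sketched roughly like this in plain PyTorch. This is only an illustration under assumptions, not the actual notebook code: the class name `SharedCardEmbeddings`, the embedding sizes, the hidden width and the simple rank scaling are all invented here. It assumes five cards per hand, with suits coded 1-4 and ranks coded 1-13, and 10 hand classes.

```python
import torch
from torch import nn


class SharedCardEmbeddings(nn.Module):
    """Hypothetical model: one suit table and one rank table for all 5 cards."""

    def __init__(self, n_suits=4, n_ranks=13, suit_dim=3, rank_dim=6,
                 n_cards=5, n_classes=10):
        super().__init__()
        self.n_ranks = n_ranks
        # +1 so the 1-based codes can be used as indices (row 0 stays unused).
        self.suit_emb = nn.Embedding(n_suits + 1, suit_dim)
        self.rank_emb = nn.Embedding(n_ranks + 1, rank_dim)
        n_in = n_cards * (suit_dim + rank_dim) + n_cards
        self.head = nn.Sequential(
            nn.Linear(n_in, 64), nn.ReLU(), nn.Linear(64, n_classes)
        )

    def forward(self, suits, ranks):
        # suits, ranks: integer tensors of shape (batch, n_cards).
        s = self.suit_emb(suits).flatten(1)  # same table for every position
        r = self.rank_emb(ranks).flatten(1)  # same table for every position
        conts = ranks.float() / self.n_ranks  # ranks reused as continuous inputs
        return self.head(torch.cat([s, r, conts], dim=1))


torch.manual_seed(0)
model = SharedCardEmbeddings()
suits = torch.randint(1, 5, (8, 5))
ranks = torch.randint(1, 14, (8, 5))
logits = model(suits, ranks)
print(tuple(logits.shape))
```

Because every position indexes the same table, an Ace in position 1 and an Ace in position 2 are mapped to the identical vector, which is the point of the tweak.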
When I have some time, I will post a notebook with the approach.
Actually, we were able to achieve 91% or so if we treat everything as categorical. What % were you getting? (This may be better suited for the tabular baseline thread.)
My best result was 81%.
I just changed some parameters from your notebook and used ranks in both continuous and categorical forms, and I could hit 99.24%. When it is done, I will post the notebook.
Wow!!! That’s fantastic!
With TabNet, the best I could get was 99.06%. It certainly converged faster. Maybe there are some tricks (not known to me) that could speed up the fastai training. Once I resume the experiments, I will post both of them here.
@fmobrj75 when you post your results, could you put them here: Some Baselines for other Tabular Datasets with fastai2 (the TabNet ones can go here though, just so we don’t diverge topics too much)
Sure. Will do it!
Awesome! Which TabNet implementation did you use, and would you mind sharing code?
Here is the link to the code I mentioned earlier:
I could get 99.48% with fastai2 on the poker rule induction dataset (go to cell #130 of the notebook). It was kind of a pain (lots of epochs), but I suspect it could get even better because the validation loss was still decreasing. I don’t know if we could speed up the training with some trick.
I managed to get 99.10% with TabNet, and suspect it could also get better with more epochs.
@grankin a new version has been uploaded to PyPI, so you can try it out; it should work OK.
Let us know about your results!
Yes it works great, thank you! I’ve updated my package as well.
I can’t use your PyPI package right now. It would be possible to use it if we remove the `device` parameter from the nn.Module classes and split the TabNet class in two: one for dealing with embeddings and one for ‘pure’ TabNet. Handling embeddings is framework-specific and is done differently in fastai.
You can look at the example of proposed split here
I would be happy to provide a PR if such changes are desirable.
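For readers without the linked example, here is a rough sketch of what such a split could look like. Everything in it is hypothetical: `EmbeddingFrontEnd`, `PureBody` and `FrontPlusBody` are names invented for this illustration, and `PureBody` is a small MLP standing in for the ‘pure’ network, not TabNet itself. The point is only that no module takes a `device` argument; the caller moves the whole model once with `.to(device)`.

```python
import torch
from torch import nn


class EmbeddingFrontEnd(nn.Module):
    """Hypothetical framework-specific part: turns cats + conts into floats."""

    def __init__(self, cardinalities, emb_dims):
        super().__init__()
        self.embs = nn.ModuleList(
            [nn.Embedding(c, d) for c, d in zip(cardinalities, emb_dims)]
        )
        self.out_dim = sum(emb_dims)

    def forward(self, x_cat, x_cont):
        embedded = [emb(x_cat[:, i]) for i, emb in enumerate(self.embs)]
        return torch.cat(embedded + [x_cont], dim=1)


class PureBody(nn.Module):
    """Stand-in for the 'pure' network: it only ever sees a float tensor."""

    def __init__(self, n_in, n_out):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_in, 32), nn.ReLU(), nn.Linear(32, n_out)
        )

    def forward(self, x):
        return self.net(x)


class FrontPlusBody(nn.Module):
    """Glue module: the front end can be swapped without touching the body."""

    def __init__(self, front, body):
        super().__init__()
        self.front, self.body = front, body

    def forward(self, x_cat, x_cont):
        return self.body(self.front(x_cat, x_cont))


n_cont = 2
front = EmbeddingFrontEnd(cardinalities=[5, 14], emb_dims=[3, 6])
model = FrontPlusBody(front, PureBody(front.out_dim + n_cont, n_out=10))

# Device placement is decided once by the caller, not inside each module.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
x_cat = torch.tensor([[1, 13], [4, 2]], device=device)
x_cont = torch.randn(2, n_cont, device=device)
out = model(x_cat, x_cont)
print(tuple(out.shape))
```

With this shape, a framework that already builds its own embeddings could drop `EmbeddingFrontEnd` and feed the body directly.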
Great to hear that things are working correctly.
As for `device` everywhere, you are probably right and we were just overcautious by putting this in every function. If you need to separate the embeddings from the main network in order to integrate the code directly into fastai and avoid code duplication, then go for it! Feel free to open a PR and propose your changes and I’ll review them carefully.
I’ve checked against example notebooks, it seems to work.
For what it’s worth, I achieved 99.3% test accuracy with CatBoost using your pre-processing idea of having both continuous and categorical variables.
@grankin I’m trying to think about how we could incorporate this into the library, because having two options (one where we can directly point to how a model behaves and one where we only “somewhat” can) that both reach almost exactly the same accuracy could be very valuable! Perhaps something like a `tabnet_learner`? (We’d have a few weeks to figure out if we wanted to, because Jeremy et al. are working on finishing their book.)
@muellerzr great idea! I believe it’s worth digging into explainability of TabNet.