i’ve been playing with an idea today for encoding tabular data into images so i can throw them at a CNN (mainly because fastai.tabular seems to hate me). it’s based on a paper i saw where they were literally putting the variable values into the images as text, but i don’t see the point in wasting training time getting the network to read values i can encode as colors, so i use a greyscale image with colored blocks instead. i’m personally getting better results with blocks than with printed numbers.
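the encoding is roughly along these lines (just a sketch to show the idea, the column names, layout and scaling here are made up rather than my actual code):

from PIL import Image, ImageDraw

def row_to_image(row, cols, size=224, block=40):
    # draw one colored block per column on a plain grey background;
    # assumes each value has already been scaled to [0, 1]
    img = Image.new('RGB', (size, size), (128, 128, 128))
    draw = ImageDraw.Draw(img)
    for i, col in enumerate(cols):
        x = (i % 5) * block
        y = (i // 5) * block
        v = int(255 * row[col])
        draw.rectangle([x, y, x + block, y + block], fill=(v, v, 0))
    return img

# e.g. row_to_image({'Age': 0.35, 'Fare': 0.10, 'Pclass': 1.0}, ['Age', 'Fare', 'Pclass'])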
i was playing around with titanic on kaggle and so far i’ve managed 0.796 with resnet18. i seem to be getting my best results from simpler architectures and minimal training (the 0.796 was two lots of 5 epochs without unfreezing).
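for reference the training itself was nothing fancy, roughly this (just a sketch, assuming fastai v1 and an ImageDataBunch called data built from the generated images):

from fastai.vision import cnn_learner, models
from fastai.metrics import accuracy

learn = cnn_learner(data, models.resnet18, metrics=accuracy)
learn.fit_one_cycle(5)   # first lot of 5 epochs, head only
learn.fit_one_cycle(5)   # second lot of 5 epochs, still without unfreezing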
i’m wondering if there are other architectures or training tricks i should be looking at for something so simple. my data looks like this.
there will be a blog post and a repo with some code when i’m done, but google drive and colab are throwing a hissy fit so i’m done for today.
Very interesting approach. I know exactly the paper you are talking about and it was discussed on the forum before (can’t find the link right now). I looked into it too, including embedding colors directly into the images, but did not see much improvement.
A few ideas I’ve thought about trying: yes, a simple architecture seems like the better way to go, and probably one that isn’t pre-trained, from what I found. Other candidates would be the xresnets, potentially MobileNet, and of course EfficientNet.
i tried pretrained=False and unfreezing right away but it didn’t make a lot of difference for me. in 5 epochs i ended up in pretty much the same place, it just took bigger steps to get there with the untrained model.
i was looking at the models in the zoo, and i can probably just try them all, but i wondered if there was an established approach for simpler problems like this.
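if i do end up trying a handful of them it would probably look something like this (just a sketch, assuming fastai v1, the same data ImageDataBunch as before, and an arbitrary pick of architectures):

from fastai.vision import cnn_learner, models
from fastai.metrics import accuracy

archs = [models.resnet18, models.resnet34, models.densenet121]   # arbitrary picks from the zoo
results = {}
for arch in archs:
    learn = cnn_learner(data, arch, pretrained=False, metrics=accuracy)
    learn.fit_one_cycle(5)
    results[arch.__name__] = learn.recorder.metrics[-1]          # metrics from the final epoch
print(results)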
I think when I tried my own version of SuperTML I wound up integrating color into it too (not just B/W), using a color gradient of sorts for the cardinalities.
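Something along these lines, just a sketch of the idea (the colormap and helper name are my own, not from the paper):

import matplotlib.cm as cm

def category_colour(idx, n_categories):
    # map a category index to an RGB colour along a gradient,
    # so each level of a categorical lands at a different point of the colormap
    t = idx / max(n_categories - 1, 1)   # position along the gradient in [0, 1]
    r, g, b, _ = cm.viridis(t)           # viridis is just an example colormap
    return int(255 * r), int(255 * g), int(255 * b)

# e.g. category_colour(2, 3) for the third level of a 3-level categorical like Pclass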
quick question, off the top of your head, does anyone have any idea why i start with this:
then do this:
from fastai.vision.image import Image   # fastai v1 Image wrapper
import torchvision.transforms as tfms

img_pil = normal_img_from_my_code()      # the PIL image my encoding code produces
img_tensor = tfms.ToTensor()(img_pil)    # PIL -> CxHxW float tensor scaled to [0, 1]
img_fastai = Image(img_tensor)           # wrap the tensor so fastai can work with it
and end up with this?
i feel like i’m missing out on an opportunity to encode something more into the images but i don’t understand why it’s going from A to B.
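for what it’s worth, this is roughly how i’ve been poking at both ends of the conversion to see where it changes (sketch only, img_pil and img_fastai are the same objects as in the snippet above):

import numpy as np

print(np.array(img_pil).shape, np.array(img_pil).min(), np.array(img_pil).max())   # HxWxC, 0-255 ints
print(img_fastai.data.shape, img_fastai.data.min(), img_fastai.data.max())         # CxHxW, 0-1 floats
img_fastai.show()   # fastai's own rendering of the wrapped tensor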