Hey there!
I recently found an interesting post about the idea of using an intermediate layer's output as input for clustering.
I know that embeddings can be used for this as well (NLP: Using fastai word embeddings to cluster unlabeled documents), but I would be interested in performing something similar to the idea shown here (mnist-embedding/notebooks/mnist-embedding-autoencoder.ipynb at master · botkop/mnist-embedding · GitHub).
The author of the notebook flattened the MNIST dataset so that each n×n image becomes a row of n*n values in a table, plus an additional column for the target.
After that he fitted a network to the data using a tabular_learner. All of this was done with fastai v1 (it still uses DataBunches), but that is not hard to translate to v2.
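Roughly, this is what I have in mind for the v2 version (a minimal sketch with made-up placeholders: the dummy DataFrame, the pixel column names, the y_range and the target are mine, not values from the notebook):

```python
import numpy as np
import pandas as pd
from fastai.tabular.all import TabularDataLoaders, tabular_learner
from fastai.metrics import rmse

# Dummy stand-in for the flattened images: 1000 rows, n*n pixel columns + target.
n = 8
df = pd.DataFrame(np.random.rand(1000, n * n),
                  columns=[f'px{i}' for i in range(n * n)])
df['target'] = np.random.rand(1000) * 0.8        # placeholder continuous target

cont_names = [c for c in df.columns if c != 'target']    # every pixel column is continuous
dls = TabularDataLoaders.from_df(df, y_names='target',
                                 cont_names=cont_names, bs=64)
learn = tabular_learner(dls, layers=[200, 100], y_range=(0, 0.8), metrics=rmse)
learn.fit_one_cycle(1)
```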
There are two parts where following this idea is kind of hard:
First: in cell [30] his model structure is flat, while a tabular_learner nowadays creates a nested structure such as
Sequential(
  (0): LinBnDrop(
    (0): Linear(in_features=114, out_features=200, bias=False)
    (1): ReLU(inplace=True)
    (2): BatchNorm1d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (3): Dropout(p=0.1, inplace=False)
  )
  (1): LinBnDrop(
    (0): Linear(in_features=200, out_features=100, bias=False)
    (1): ReLU(inplace=True)
    (2): BatchNorm1d(100, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (3): Dropout(p=0.2, inplace=False)
  )
  (2): LinBnDrop(
    (0): Linear(in_features=100, out_features=1, bias=True)
  )
  (3): SigmoidRange(low=0, high=0.8)
)
A way around this is to use learner.model._modules or flatten_model, so that part is not the biggest problem, I think (even though it took some time to find these).
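For the record, this is roughly how I sliced it (assuming the Sequential shown above lives at learn.model.layers, which is where fastai v2's TabularModel keeps its LinBnDrop stack; the indices match the printout):

```python
import torch.nn as nn

# Build a feature extractor from the trained tabular model: the full first
# LinBnDrop block plus only the Linear of the second block, so the output
# has 100 features per row.
body = learn.model.layers            # Sequential of LinBnDrop blocks + SigmoidRange
feature_extractor = nn.Sequential(
    body[0],                         # (0): Linear -> ReLU -> BatchNorm1d -> Dropout (114 -> 200)
    body[1][0],                      # only (1)(0): Linear(200 -> 100)
)
feature_extractor.eval()
```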
Second: whenever I attempt something akin to cell [45], learner.get_preds, the system tells me
Using a target size (torch.Size([64])) that is different to the input size (torch.Size([6400])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size
RuntimeError: The size of tensor a (6400) must match the size of tensor b (64) at non-singleton dimension 0
And this is where I am a bit stumped.
To recap:
- I am using different data, so my input is slightly different
- I am using the standard layers ([200, 100]) in a tabular_learner
- I trained it with batch_size 64, that worked fine
- I cut off the last two layers ((2) and (3)) as well as the ReLU-to-Dropout part of the second LinBnDrop block ((1)).
Expected behaviour: put data in, run it through the remaining parts, get 100 features per row out (from (1)(0), out_features=100).
Observed behaviour appears to be: the learner expects to return a batch-sized tensor and cannot deal with the current result (which should be a 64x100 tensor, I hope).
So I can either loop over the batches and call the model's forward(…) myself (something like the sketch below), or perhaps someone would not mind helping me find a cleaner way.
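This is the manual loop I have in mind (assumptions: learn is my trained tabular learner, feature_extractor is the slice from above, df is the original DataFrame with the training column layout, and a tabular batch comes back as (x_cat, x_cont[, y])):

```python
import torch

# Temporarily swap in the truncated stack, push every row through the full
# TabularModel (so the embedding / bn_cont preprocessing still runs), and
# collect the activations for clustering.
original_layers = learn.model.layers
learn.model.layers = feature_extractor
learn.model.eval()

dl = learn.dls.test_dl(df, bs=64)            # non-shuffled, same procs as training
feats = []
with torch.no_grad():
    for batch in dl:
        x_cat, x_cont = batch[:2]            # tabular batches: (x_cat, x_cont[, y])
        feats.append(learn.model(x_cat, x_cont))
features = torch.cat(feats)                  # expected shape: (n_rows, 100)

learn.model.layers = original_layers         # restore the full model afterwards
```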
Anything is appreciated!
Cheers!