Sharing the same embeddings over multiple columns

Hi,

I have a tabular dataset and came across the following problem:
Let’s say I have 2 or more columns with the same type of input, for example “playerid_1” and “playerid_2”.
Each of them is a categorical variable, and the same id can appear under “playerid_1” in one row and under “playerid_2” in another.
I want both columns to share the same embedding matrix, but I couldn’t find how to implement it.

Thanks in advance.

You could try converting the columns to a one-hot encoding?

Maybe you can just reuse the same embedding layer for all columns of the same type?
Something like this.
Let's say your input x has shape [bs, columns]:

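import torch
import torch.nn as nn

playerids = [2, 4]  # indices of the columns that should share an embedding
emb = nn.Embedding(a, b)  # a = number of distinct player ids, b = embedding dimension
out = list()
for c in playerids:  # pass each id column through the same embedding layer
    out.append(emb(x[:, c]))  # x[:, c] must be an integer (long) tensor of shape [bs]
out = torch.stack(out)  # shape = [len(playerids), bs, b]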
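
For a self-contained version of the same idea, here is a minimal sketch that wraps the shared layer in a module. The class name SharedPlayerEmbedding, the column indices, and the sizes (10 players, embedding dimension 4) are made up for illustration and are not from the original question. It assumes the player ids are already encoded as integers in the range [0, num_players).

```python
import torch
import torch.nn as nn


class SharedPlayerEmbedding(nn.Module):
    """Hypothetical module: one embedding table shared by several id columns."""

    def __init__(self, num_players, emb_dim, player_cols):
        super().__init__()
        self.player_cols = player_cols  # column indices that hold player ids
        self.emb = nn.Embedding(num_players, emb_dim)  # the single shared table

    def forward(self, x):
        ids = x[:, self.player_cols]      # [bs, n_cols], integer ids
        vecs = self.emb(ids)              # [bs, n_cols, emb_dim], same weights for every column
        return vecs.flatten(start_dim=1)  # [bs, n_cols * emb_dim]


torch.manual_seed(0)
# Toy batch: columns 2 and 4 hold player ids, the other columns are ignored here.
x = torch.tensor([[0, 0, 7, 0, 3],
                  [0, 0, 3, 0, 7]])
model = SharedPlayerEmbedding(num_players=10, emb_dim=4, player_cols=[2, 4])
out = model(x)
print(out.shape)
```

Because both columns index the same nn.Embedding, player 7 gets the same vector whether it shows up in column 2 or column 4, and gradients from both columns update the same weights. If the model also needs to know which slot a player was in, that information would have to come from the position in the flattened output or from an extra feature.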
   
1 Like