I spent a little more time on this and have a simple auxiliary network and base network that need review and testing.
These are modeled after the networks described in the Diet Networks paper.
```python
import keras.backend as K
from keras.layers import Dense, Embedding, Input, Lambda, Reshape
from keras.models import Model

# helper function for the Lambda layer: computes X dot We
def doDot(matrices):
    X, Y = matrices
    return K.dot(X, Y)

# auxiliary network: embeds Xt and predicts the first-layer weights We
# (big, small, vocab_size, embed_size, h1_units are set elsewhere)
inp = Input(batch_shape=(big, small), name="Xt_input")  # use batch_shape here?
emb = Embedding(vocab_size, embed_size)(inp)
rs = Reshape((small * embed_size,))(emb)
out = Dense(h1_units)(rs)
mdl = Model(inp, out)

# base network: multiplies X by the predicted weight matrix We
in2 = Input(batch_shape=(small, big), name="X_input")
h1 = Lambda(doDot, output_shape=(small, h1_units))([in2, mdl.output])
out = Dense(10, activation='softmax')(h1)
md2 = Model([inp, in2], out)
```
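To sanity-check the Embedding/Reshape step and what `doDot` computes, here is a plain-NumPy walkthrough of the shapes. All the sizes below are made up for illustration; the real `big`, `small`, etc. come from the data:

```python
import numpy as np

# hypothetical sizes, just for the shape check
big, small = 20, 5                       # Xt is (features, samples), X is (samples, features)
vocab_size, embed_size, h1_units = 4, 3, 8

rng = np.random.default_rng(0)
Xt = rng.integers(0, vocab_size, size=(big, small))   # auxiliary-network input
E = rng.standard_normal((vocab_size, embed_size))     # embedding table

emb = E[Xt]                                  # (big, small, embed_size) -- Embedding output
rs = emb.reshape(big, small * embed_size)    # (big, small * embed_size) -- after Reshape
W_aux = rng.standard_normal((small * embed_size, h1_units))
We = rs.dot(W_aux)                           # (big, h1_units) -- predicted weights

X = rng.random((small, big))                 # base-network input
h1 = X.dot(We)                               # what doDot computes: (small, h1_units)
print(h1.shape)                              # (5, 8)
```

If these shapes line up the same way in Keras, the `Reshape((small * embed_size,))` call and the Lambda's `output_shape=(small, h1_units)` should both be consistent.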
These models compile, but I haven't had a chance to run them yet. A few open questions:
- I don't know how you'd train this in any sort of mini-batch fashion.
- I wasn't totally confident in how I handled reshaping the Embedding layer outputs in the auxiliary network. I'd love feedback.
- I used a Lambda layer to treat the output of the auxiliary network, We, as the first-layer weight matrix in the base network. I believe this works, but again, I'd love a second opinion.
- If you do full-batch learning (which the current code does), you likely don't need two different inputs. I believe you can just reuse and transpose the original input in the auxiliary network.
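On that last point, here is a minimal NumPy sketch of the single-input idea: with full-batch learning the auxiliary input is exactly the transpose of the base input, so one matrix could feed both paths (in Keras this might be a `Lambda` wrapping `K.transpose` on the same `Input`; the sizes here are hypothetical):

```python
import numpy as np

# one integer data matrix X of shape (small, big) = (samples, features)
rng = np.random.default_rng(1)
small, big, vocab_size = 5, 20, 4
X = rng.integers(0, vocab_size, size=(small, big))

# the auxiliary network's input Xt is just the transpose of X, so with
# full-batch training a single Input layer could serve both networks
Xt = X.T
print(Xt.shape)          # (20, 5)
print((Xt.T == X).all()) # True -- no information is lost by transposing
```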