I’m working through the lesson 12 notebook “translate-pytorch” and checked the attention model parameters with:
for name, param in decoder.named_parameters():
    print(name, param.size(), param.requires_grad)
which prints:
"emb.weight torch.Size([19549, 100]) True
gru.weight_ih_l0 torch.Size([384, 128]) True
gru.weight_hh_l0 torch.Size([384, 128]) True
gru.bias_ih_l0 torch.Size([384]) True
gru.bias_hh_l0 torch.Size([384]) True
gru.weight_ih_l1 torch.Size([384, 128]) True
gru.weight_hh_l1 torch.Size([384, 128]) True
gru.bias_ih_l1 torch.Size([384]) True
gru.bias_hh_l1 torch.Size([384]) True
out.weight torch.Size([19549, 128]) True
out.bias torch.Size([19549]) True"
Then I changed the Var function from
def Var(*sz): return nn.Parameter(Arr(*sz)).cuda()
to
def Var(*sz): return nn.Parameter(Arr(*sz).cuda())
as suggested in https://discuss.pytorch.org/t/solved-nn-parameter-on-gpu-not-appear-in-list-parameters/5904, and now the list also shows the attention parameters:
“W1 torch.Size([128, 128]) True
W2 torch.Size([128, 128]) True
W3 torch.Size([228, 128]) True
b2 torch.Size([128]) True
b3 torch.Size([128]) True
V torch.Size([128]) True
emb.weight torch.Size([19549, 100]) True
gru.weight_ih_l0 torch.Size([384, 128]) True
gru.weight_hh_l0 torch.Size([384, 128]) True
gru.bias_ih_l0 torch.Size([384]) True
gru.bias_hh_l0 torch.Size([384]) True
out.weight torch.Size([19549, 128]) True
out.bias torch.Size([19549]) True”
Could it be that the results are still good even though the attention parameters were not trained? Or were they being trained but just not appearing in the list?
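To check this myself, here is a minimal sketch of what I think is going on. It is not the notebook's code: the `Wrong` and `Right` classes are made up for illustration, and `.double()` stands in for `.cuda()` so it runs without a GPU (my assumption is that both behave the same here, returning a new plain tensor rather than the `nn.Parameter` itself).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class Wrong(nn.Module):
    """Hypothetical module: converts AFTER wrapping in nn.Parameter."""
    def __init__(self):
        super().__init__()
        # .double() (stand-in for .cuda()) returns a plain tensor,
        # so W1 is stored as an ordinary attribute, not a parameter.
        self.W1 = nn.Parameter(torch.randn(4, 4)).double()
        self.out = nn.Linear(4, 1).double()

    def forward(self, x):
        return self.out(x @ self.W1)

class Right(Wrong):
    """Hypothetical module: converts BEFORE wrapping in nn.Parameter."""
    def __init__(self):
        nn.Module.__init__(self)
        self.W1 = nn.Parameter(torch.randn(4, 4).double())
        self.out = nn.Linear(4, 1).double()

wrong_names = [n for n, _ in Wrong().named_parameters()]
right_names = [n for n, _ in Right().named_parameters()]
print(wrong_names)  # ['out.weight', 'out.bias']
print(right_names)  # ['W1', 'out.weight', 'out.bias']

# One SGD step on the Wrong model: the optimizer only sees parameters().
model = Wrong()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
w1_before = model.W1.detach().clone()
x = torch.randn(3, 4, dtype=torch.float64)
model(x).sum().backward()
opt.step()
w1_unchanged = torch.equal(w1_before, model.W1.detach())
print(w1_unchanged)  # True: W1 was never updated
```

If this sketch matches what happens in the notebook, the missing tensors are not just hidden from the list: an optimizer built from `decoder.parameters()` never receives them, so they would stay at their initial values.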
Best regards,