Is there a way I can tell "learn.predict" to behave correctly when my targets are two CategoryBlocks?

So in my DataBlock I have this as my target:

CategoryBlock(vocab=vocab),
CategoryBlock(vocab=vocab)

The vocab is simply 0 to 127 (length is 128)
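
The whole DataBlock is shaped roughly like this (just a sketch; the input block and the getters below are placeholders, not my real ones):

    from fastai.data.block import DataBlock, TransformBlock, CategoryBlock

    vocab = list(range(128))

    dblock = DataBlock(
        blocks=(TransformBlock(),                # placeholder for the real input block
                CategoryBlock(vocab=vocab),      # first target
                CategoryBlock(vocab=vocab)),     # second target
        get_x=lambda o: o[0],                    # placeholder getters
        get_y=[lambda o: o[1], lambda o: o[2]],
        n_inp=1,                                 # 1 input block, the other two are targets
    )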

Everything looks fine when I show_batch and train … but when I call learn.predict, the output is not what I expected:

(('(#128) [0,1,1,0,1,1,0,1,0,1...]', '(#128) [0,0,0,1,1,2,0,1,0,0...]'),
 tensor([[ 0.0350,  1.0502,  1.1848,  0.8337,  1.5485,  1.6603,  0.6151,  1.7123,
           0.0395,  1.3813,  2.9159,  0.8222,  2.4387,  1.2496,  0.0395, -1.3586,
          -0.6715,  3.2573,  1.4127,  1.8848,  1.3798,  2.2613, -2.5604, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264]]),
 tensor([[ 0.0350,  1.0502,  1.1848,  0.8337,  1.5485,  1.6603,  0.6151,  1.7123,
           0.0395,  1.3813,  2.9159,  0.8222,  2.4387,  1.2496,  0.0395, -1.3586,
          -0.6715,  3.2573,  1.4127,  1.8848,  1.3798,  2.2613, -2.5604, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264,
          -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264, -4.2264]]))

Perhaps “learn.predict” can’t be used when predicting two targets? If so, I assume I simply have to create a test_dl and run it through learn.model()? I’m using a custom loss function as well, so that might be part of it too.

Anyways, any ideas/thoughts would be appreciated. Thanks!

Are you sure your loss function has an activation and a decodes as shown here?
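
Roughly, the structure would look something like this (just a sketch for two categorical targets; the class name is made up and it is not the exact code from the docs):

    import torch.nn as nn
    import torch.nn.functional as F

    class TwoTargetCrossEntropy(nn.Module):
        def forward(self, outs, *targets):
            # raw model outputs (one logits tensor per target) -> single summed loss
            return sum(F.cross_entropy(o, t) for o, t in zip(outs, targets))

        def activation(self, outs):
            # raw logits -> predicted probabilities
            return [F.softmax(o, dim=-1) for o in outs]

        def decodes(self, outs):
            # predicted probabilities -> predicted class indices
            return [o.argmax(dim=-1) for o in outs]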

I don’t have any of those things … mine is just a plain ol’ nn.Module with a forward() :slight_smile:

If I understand the example …

  • forward is applied on the outputs of the model, and calculates/returns the loss (a single number)
  • activation is applied on the outputs of the model, and returns the predicted probabilities
  • encodes is applied on what activation returns, and returns the actual predictions

Does that sound about right?

Thanks.

Yes, except it’s decodes and not encodes.

So, I’m getting close.

I have two targets, both defined by CategoryBlocks, and my custom loss function simply adds their losses together to produce a single loss. I want learn.predict to show the results for both targets.

This works for learn.get_preds:

    def activation(self, outs):
        # raw logits -> predicted probabilities, one tensor per target
        acts = []
        for o in outs:
            acts.append(F.softmax(o, dim=-1))
        return acts

    def decodes(self, outs):
        # predicted probabilities -> predicted class indices, one tensor per target
        decodes = []
        for o in outs:
            decodes.append(torch.argmax(o, dim=-1))
        return decodes

… but when I call learn.predict, I get the decoded predictions for both up top … but it only shows the prediction tensor and predicted probabilities for the first CategoryBlock:

(('11', '12'),
 tensor([11]),
 tensor([[1.3416e-06, 2.7561e-06, 2.0264e-07, 4.5966e-07, 3.0259e-07, 3.0772e-07,
          2.1610e-08, 1.3417e-06, 7.8527e-04, 6.6076e-05, 2.2832e-03, 9.9590e-01,
          2.2626e-04, 1.3134e-05, 9.1205e-05, 1.3518e-06, 2.8763e-04, 2.3588e-04,
          1.8210e-06, 8.9610e-05, 1.1771e-05, 1.3414e-06, 1.3472e-06, 1.1855e-08,
          1.2746e-08, 1.2138e-08, 1.2272e-08, 1.2370e-08, 1.2658e-08, 1.2637e-08,
          1.1951e-08, 1.2288e-08, 1.2506e-08, 1.2413e-08, 1.1554e-08, 1.1504e-08,
          1.1967e-08, 1.1925e-08, 1.1473e-08, 1.1469e-08, 1.1630e-08, 1.1900e-08,
          1.2120e-08, 1.1902e-08, 1.2361e-08, 1.2250e-08, 1.2251e-08, 1.2102e-08,
          1.2098e-08, 1.2150e-08, 1.2471e-08, 1.2385e-08, 1.2323e-08, 1.2595e-08,
          1.2993e-08, 1.2916e-08, 1.2685e-08, 1.2260e-08, 1.2347e-08, 1.2912e-08,
          1.2739e-08, 1.2191e-08, 1.2021e-08, 1.2334e-08, 1.2940e-08, 1.2250e-08,
          1.1896e-08, 1.2154e-08, 1.2158e-08, 1.2634e-08, 1.2217e-08, 1.2408e-08,
          1.2470e-08, 1.2780e-08, 1.2476e-08, 1.2416e-08, 1.2483e-08, 1.2583e-08,
          1.2876e-08, 1.2409e-08, 1.2619e-08, 1.2924e-08, 1.3212e-08, 1.3309e-08,
          1.2713e-08, 1.2469e-08, 1.2866e-08, 1.3866e-08, 1.3434e-08, 1.2286e-08,
          1.1802e-08, 1.2468e-08, 1.1986e-08, 1.0836e-08, 1.1613e-08, 1.1390e-08,
          1.1968e-08, 1.1486e-08, 1.1405e-08, 1.1987e-08, 1.1625e-08, 1.1870e-08,
          1.1929e-08, 1.1302e-08, 1.1528e-08, 1.2066e-08, 1.1877e-08, 1.1442e-08,
          1.1812e-08, 1.1756e-08, 1.2398e-08, 1.1848e-08, 1.1815e-08, 1.2025e-08,
          1.2538e-08, 1.2671e-08, 1.1829e-08, 1.1684e-08, 1.1311e-08, 1.1475e-08,
          1.0819e-08, 1.0401e-08, 1.1252e-08, 1.1124e-08, 1.1321e-08, 1.1405e-08,
          1.1024e-08, 1.1901e-08]]))

and I think it is because of this line in learn.predict:

res = dec_targ,dec_preds[0],preds[0]

Is there a way I can/should adjust the results so this works with learn.predict … or is learn.predict just not going to work here?
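
For now I can work around it by running a one-item test_dl through learn.get_preds and decoding by hand (a sketch; item stands in for whatever raw input the DataBlock expects):

    dl = learn.dls.test_dl([item])
    probs, _, decoded = learn.get_preds(dl=dl, with_decoded=True)
    # probs:   list of probability tensors (from activation), one per target
    # decoded: list of predicted class indices (from decodes), one per target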

FYI - learn.get_preds returns this:

([tensor([[1.3416e-06, 2.7561e-06, 2.0264e-07, 4.5966e-07, 3.0259e-07, 3.0772e-07,
           2.1610e-08, 1.3417e-06, 7.8527e-04, 6.6076e-05, 2.2832e-03, 9.9590e-01,
           2.2626e-04, 1.3134e-05, 9.1205e-05, 1.3518e-06, 2.8763e-04, 2.3588e-04,
           1.8210e-06, 8.9610e-05, 1.1771e-05, 1.3414e-06, 1.3472e-06, 1.1855e-08,
           1.2746e-08, 1.2138e-08, 1.2272e-08, 1.2370e-08, 1.2658e-08, 1.2637e-08,
           1.1951e-08, 1.2288e-08, 1.2506e-08, 1.2413e-08, 1.1554e-08, 1.1504e-08,
           1.1967e-08, 1.1925e-08, 1.1473e-08, 1.1469e-08, 1.1630e-08, 1.1900e-08,
           1.2120e-08, 1.1902e-08, 1.2361e-08, 1.2250e-08, 1.2251e-08, 1.2102e-08,
           1.2098e-08, 1.2150e-08, 1.2471e-08, 1.2385e-08, 1.2323e-08, 1.2595e-08,
           1.2993e-08, 1.2916e-08, 1.2685e-08, 1.2260e-08, 1.2347e-08, 1.2912e-08,
           1.2739e-08, 1.2191e-08, 1.2021e-08, 1.2334e-08, 1.2940e-08, 1.2250e-08,
           1.1896e-08, 1.2154e-08, 1.2158e-08, 1.2634e-08, 1.2217e-08, 1.2408e-08,
           1.2470e-08, 1.2780e-08, 1.2476e-08, 1.2416e-08, 1.2483e-08, 1.2583e-08,
           1.2876e-08, 1.2409e-08, 1.2619e-08, 1.2924e-08, 1.3212e-08, 1.3309e-08,
           1.2713e-08, 1.2469e-08, 1.2866e-08, 1.3866e-08, 1.3434e-08, 1.2286e-08,
           1.1802e-08, 1.2468e-08, 1.1986e-08, 1.0836e-08, 1.1613e-08, 1.1390e-08,
           1.1968e-08, 1.1486e-08, 1.1405e-08, 1.1987e-08, 1.1625e-08, 1.1870e-08,
           1.1929e-08, 1.1302e-08, 1.1528e-08, 1.2066e-08, 1.1877e-08, 1.1442e-08,
           1.1812e-08, 1.1756e-08, 1.2398e-08, 1.1848e-08, 1.1815e-08, 1.2025e-08,
           1.2538e-08, 1.2671e-08, 1.1829e-08, 1.1684e-08, 1.1311e-08, 1.1475e-08,
           1.0819e-08, 1.0401e-08, 1.1252e-08, 1.1124e-08, 1.1321e-08, 1.1405e-08,
           1.1024e-08, 1.1901e-08]]),
  tensor([[2.2876e-04, 9.4109e-06, 4.9377e-07, 1.4674e-07, 9.0005e-07, 3.4732e-07,
           3.5992e-07, 2.2883e-04, 2.5463e-06, 1.4216e-05, 4.6438e-05, 2.6314e-04,
           9.9662e-01, 4.6981e-06, 5.3002e-04, 2.2375e-04, 2.3716e-06, 5.1291e-05,
           1.4621e-06, 2.1887e-04, 1.0977e-03, 2.2882e-04, 2.2736e-04, 2.4852e-08,
           2.2023e-08, 2.3851e-08, 2.3955e-08, 2.3305e-08, 2.2701e-08, 2.3359e-08,
           2.4459e-08, 2.3732e-08, 2.2428e-08, 2.2973e-08, 2.5566e-08, 2.6974e-08,
           2.5110e-08, 2.5118e-08, 2.6992e-08, 2.6129e-08, 2.5796e-08, 2.5224e-08,
           2.4452e-08, 2.5722e-08, 2.5144e-08, 2.3761e-08, 2.4868e-08, 2.5882e-08,
           2.5522e-08, 2.4944e-08, 2.3564e-08, 2.5032e-08, 2.4777e-08, 2.4724e-08,
           2.3254e-08, 2.3068e-08, 2.4903e-08, 2.5857e-08, 2.5268e-08, 2.3349e-08,
           2.4021e-08, 2.6718e-08, 2.7491e-08, 2.6596e-08, 2.4762e-08, 2.6999e-08,
           2.8644e-08, 2.6516e-08, 2.6208e-08, 2.4179e-08, 2.7385e-08, 2.6833e-08,
           2.6392e-08, 2.5267e-08, 2.7138e-08, 2.7027e-08, 2.6365e-08, 2.4985e-08,
           2.4271e-08, 2.6721e-08, 2.6289e-08, 2.5289e-08, 2.4003e-08, 2.4961e-08,
           2.5628e-08, 2.6171e-08, 2.3873e-08, 2.3899e-08, 2.3877e-08, 2.6565e-08,
           2.8332e-08, 2.6298e-08, 2.8172e-08, 3.3120e-08, 2.8373e-08, 3.0354e-08,
           2.5455e-08, 2.7390e-08, 3.1623e-08, 3.0971e-08, 3.0277e-08, 4.0575e-08,
           4.9333e-08, 3.3486e-08, 2.9178e-08, 2.4328e-08, 2.6077e-08, 2.8753e-08,
           2.9154e-08, 2.6456e-08, 2.3921e-08, 2.5833e-08, 2.5386e-08, 2.4493e-08,
           2.2474e-08, 2.2999e-08, 2.4693e-08, 2.8103e-08, 2.8947e-08, 2.7616e-08,
           3.1073e-08, 3.3474e-08, 2.6319e-08, 2.9014e-08, 2.7345e-08, 2.6188e-08,
           3.0943e-08, 3.1315e-08]])],
 None,
 [tensor([11]), tensor([12])])

Thanks - wg

Mmm, yes, it should probably only take the 0-th element when the predictions aren’t listy. I’d have to take a serious look into this to be sure.
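
Something along these lines, as pseudocode for that tweak (not actual fastai source):

    # only index with [0] when the decoded predictions aren't a list/tuple (i.e. a single target)
    if is_listy(dec_preds): res = dec_targ, dec_preds, preds
    else:                   res = dec_targ, dec_preds[0], preds[0]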
