My question pertains to the following function:

    def most_by_mask(mask, mult):
        idxs = np.where(mask)[0]
        return idxs[np.argsort(mult * probs[idxs])[:4]]

Particularly the last line of the function. Though I understand the purpose of this function, I am struggling to understand why the last line wasn’t just:

    return np.argsort(mult * probs[idxs])[:4]

I think it is because `idxs` contains shuffled ids of the data (with the validation set split out).

Your line would give you the indices into `probs[idxs]`, that is, positions within the filtered subset rather than ids in the original data. You then apply those positions back to `idxs`, like a mask/filter, to get the original ids at those positions.
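A tiny sketch of that point, using made-up numbers (the `probs` and `mask` values here are hypothetical, not taken from the lesson data). It shows that `argsort` on the filtered array returns positions within the subset, and that indexing `idxs` with them recovers the original ids:

```python
import numpy as np

# Hypothetical toy data: six predictions, three of which pass the mask.
probs = np.array([0.9, 0.1, 0.8, 0.3, 0.7, 0.2])
mask = np.array([False, True, False, True, False, True])

idxs = np.where(mask)[0]               # original ids that pass the mask: [1 3 5]
order = np.argsort(probs[idxs])[:2]    # positions within probs[idxs]: [0 2]
top = idxs[order]                      # mapped back to original ids: [1 5]

print('idxs:', idxs)
print('order:', order)
print('top:', top)
```

Returning `order` alone would hand back `[0 2]`, which points at the wrong rows of the full `probs` array.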

I’m still confused.

Sorry, but I couldn’t follow your explanation.

I mean, if it were applied as a mask, then the last line should be:

return np.where(idxs==np.argsort(mult*probs[idxs])[:4])
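For what it's worth, here is a rough sketch of the difference between indexing with an integer array and filtering with a boolean mask, which may be where the word "mask" causes confusion. The ids and positions below are invented for illustration:

```python
import numpy as np

idxs = np.array([1000, 1001, 1003, 1007])  # hypothetical original ids
positions = np.array([2, 0])               # e.g. what argsort(...)[:2] might return

# Integer-array indexing: pick the elements at these positions, in this order.
picked = idxs[positions]                   # [1003 1000]

# A true boolean mask selects the same elements but cannot keep the sort order.
bool_mask = np.zeros(len(idxs), dtype=bool)
bool_mask[positions] = True
masked = idxs[bool_mask]                   # [1000 1003]

print('picked:', picked)
print('masked:', masked)
```

So `idxs[...]` in the original function is a lookup by position, not an equality test against the values in `idxs`.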

`print` is your friend!

In cases like this, if you don’t understand something, just insert a few print statements into the function and see if that helps. You can then compare the actual contents of the variables instead of reasoning about the theory behind how it works. At least for me, that helps most.

In this case, like this:

```
# Assumes numpy is imported as np and `probs` is defined in the enclosing scope.
def most_by_mask(mask, mult):
    idxs = np.where(mask)[0]
    print('idxs:', idxs)
    original = idxs[np.argsort(mult * probs[idxs])[:4]]
    print('original:', original)
    changed = np.argsort(mult * probs[idxs])[:4]
    print('changed:', changed)
    print(f'idxs at position {changed[0]}: {idxs[changed[0]]}')
    return original
```

If you then run the dogs example (it gets much clearer there), you will see that `idxs` doesn’t start at 0 but at 1000, and you get the following output from the function:

```
idxs: [1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1015 1016 1017 1018 1019 1020 1021
1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 ...
original: [1120 1392 1726 1613]
changed: [118 389 719 607]
idxs at position 118: 1120
```

I never thought of doing that. Thanks a lot, that really helped clear up my confusion.