Suggest using `.diff()` and `.square()` where appropriate

Disclaimer: I’m quite new to Python/ML, so be kind if I’m suggesting something dumb 🙂

In the code for RNNRegularizer, the TAR is taken as the difference along dimension 1 with h[:,1:] - h[:,:-1].

After wrapping my head around what this actually did, I was curious to see whether it was any different from using the `.diff` method, i.e. `h.diff(dim=1)`.
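For anyone wanting to double-check, the two expressions produce the same tensor as far as I can tell; here's a minimal sanity check I ran (random tensor, illustrative shape only):

```python
import torch

# hypothetical hidden-state tensor (batch, seq_len, hidden) -- shape is just for illustration
h = torch.randn(4, 10, 8)

# forward difference along the sequence dimension, written both ways
manual = h[:, 1:] - h[:, :-1]
diffed = h.diff(dim=1)

assert torch.allclose(manual, diffed)
```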

On the CPU there's no difference, but on the GPU `.diff()` is slightly faster.

While I was at it, I also compared `.pow(2)` with `.square()`, wondering if there was some hardware-level acceleration for the latter. My tests show `.square()` is quite a bit faster.

Those timings come from comparing these three snippets, 100 runs each, on a GPU:

```python
res1 = (h[:, 1:] - h[:, :-1]).float().pow(2).mean()
res2 = h.diff(dim=1).float().pow(2).mean()
res3 = h.diff(dim=1).float().square().mean()
```
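For reference, my timing loop was roughly along these lines (the tensor shape is just illustrative, not the exact activations from my runs; the `synchronize` calls are there so GPU timings aren't skewed by asynchronous kernel launches):

```python
import time
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
# illustrative shape only (batch, seq_len, hidden)
h = torch.randn(64, 72, 1152, device=device)

def bench(fn, runs=100):
    # warm-up so one-off kernel launch costs aren't counted
    for _ in range(10):
        fn()
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    if device == "cuda":
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / runs

print("indexing + pow(2):", bench(lambda: (h[:, 1:] - h[:, :-1]).float().pow(2).mean()))
print("diff     + pow(2):", bench(lambda: h.diff(dim=1).float().pow(2).mean()))
print("diff     + square:", bench(lambda: h.diff(dim=1).float().square().mean()))
```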

IMO `.diff()` is easier on the eye/brain than the indexing method, with a mild performance benefit to boot.

And it seems like a global search/replace of `.pow(2)` with `.square()` might eke out some more ‘fast’ (and it reads nicer, says me).

But perhaps these operations are such a tiny part of any real training process that it’s not worth the effort. Or maybe this is just on my machine and not universal?


Great find! It’s definitely more readable, and likely faster on other people’s GPUs too. Would you be open to making a PR?

I’ll pass on the PR, sorry. I’ve got a full-on study schedule and the fastai codebase looks particularly unusual/daunting.
