Trying to wrap my head around this line of code …
[attrgetter('text'), Tokenizer.from_df(txt_cols), Numericalize(vocab=lm_vocab)]
Typically, when I read the transforms pipeline, I read it as, “apply this transform, and then this transform on the result of the previous transform, etc…”
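To make that mental model concrete, here is a minimal plain-Python sketch of what I mean by left-to-right reading. It is not fastai's actual Pipeline implementation; `compose_left_to_right`, `toy_vocab`, and the toy steps are made-up stand-ins for illustration only.

```python
from functools import reduce
from operator import attrgetter
from types import SimpleNamespace

def compose_left_to_right(funcs):
    """Return a function that feeds x through funcs in list order."""
    return lambda x: reduce(lambda acc, f: f(acc), funcs, x)

# Toy stand-ins (hypothetical), NOT the real fastai transforms.
toy_vocab = {"hello": 0, "world": 1}
toy_pipeline = compose_left_to_right([
    attrgetter("text"),                         # row -> raw string
    str.split,                                  # raw string -> list of tokens
    lambda toks: [toy_vocab[t] for t in toks],  # tokens -> integer ids
])

row = SimpleNamespace(text="hello world")
result = toy_pipeline(row)
print(result)  # [0, 1]
```

In this reading, each step only ever sees the output of the step listed before it.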
But attrgetter('text') operates on the result of Tokenizer.from_df(txt_cols) … rather than vice-versa. It feels funny/odd to me.
Is there a better way to read the above pipeline?