Trying to wrap my head around this line of code …
[attrgetter('text'), Tokenizer.from_df(txt_cols), Numericalize(vocab=lm_vocab)]
Typically, when I read the transforms pipeline, I read it as, “apply this transform, and then this transform on the result of the previous transform, etc…”
Here, though, it looks as if attrgetter('text') operates on the result of Tokenizer.from_df(txt_cols), rather than vice versa. That feels odd to me.
Is there a better way to read the above pipeline?
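To make concrete what I mean by reading it left to right, here is a minimal sketch in plain Python. It is not fastai's implementation: `compose`, `tokenize`, `numericalize`, `vocab`, and `row` are made-up stand-ins that only illustrate the "each transform runs on the previous result" reading.

```python
from functools import reduce
from operator import attrgetter
from types import SimpleNamespace


def compose(*funcs):
    """Apply funcs left to right, so compose(f, g)(x) == g(f(x))."""
    return lambda x: reduce(lambda acc, f: f(acc), funcs, x)


# Hypothetical stand-ins, NOT the fastai Tokenizer / Numericalize classes.
tokenize = str.split
vocab = {"hello": 0, "world": 1}
numericalize = lambda tokens: [vocab[t] for t in tokens]

# A fake "row" object with a .text attribute, like a DataFrame row.
row = SimpleNamespace(text="hello world")

# Literal left-to-right reading: get .text, then tokenize, then numericalize.
pipeline = compose(attrgetter("text"), tokenize, numericalize)
result = pipeline(row)
print(result)
```

Under this reading, attrgetter('text') runs first on the raw item and the tokenizer sees its output, which is why the order in the list above confuses me.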