Confusion about the source code of train_cats(df)

(Junlin) #1

Source code:

def train_cats(df):
    for n,c in df.items():
        if is_string_dtype(c): df[n] = c.astype('category').cat.as_ordered()

And the doc says: “Change any columns of strings…inplace.”

If so, shouldn’t the last line of the source code be:

df[n] = c.astype('category').cat.as_ordered(inplace=True)

instead of

df[n] = c.astype('category').cat.as_ordered()


(Kien Vu) #2

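In this case, n is the column name and c is the column's values. If you use inplace=True, c is updated in place and the call returns nothing, so there is no result to assign to df[n]. The "inplace" in the doc refers to df itself being modified, and the assignment to df[n] already does that.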
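To make the "inplace" in the doc concrete, here is a minimal sketch. The function body is the one quoted in post #1; the example DataFrame and its column names are made up for illustration. It shows that train_cats returns nothing and that the caller's df is the object that changes.

```python
import pandas as pd
from pandas.api.types import is_string_dtype

def train_cats(df):
    # Same logic as the source quoted in post #1: overwrite each
    # string column of df with an ordered categorical version.
    for n, c in df.items():
        if is_string_dtype(c):
            df[n] = c.astype('category').cat.as_ordered()

# Hypothetical data, only for this illustration.
df = pd.DataFrame({'size': ['small', 'large', 'medium'],
                   'price': [1.0, 3.0, 2.0]})

result = train_cats(df)

print(result)                  # the function has no return value
print(df['size'].dtype.name)   # the caller's df was modified
print(df['size'].cat.ordered)
print(df['price'].dtype)       # non-string columns are left alone
```

The plain assignment df[n] = ... is what makes the change "inplace" at the DataFrame level, so no inplace argument is needed on as_ordered.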

(Junlin) #3

Got it. Thanks! :grinning:

(Fabian R) #4

Not directly related to the original question, but is there a good reason to call .as_ordered()?