TabularPandas.classes always includes #na#

I have constructed a TabularPandas object named to from a DataFrame without missing values. I can confirm from to.xs that the underlying values do not contain any NaNs. When I look at to.classes, however, #na# is included in the list of levels for every categorical column.

Is this expected behaviour?

It is! #na# is a placeholder for any categorical values that show up in your validation or test data but weren't present in your training set.

FillMissing only applies to continuous variables. Categorify is what adds #na# to the classes.
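To make the idea concrete, here is a minimal sketch in plain Python of a vocabulary with a reserved placeholder slot. This is not fastai's actual code; the helper names build_classes and encode are made up for illustration, and the assumption is simply that index 0 is reserved for #na# so that unseen or missing values have somewhere to go.

```python
NA_TOKEN = "#na#"  # placeholder level, assumed to sit at index 0

def build_classes(train_values):
    """Build the level list for one categorical column.

    The placeholder is always added, even if the training data
    has no missing values (hypothetical helper, not a fastai API).
    """
    return [NA_TOKEN] + sorted(set(train_values))

def encode(values, classes):
    """Map each value to its index; unseen or missing values map to 0."""
    lookup = {c: i for i, c in enumerate(classes)}
    return [lookup.get(v, 0) for v in values]

train = ["red", "green", "blue", "green"]
classes = build_classes(train)

train_codes = encode(train, classes)                       # no zeros: nothing missing
valid_codes = encode(["green", "purple", None], classes)   # unseen and missing map to 0
```

So the training codes never use index 0 when nothing is missing, but the slot still shows up in the class list, which matches what you see in to.classes.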

Got it, thanks!