Need help understanding error: the label [#% increased Armour_na] is not in the [columns]

I’m working with a dataset built by combining two data frames before feeding it into the data block API. Because I’m combining the two, most columns have a sizable number of NaN values. When I try to run this:

dep_var = 'y'
cat_vars = []
cont_vars = []

for i in list(both_raw.columns):
    if both_raw[i].nunique() < 50:
        cat_vars.append(i)
    else:
        if i != dep_var:
            cont_vars.append(i)
        
procs = [Normalize, Categorify, FillMissing]

data = (TabularList.from_df(both_raw, cat_names=cat_vars, cont_names=cont_vars, procs=procs)
                   .random_split_by_pct(valid_pct=.2, seed=456)
                   .label_from_df(col=dep_var, label_cls=FloatList)
                   .databunch())

I get the following error:

KeyError: 'the label [#% increased Armour_na] is not in the [columns]'

This is a column that one of the procs is adding, so I’m not sure how to resolve it. Let me know what else would be helpful to post here.

I’m having the same issue. Did you ever find a fix?

Nope, but to be fair, I haven’t worked on that project in a while :wink:
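
One thing that may be worth checking is the order of the procs. My guess, which I have not confirmed against the data in this thread, is that FillMissing registers the new `_na` column name while processing the training set, and Categorify, which runs before it in the list above, then looks for that column in the validation set before FillMissing has created it there. Below is a toy model of that suspected mechanism in plain Python. It is not fastai's actual code: `fill_missing`, `categorify`, `run`, and the `armour` column are hypothetical stand-ins made up for illustration.

```python
import math


def fill_missing(table, cat_names, cont_names, state, train):
    """Toy fill-missing step: adds a '<col>_na' flag for continuous columns with gaps."""
    if train:
        # Remember which columns had missing values in the training data.
        state["na_cols"] = [c for c in cont_names
                            if any(math.isnan(v) for v in table[c])]
    for c in state["na_cols"]:
        flag = c + "_na"
        table[flag] = [math.isnan(v) for v in table[c]]
        table[c] = [0.0 if math.isnan(v) else v for v in table[c]]
        if flag not in cat_names:
            cat_names.append(flag)  # mutates the list shared by every step


def categorify(table, cat_names, cont_names, state, train):
    """Toy categorify step: touches every column currently listed in cat_names."""
    for c in cat_names:
        _ = table[c]  # KeyError if the column has not been created yet


def run(procs):
    """Apply each step to a tiny training table, then to a tiny validation table."""
    cat_names, cont_names = [], ["armour"]
    train = {"armour": [1.0, float("nan")]}
    valid = {"armour": [2.0, 3.0]}
    states = [{} for _ in procs]
    for proc, st in zip(procs, states):
        proc(train, cat_names, cont_names, st, True)
    for proc, st in zip(procs, states):
        proc(valid, cat_names, cont_names, st, False)
    return sorted(valid)


# Categorify before fill-missing: the validation pass asks for 'armour_na' too early.
try:
    run([categorify, fill_missing])
except KeyError as err:
    print("KeyError:", err)

# Fill-missing first: the flag column exists by the time categorify needs it.
print(run([fill_missing, categorify]))
```

If that is what is happening here, putting FillMissing first, i.e. procs = [FillMissing, Categorify, Normalize], might make the error go away. Separately, note that the loop above only excludes dep_var from cont_vars, so if 'y' ever has fewer than 50 unique values it would end up in cat_vars.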