How to get the mapping for multi-class target names

After I create the TabularPandas object, the categorical target variable I pass as y_names is converted to an integer index. How do I get the mapping from name to id? Here’s my best guess; can anyone confirm?

to = TabularPandas(df, procs, cat, cont, y_names=dep_var, splits=splits)

name_to_id = {name: i for i,name in enumerate(df.attack.unique())}

# {'normal': 0,
#  'neptune': 1,
#  'warezclient': 2,
#  'ipsweep': 3,
#  'portsweep': 4,
#  'teardrop': 5,
#  'nmap': 6,
#  'satan': 7,
#  'smurf': 8,
#  'pod': 9,
#  'back': 10,
#  'guess_passwd': 11,
#  'ftp_write': 12,
#  'multihop': 13,
#  'rootkit': 14,
#  'buffer_overflow': 15,
#  'imap': 16,
#  'warezmaster': 17,
#  'phf': 18,
#  'land': 19,
#  'loadmodule': 20,
#  'spy': 21,
#  'perl': 22}

The mappings for categorical columns live in to.procs.classes, and you should find your column attack in that dictionary:

to.procs.classes['attack']

If it’s for your dependent variable, IIRC it lives in to.categorize.classes or to.categorize.vocab instead.
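To show why the enumerate(df.attack.unique()) guess can disagree with the stored mapping, here is a minimal sketch in plain Python with no fastai dependency. It assumes the fitted vocab is the sorted list of unique labels, which is my understanding of the default behaviour but worth checking against your own to object. The labels list and the vocab variable are hypothetical stand-ins for df.attack and the vocab attribute mentioned above.

```python
# Hypothetical stand-in for df.attack; replace with the real target column.
labels = ["normal", "neptune", "warezclient", "normal", "ipsweep", "neptune"]

# Order-of-appearance mapping, equivalent to the guess in the question.
seen = list(dict.fromkeys(labels))  # unique values, first-seen order
guess = {name: i for i, name in enumerate(seen)}

# Assumption: the fitted vocab is the sorted unique labels.
# With a real TabularPandas object this would be something like:
#   vocab = list(to.categorize.vocab)
vocab = sorted(set(labels))

# Build the lookups in both directions from the vocab.
name_to_id = {name: i for i, name in enumerate(vocab)}
id_to_name = dict(enumerate(vocab))

print(guess)
print(name_to_id)
```

In this toy example 'normal' gets id 0 in the guess but id 2 in the vocab-based mapping, so it is safer to read the mapping off the object than to rebuild it from the DataFrame.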

Thanks, that’s different from what I had in the code above.