class ColumnarDataset(Dataset):
    def __init__(self, cats, conts, y):
        n = len(cats[0]) if cats else len(conts[0])
        self.cats = np.stack(cats, 1).astype(np.int64) if cats else np.zeros((n,1))
        self.conts = np.stack(conts, 1).astype(np.float32) if conts else np.zeros((n,1))
        self.y = np.zeros((n,1)) if y is None else y #y[:,None]
The last line … converts y to a column vector, which works great for regression but not for classification. Before I submit a PR, I just want to make sure I'm not breaking anything, since I don't fully understand how this change would affect code elsewhere.
Ah, well spotted. I needed to make it a column vector because otherwise, when it is indexed, it creates a scalar of the wrong type. So you should test that any PR still works with the existing lessons.
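To illustrate the indexing behaviour described above, here is a minimal NumPy-only sketch. The array values are made up for the example, and it only shows the shape difference; it does not reproduce whatever the lessons were doing downstream.

```python
import numpy as np

# Example targets (made-up values for illustration).
y = np.array([0.5, 1.5, 2.5], dtype=np.float32)

# Indexing a 1-D array returns a NumPy scalar with no dimensions.
flat_item = y[0]

# Indexing a column vector of shape (n, 1) returns a 1-element array instead.
col_item = y[:, None][0]

flat_shape = np.shape(flat_item)  # no dimensions
col_shape = col_item.shape        # one dimension of length 1
print(type(flat_item), flat_shape)
print(type(col_item), col_shape)
```

A dataset's `__getitem__` that returns `y[idx]` therefore hands back a bare scalar in the first case and a small array in the second, which is the difference the column vector was meant to paper over.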
I tested both a regression example and a multi-class classification example I worked up, and everything worked with just the "y". If there are lessons that were blowing up without the current column-vector form, knowing which ones would help me make sure I don't break them again.
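One possible way to support both cases is sketched below. `prepare_targets` and its `is_regression` flag are hypothetical names invented for this example, not part of the library, and the assumption is that classification losses want a 1-D array of integer class indices while regression wants a float column vector.

```python
import numpy as np

def prepare_targets(y, n, is_regression):
    """Hypothetical helper: shape targets by task type."""
    if y is None:
        # No targets given (e.g. a test set): use a placeholder column of zeros.
        return np.zeros((n, 1), dtype=np.float32)
    y = np.asarray(y)
    if is_regression:
        # Regression: float column vector of shape (n, 1).
        return y.astype(np.float32).reshape(-1, 1)
    # Classification: 1-D array of integer class indices, shape (n,).
    return y.astype(np.int64).reshape(-1)

reg_targets = prepare_targets([1.0, 2.0, 3.0], 3, is_regression=True)
cls_targets = prepare_targets([0, 2, 1], 3, is_regression=False)
print(reg_targets.shape, cls_targets.shape)
```

Whether a flag like this is the right fix depends on how the existing lessons consume `self.y`, so it would still need to be checked against them.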