I’m having trouble understanding the def forward(self, cats, conts): part of the movielens code. In the video lecture Jeremy says that cats / conts are there so the model can work with the ColumnarData object that we will be creating later, but how does PyTorch know what to retrieve, since we never explicitly pass those arguments in? How does PyTorch automatically know that cats refers to the categorical columns within the dataframe?
I tried digging into the source code but couldn’t find an explanation. Perhaps my understanding of how PyTorch runs def forward is incomplete. I would greatly appreciate it if someone could explain this to me. Thank you!
I’m also new to fastai, but here is my thinking anyway.
I don’t think PyTorch itself knows which data is categorical and which is continuous. The columns get those roles from how they are defined in ColumnarModelData:
def from_data_frame(cls, path, val_idxs, df, y, cat_flds, bs, is_reg=True, is_multi=False, test_df=None):
    ((val_df, trn_df), (val_y, trn_y)) = split_by_idx(val_idxs, df, y)
    return cls.from_data_frames(path, trn_df, val_df, trn_y, val_y, cat_flds, bs, is_reg, is_multi, test_df=test_df)
Here cat_flds names the columns that should be treated as categorical. The ColumnarModelData class inherits from ModelData, and, as far as I can tell, the dataset it builds on derives from PyTorch’s Dataset class.
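To make the idea concrete, here is a rough sketch in plain Python of how the argument names in forward could end up bound to the right columns. It assumes that the dataset returns each row as [cats, conts, y] and that the training loop calls the model with the inputs unpacked positionally. ToyColumnarDataset, ToyModule and ToyModel are hypothetical stand-ins I made up, not fastai or PyTorch classes, and the column names are invented. The only real PyTorch behaviour being imitated is that calling an nn.Module instance passes its arguments on to forward.

```python
# Toy stand-ins in plain Python. None of these classes come from fastai or
# PyTorch; they only imitate the mechanism discussed above.

class ToyColumnarDataset:
    """Splits each row into categorical and continuous parts using cat_flds."""

    def __init__(self, rows, y, cat_flds):
        self.rows, self.y, self.cat_flds = rows, y, cat_flds

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, idx):
        row = self.rows[idx]
        cats = [v for k, v in row.items() if k in self.cat_flds]
        conts = [v for k, v in row.items() if k not in self.cat_flds]
        # The order of this list decides which forward() argument gets what.
        return [cats, conts, self.y[idx]]


class ToyModule:
    """Imitates one nn.Module behaviour: calling the object hands all
    positional arguments to forward()."""

    def __call__(self, *args):
        return self.forward(*args)


class ToyModel(ToyModule):
    def forward(self, cats, conts):
        # "cats" and "conts" are just local names for the first and second
        # positional arguments; nothing inspects the dataframe here.
        return {"cats": cats, "conts": conts}


# Invented example rows: two categorical fields and one continuous field.
rows = [
    {"userId": 3, "movieId": 17, "age": 0.25},
    {"userId": 8, "movieId": 42, "age": 0.75},
]
ratings = [4.0, 2.5]

ds = ToyColumnarDataset(rows, ratings, cat_flds=["userId", "movieId"])
model = ToyModel()

outputs = []
for i in range(len(ds)):
    *xs, y = ds[i]              # xs is [cats, conts], y is the target
    outputs.append(model(*xs))  # equivalent to model.forward(cats, conts)
```

So in this sketch the model never "knows" which columns are categorical. The dataset decides that from cat_flds, and forward simply receives the pieces in the order the dataset produced them.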
I also don’t fully understand how these two types of data are processed differently, so someone else will need to clarify that.