Tips for training on a large tabular dataset?

I was able to get it to train! The problem was the embedding sizes: one of my columns had almost 8 million unique values. I guess the embedding size is no longer set by default to min(50, (unique_val+1)/2), as it was in prior fastai versions, because of learner reloading issues. See: Loading saved TabularModel fails due to embeddings
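To give a feel for why the cap matters, here is a rough sketch in plain Python (no fastai calls). The helper names `capped_emb_size` and `emb_param_count` are made up for illustration, the 8,000,000 figure is a round stand-in for "almost 8 million", and the uncapped size of (unique_val+1)/2 is only an assumed comparison point, not necessarily what any fastai version actually used.

```python
def capped_emb_size(n_unique, cap=50):
    """Heuristic quoted above: min(cap, (n_unique + 1) / 2), with integer division."""
    return min(cap, (n_unique + 1) // 2)


def emb_param_count(n_unique, emb_size):
    """Weights in an embedding table with one row per unique value.

    Real models may add an extra row for missing/unknown values; ignored here.
    """
    return n_unique * emb_size


# Illustrative cardinality, standing in for the "almost 8 million" column.
n_unique = 8_000_000

capped = capped_emb_size(n_unique)
uncapped = (n_unique + 1) // 2  # assumed comparison: same rule without the cap

print(f"capped size:   {capped:>12,} -> {emb_param_count(n_unique, capped):,} weights")
print(f"uncapped size: {uncapped:>12,} -> {emb_param_count(n_unique, uncapped):,} weights")
```

Even with the cap, a column like this still needs a few hundred million weights for its embedding alone, so it may be worth passing an explicit, smaller size for that column (I believe `tabular_learner` accepts an `emb_szs` dict for this, but check the signature in your version) or reducing the cardinality before training.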
