# Tabular data - Kaggle competitions

Hi all

I want to train on tabular data with DL. I have already tried a few Kaggle competitions: Rossmann, Titanic, and Santander.

I am looking for recommendations for other competitions that use tabular data (and fast.ai kernels for them as a reference, if you have any).

Thank you!
Offir

Hey Offir - how did you go on the Santander data set?
There are heaps of tabular data sets around. Have a look at the data sets available on Kaggle. I don't think there are many tabular competitions, though.

Hi Kieran

This is the link for Santander: https://www.kaggle.com/c/santander-customer-transaction-prediction

If you can recommend more datasets, that would be great.

Hey, I’m working on a similar problem, maybe you all could help me figure this out: I’m trying to use a tabular learner on a wide dataset where all the values are continuous. I keep getting a “divide by zero” error. I think it’s due to the way I’m creating my databunch, but I’m not sure what exactly I’m doing wrong.

```
import numpy as np  # linear algebra
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)
from fastai.tabular import *

import os
print(os.listdir("../input"))
path = Path('../input')
```
```
procs = [Normalize, FillMissing]
dep_var = 'target'
# valid_idx = range(len(df)-50, len(df))
df.drop(columns='id', inplace=True)

columns = df.columns

ids_test = test['id']
test.drop(columns='id', inplace=True)
```
```
test_list = TabularList.from_df(test, procs=procs)
data = (TabularList.from_df(df, procs=procs)
        # .split_by_idx(valid_idx=valid_idx)
        .split_by_rand_pct(.2)
        .label_from_df(cols=dep_var)
        .databunch(path='.'))

learn = tabular_learner(data, layers=[200,100], metrics=AUROC())
```

This gives the error:

```
/opt/conda/lib/python3.6/site-packages/torch/nn/init.py in kaiming_uniform_(tensor, a, mode, nonlinearity)
    288     fan = _calculate_correct_fan(tensor, mode)
    289     gain = calculate_gain(nonlinearity, a)
--> 290     std = gain / math.sqrt(fan)
    291     bound = math.sqrt(3.0) * std  # Calculate uniform bounds from standard deviation
```

Hey @Stephen_F, the error might be because you are not telling the model which columns contain continuous values and which ones contain categorical values. You might want to add a line like `cont_names = ['age', 'gender', ...]`, or, since all your columns are continuous, you can write `cont_names = [c for c in df.columns if c != dep_var]` so that the target itself is not used as an input. Then pass that as an argument when you create the list, like this: `TabularList.from_df(df, cont_names=cont_names, procs=procs)`.
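
To make the suggestion above a bit more concrete, here is a minimal sketch in plain Python (no fastai needed). It assumes the division by zero comes from `fan` being 0, which is what you would expect if no input columns are declared and the first layer ends up with zero inputs. The helper names `continuous_columns` and `kaiming_std`, and the column names `var_0`, `var_1`, `var_2`, are made up for illustration; `kaiming_std` only mirrors the one line shown in the traceback, not the full PyTorch function.

```python
import math


def continuous_columns(columns, dep_var, drop=()):
    """Return every column except the dependent variable and any dropped ones."""
    excluded = {dep_var, *drop}
    return [c for c in columns if c not in excluded]


def kaiming_std(fan, gain=1.0):
    """Mirror the failing traceback line: std = gain / math.sqrt(fan)."""
    return gain / math.sqrt(fan)


# Hypothetical column names standing in for the wide dataset in the post.
columns = ['id', 'target', 'var_0', 'var_1', 'var_2']
cont_names = continuous_columns(columns, dep_var='target', drop=['id'])
print(cont_names)

# With no feature columns declared, the first layer would see zero inputs,
# so fan is 0 and the division fails.
try:
    kaiming_std(fan=0)
except ZeroDivisionError as err:
    print('fan=0 fails:', err)

# With the continuous columns declared, fan is non-zero and std is finite.
print(kaiming_std(fan=len(cont_names)))
```

In the notebook, the resulting `cont_names` list would then go into `TabularList.from_df(df, cont_names=cont_names, procs=procs)`. Since `id` is already dropped from `df` there, only the `target` column needs to be excluded.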