Help solving multicategory tabular Kaggle challenge

I’m trying to use fastai to compete on

I believe I’ve managed to create the required DataBlock, but I’m unable to create a learner.
Here’s what I’ve done so far:

!pip install -Uqq fastbook

import fastbook


from fastbook import *

#Load files from drive

from google.colab import drive

#Mount Drive so the paths below exist
drive.mount('/content/drive')


train  = pd.read_csv('/content/drive/My Drive/train_features.csv')

test =  pd.read_csv('/content/drive/My Drive/test_features.csv')

targets = pd.read_csv('/content/drive/My Drive/train_targets_scored.csv')

#Remove unused ID column

X_t = train.drop('sig_id',axis=1)

X_v = test.drop('sig_id',axis=1)

y_t = targets.drop('sig_id',axis=1)


X_t.cp_type = X_t.cp_type.apply(lambda x: 1 if x=='trt_cp' else 0)

X_v.cp_type = X_v.cp_type.apply(lambda x: 1 if x=='trt_cp' else 0)

X_t.cp_dose = X_t.cp_dose.apply(lambda x: 1 if x=='D1' else 0)

X_v.cp_dose = X_v.cp_dose.apply(lambda x: 1 if x=='D1' else 0)

X_t.cp_time = X_t.cp_time/72

X_v.cp_time = X_v.cp_time/72

#Preparation for DataBlock

Xy_t = pd.concat([X_t,y_t],axis=1)

#Create Datablock

def get_x(r):

  return tensor(r[0:X_t.shape[1]])

def get_y(r):

  #Targets start right after the last feature column (a +1 here would skip the first target)
  return tensor(r[X_t.shape[1]:])

dblock = DataBlock( get_x= get_x,get_y = get_y)

dls = dblock.datasets(Xy_t)

I’ve tried many ways to create a learner but failed every time. Here’s an example:


AttributeError Traceback (most recent call last)
in ()
----> 1 learn=tabular_learner(dls)

3 frames
/usr/local/lib/python3.6/dist-packages/fastcore/ in gather_attrs(o, k, nm)
163 att = getattr(o,nm)
164 res = [t for t in att.attrgot(k) if t is not None]
--> 165 if not res: raise AttributeError(k)
166 return res[0] if len(res)==1 else L(res)

AttributeError: train_ds

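The Learner expects an instance of the DataLoaders class, and you are passing a Datasets object: dblock.datasets(...) returns Datasets, whereas dblock.dataloaders(...) returns DataLoaders. That is why the lookup of train_ds fails.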

Also, fastai’s tabular module does not use the DataBlock API. You should look into TabularPandas, or at the few examples that use TabularPandas in Kaggle kernels for that competition.