Fastai v2 - DataBunch.create equivalent

petrhrobar · January 15, 2022, 8:45am

I found a nice repo that uses fastai to train custom models on custom time series data.

However, it uses an older version of fastai and thus some methods do not work.

I would like to ask how to replicate this small example: lstm-pytorch/flights-lstm.ipynb at master · master0fcha0s/lstm-pytorch · GitHub, in fastai v2.

# Custom dataloader

from torch import nn
from torch.utils.data import Dataset, DataLoader
class DS(Dataset):
    def __init__(self, X_train, y_train):
        self.X_train = X_train
        self.y_train = y_train

    def __len__(self):
        return len(self.y_train)

    def __getitem__(self, idx):
        data = self.X_train[idx, :]
        labels = self.y_train[idx, :]
        return data, labels

class Model(nn.Module):
    def __init__(self, input_size, hidden_size=100):
        super().__init__()

        self.lstm1 = nn.LSTM(input_size=input_size, hidden_size=hidden_size, num_layers=1, batch_first=True, bidirectional=False)
        self.lstm2 = nn.LSTM(hidden_size, hidden_size*2, num_layers=1, batch_first=True, bidirectional=False)
        self.fc = nn.Linear(hidden_size*2, 1)
            
    def forward(self,x):

        # print(x.shape)
        x, _ = self.lstm1(x)
        x, _ = self.lstm2(x)
        
        # print(x.shape)
        x = self.fc(x)
        
        # print(x.shape)
        return x


model = Model(input_size=len(train_cols))
db = DataBunch.create(DS(X_train, y_train), valid_ds=None, bs=8)
learn = Learner(db, model, loss_func=nn.MSELoss())

Then, calling

learn.lr_find()

does not work: 'Model' object has no attribute 'lr_find'
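
One detail worth noting is that the message names Model rather than Learner. A plausible explanation, assuming that fastai v2's Learner forwards attribute lookups it cannot resolve to its model and that lr_find is only attached to Learner when a separate module is imported, can be reproduced in plain Python. The Learner, Model, and add_lr_find names below are hypothetical stand-ins, not fastai code.

```python
class Model:
    """Stand-in for the nn.Module in the post; it has no lr_find method."""


class Learner:
    """Hypothetical stand-in that forwards unknown attributes to self.model."""

    def __init__(self, model):
        self.model = model

    def __getattr__(self, name):
        # Only reached when normal lookup on the Learner itself fails.
        return getattr(self.model, name)


def add_lr_find(cls):
    """Mimic a module that attaches lr_find to a class when it is imported."""
    def lr_find(self):
        return "lr_find called"
    cls.lr_find = lr_find


learn = Learner(Model())

# Before the method is attached, the lookup falls through to the model,
# so the error mentions Model even though lr_find was called on the Learner.
try:
    learn.lr_find()
except AttributeError as err:
    message = str(err)
    print(message)

# After the method is attached to the class, the same call succeeds.
add_lr_find(Learner)
print(learn.lr_find())
```

If this is what is happening, the error says little about the data objects and points instead at a missing import that would attach lr_find.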

zonkyo · February 9, 2022, 1:29pm

@petrhrobar Hey there,

DataBunch has been renamed to DataLoaders in fastai v2, since the DataLoader name is already used in other libraries and it is not a new concept.

Cheers!

petrhrobar · February 9, 2022, 2:49pm

Thanks.

But even when I modify it to

db = DataLoader(DS(X_train, y_train))
learn = Learner(db, model, loss_func=nn.MSELoss())
# learn.callbacks.append(SaveModelCallback(learn, name=f'model_fold_{index}'))
learn.lr_find()

I still get the same error. Essentially, I am asking how to recreate the notebook in the newest version of fastai: a custom data loader and a custom model inside a Learner object.

I even followed the tutorial here: Tutorial - Using fastai on a custom new task | fastai

from torch.utils.data import DataLoader
from fastai.data.core import DataLoaders
from fastai.learner import Learner

dataloader = DS(X_train, y_train)
dl = DataLoader(dataloader, batch_size=20)
dls = DataLoaders.from_dsets(dl)

learn = Learner(dls, model, loss_func=nn.MSELoss())
# learn.callbacks.append(SaveModelCallback(learn, name=f'model_fold_{index}'))

learn.lr_find()
learn.recorder.plot()

and I still get the same problem.

Many thanks in advance for your help.

zonkyo · February 9, 2022, 3:51pm

So, I am speaking only from my own experience, which was more like “hm… this error means I forgot something”, so bear with me!

lr_find does not itself need a second dataset, but it still expects the Learner to hold a DataLoaders object. The DS you are using provides only one set, so you need to add a second set as the test (validation) set.

Based on the code in the linked notebook,

X_train = np.array(list(train_df.groupby('group').apply(lambda x: x[train_cols].values))).astype(np.float32)
y_train = np.array(list(train_df.groupby('group').apply(lambda x: x[test_cols].values))).astype(np.float32)

you need to build X_test and y_test the same way, and then you can create the appropriate DataLoaders object with

dls = DataLoaders.from_dsets(DS(X_train, y_train), DS(X_test, y_test))
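
The step of building X_test and y_test could be sketched as follows. This is only an illustration on a made-up toy frame: the column names, the to_sequences helper, and the hold-out rule are hypothetical, and a list comprehension over groupby is used in place of the notebook's apply call.

```python
import numpy as np
import pandas as pd


def to_sequences(df, cols, group_col="group"):
    """Stack each group's rows into one (n_groups, seq_len, n_cols) float32 array."""
    # Assumes every group has the same number of rows; np.stack raises otherwise.
    arrays = [g[cols].to_numpy() for _, g in df.groupby(group_col)]
    return np.stack(arrays).astype(np.float32)


# Hypothetical toy frame: 4 groups with 3 time steps each.
df = pd.DataFrame({
    "group": np.repeat(np.arange(4), 3),
    "feat_a": np.arange(12, dtype=float),
    "feat_b": np.arange(12, dtype=float) * 2,
    "target": np.arange(12, dtype=float) + 0.5,
})
train_cols, test_cols = ["feat_a", "feat_b"], ["target"]

# Hold out the last group for validation (an arbitrary choice for illustration).
train_df = df[df["group"] < 3]
test_df = df[df["group"] >= 3]

X_train, y_train = to_sequences(train_df, train_cols), to_sequences(train_df, test_cols)
X_test, y_test = to_sequences(test_df, train_cols), to_sequences(test_df, test_cols)
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)
```

Splitting by group keeps each sequence intact in either the training or the validation arrays.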

This worked for me with a completely different dataset, and it should work for the flights example as well.

muellerzr · February 9, 2022, 11:02pm

You should read this tutorial. You’re probably missing a needed import: