vikassb
(Vikas Singh Bhadouria)
July 5, 2019, 4:07am
1
We have a numerical dataset.
Sample:
We then tried to create a tabular DataBunch from it.
Code:
data = (TabularList.from_df(df, cont_names=cont_names)
        .split_subsets(train_size=0.8, valid_size=0.2, seed=42)
        .label_from_df(cols=dep_var, label_cls=FloatList)
        .databunch())
We got the DataBunch as expected.
When we look at x and y for one batch of data:
As you can see, our x contains an extra tensor of zeros.
As a result, when we try to use this DataBunch in a neural network, we get an error:
Error: forward() takes 2 positional arguments but 3 were given
Please explain why this happens and how to correct our DataBunch.
muellerzr
(Zachary Mueller)
July 5, 2019, 4:14am
2
Aren’t you trying to show a batch via the dataframe? If so you want .show_batch(). Else when we create a databunch for our Learner, everything is in tensors, hence what we are seeing. Am I understanding your question correctly?
vikassb
(Vikas Singh Bhadouria)
July 5, 2019, 5:32am
3
Hi @muellerzr,
Thanks for the reply.
I updated the question, please take a look. I hope I have explained my issue more clearly this time.
TabularList is made to work with a tabular model that always expects two inputs (one categorical, one continuous), which is why you have this tensor of zeros (corresponding to the categorical input). Just adjust your model to ignore that first tensor, or use a custom ItemList (it's very likely that FloatList as inputs would work).
This tabular thing will be more polished in v2, but for now those are the workarounds I can think of.
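The first workaround can be sketched in plain PyTorch. As far as I understand fastai v1, the training loop unpacks the batch inputs and passes each tensor to the model as its own argument, so a TabularList batch reaches forward() as two arguments (categorical, continuous); a forward(self, x) then fails with "takes 2 positional arguments but 3 were given". The class name ContOnlyNet, the layer sizes, and the fake batch below are assumptions for illustration, not fastai's own tabular model:

```python
import torch
import torch.nn as nn

class ContOnlyNet(nn.Module):
    """Hypothetical model: accepts both tabular inputs, uses only the continuous one."""
    def __init__(self, n_cont, n_hidden=16):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(n_cont, n_hidden),
            nn.ReLU(),
            nn.Linear(n_hidden, 1),
        )

    def forward(self, x_cat, x_cont):
        # x_cat is the placeholder tensor of zeros; it is deliberately ignored.
        return self.layers(x_cont.float()).squeeze(-1)

# Fake batch standing in for one TabularList batch: 4 rows, 5 continuous columns.
x_cat = torch.zeros(4, dtype=torch.long)   # stand-in for the extra tensor of zeros
x_cont = torch.randn(4, 5)

model = ContOnlyNet(n_cont=5)
out = model(x_cat, x_cont)                 # two inputs, so forward needs two parameters
print(out.shape)
```

The only change relative to an ordinary single-input model is the extra, unused x_cat parameter in forward().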
vikassb
(Vikas Singh Bhadouria)
July 8, 2019, 5:01am
5
Thanks @sgugger.
Even if I ignore the tensor of zeros, how do I pass the data through Conv1d, since it needs 3D input
and the DataBunch gives us tuples?
I tried to create a DataBunch using the following code:
from torch.utils.data import Dataset
from fastai.basic_data import DataBunch  # assuming fastai v1

class ArrayDataset(Dataset):
    "Sample numpy array dataset"
    def __init__(self, x, y):
        self.x, self.y = x, y
        self.c = 2  # binary label
    def __len__(self):
        return len(self.x)
    def __getitem__(self, i):
        return self.x[i], self.y[i]

# Features are every column except the last; the last column is the target
x = df.iloc[:, :-1]
Y = df.iloc[:, -1]

# Split training / validation: the first 80% of rows go to training (no shuffling)
training_size = int(0.8 * x.shape[0])
training_datas = x.iloc[:training_size, :]
training_labels = Y.iloc[:training_size]
validation_datas = x.iloc[training_size:, :]
validation_labels = Y.iloc[training_size:]

# Convert the pandas objects to NumPy arrays
train = training_datas.to_numpy()
Y = training_labels.to_numpy()
test_Y = validation_labels.to_numpy()
test = validation_datas.to_numpy()

train_ds, valid_ds = ArrayDataset(train, Y), ArrayDataset(test, test_Y)
data = DataBunch.create(train_ds, valid_ds, bs=60, num_workers=1)
The generated DataBunch yields 2D batches (as expected), but Conv1d needs 3D input.
Even after reshaping, the issue remains,
with this error:
TypeError: conv1d(): argument 'input' (position 1) must be Tensor, not tuple
How can we convert that DataBunch so that its batches can be passed to Conv1d?
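One possible approach is to leave the DataBunch 2D and add the channel dimension inside the model, since nn.Conv1d expects input of shape (batch, channels, length). The sketch below is a guess at what would help, not a confirmed fix: the failing call is not shown in the thread, and the TypeError suggests that a whole tuple or list, rather than a single tensor, reached conv1d. The class name Conv1dRegressor, the layer sizes, and the random batch are all hypothetical:

```python
import torch
import torch.nn as nn

class Conv1dRegressor(nn.Module):
    """Hypothetical model: treats each row of features as a 1-channel sequence."""
    def __init__(self, n_features, n_filters=8):
        super().__init__()
        # kernel_size=3 with padding=1 keeps the sequence length unchanged
        self.conv = nn.Conv1d(in_channels=1, out_channels=n_filters,
                              kernel_size=3, padding=1)
        self.head = nn.Linear(n_filters * n_features, 1)

    def forward(self, x):
        # x must be a single tensor here, not an (x, y) tuple
        x = x.float().unsqueeze(1)      # (batch, features) -> (batch, 1, features)
        x = torch.relu(self.conv(x))    # (batch, n_filters, features)
        return self.head(x.flatten(1)).squeeze(-1)  # (batch,)

# Random stand-in for one 2D batch: 60 rows (bs=60), 10 feature columns.
xb = torch.randn(60, 10)
model = Conv1dRegressor(n_features=10)
out = model(xb)
print(out.shape)
```

When pulling a batch by hand, unpack it first (for example `xb, yb = next(iter(dataloader))`) and pass only xb to the model; handing the whole (xb, yb) tuple to the model would produce an error like the one above.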
remapears
(Aymane Ahajjam)
January 15, 2020, 5:00pm
6
Did you find a solution to this problem?
vikassb
(Vikas Singh Bhadouria)
January 21, 2020, 11:21am
7
Hi @remapears,
The issue with TabularList is that it always expects two inputs (categorical and continuous).
As suggested by sgugger above, we used a custom ItemList.