Hey,

I am currently redoing the translation part and am trying to do it in fastai v1. I want to train an Image caption generator and I want to start by training a seq2seq auto encoder. The model is supposed to learn to map a sentence into itself.

I have troubles creating the Dataset in a way that the function `pad_allocate`

accepts a batch.

The dataset was created the following way in the fastai v0.7 translate notebook:

```
def A(*a):
"""convert iterable object into numpy array"""
return np.array(a[0]) if len(a)==1 else [np.array(o) for o in a]
class Seq2SeqDataset(Dataset):
def __init__(self, x, y):
self.x, self.y = x, y
def __getitem__(self, idx):
return A(self.x[idx], self.y[idx])
def __len__(self):
return len(self.x)
```

**Problem:**

```
trn_dl = DataLoader(dataset=trn_ds, batch_size=bs, sampler=trn_sampler, collate_fn=pad_collate)
batch = next(iter(trn_dl))
```

fails with:

*can’t convert np.ndarray of type numpy.object_. The only supported types are: double, float, float16, int64, int32, and uint8.*

I have tried to break my problem down:

`trn_ds`

is such a dataset and `[trn_ds[0]]`

gives:

```
[[array([ 9, 411, 1019, 700, 498, 1]),
array([ 9, 411, 1019, 700, 498, 1])]]
```

And `pad_collate([trn_ds[0]], pad_idx=1, pad_first=False)`

gives;

```
(tensor([[ 9, 411, 1019, 700, 498, 1]]),
tensor([[ 9, 411, 1019, 700, 498, 1]]))
```

So far so good.

However, if I try to pass more than one sentence to `pad_collate`

it fails with the same error message as above:

`[trn_ds[0], trn_ds[2]]`

is:

```
[[array([ 9, 411, 1019, 700, 498, 1]),
array([ 9, 411, 1019, 700, 498, 1])],
[array([ 51, 4386, 68, 193, 12, 107, 11, 9, 2768, 1]),
array([ 51, 4386, 68, 193, 12, 107, 11, 9, 2768, 1])]]
```

And `pad_collate([trn_ds[0], trn_ds[2]], pad_idx=1, pad_first=False)`

gives the same error:

*can’t convert np.ndarray of type numpy.object_. The only supported types are: double, float, float16, int64, int32, and uint8.*

This line causes the error:

`tensor(np.array([trn_ds[0][1], trn_ds[1][1]]))`

Ok, it is kind of obvious that this can’t be converted to a tensor. But how do I have to construct the input to `pad_collate`

so that it accepts a batch?

Thanks in advance!

F