# Lesson 6: `one_batch()` returns 3 tensors on a custom dataset

I’m trying to apply the Rossmann approach to my own problem: predicting the total profit of a business on any given day.

I started with a very simple dataframe but whenever I check the first batch using `dls.one_batch()` I get 3 tensors instead of 2. The x tensor is my dataframe, the y tensor is empty and the z tensor is the total profit on the given day.

My initial dataframe has just 2 columns: “date” and “total” (total profit for the day).

I call `add_datepart` and then include all the newly added columns in the list of categorical variables. I end up with no continuous variables, since the only continuous column is `total`, which is the dependent variable. When I call `one_batch()` I get the following:

``````
# This looks right
x tensor([[ 4,  4,  2,  1, 25, 18,  1,  1,  1,  1,  1,  1],
[ 4,  7,  1,  2,  7, 27,  1,  1,  1,  1,  1,  1],
[ 4,  8,  1,  2, 14, 32,  1,  1,  1,  1,  1,  1],
[ 5,  9,  1,  2, 22, 38,  1,  1,  1,  1,  1,  1],
[ 3,  7,  1,  2,  6, 26,  1,  1,  1,  1,  1,  1],
....
``````
``````
# I can't figure out why this is happening
y tensor([], size=(20, 0))
``````
``````
# This should be the y tensor
z tensor([[22437.4707],
[22872.96],
[51414.55],
[ 6833.65],
[898375],
[343409.35],
``````

As far as I know, I should get only 2 tensors: x for the input and y for the output. I can’t figure out why I’m getting a third one, and I suspect it is behind my weird training results (my predictions are negative and the loss doesn’t decrease; in fact, it almost overflows).
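A likely explanation, offered as a sketch rather than a confirmed diagnosis: fastai's tabular loaders yield each batch in three parts (categorical inputs, continuous inputs, targets), so when `cont_names` is empty the middle part has zero columns. The plain-Python snippet below only imitates that layout with lists; `make_batch` and the sample rows are hypothetical and not part of fastai.

```python
# Illustrative only: mimics the (cats, conts, targets) layout of a tabular batch
# using plain lists. `make_batch` is a hypothetical helper, not a fastai function.


def make_batch(rows, cat_names, cont_names, y_name):
    """Split each row dict into categorical, continuous, and target parts."""
    cats = [[row[c] for c in cat_names] for row in rows]
    conts = [[row[c] for c in cont_names] for row in rows]
    targets = [[row[y_name]] for row in rows]
    return cats, conts, targets


# Made-up rows with two categorical columns and the target.
rows = [
    {"Dayofweek": 4, "Month": 2, "total": 100.0},
    {"Dayofweek": 1, "Month": 7, "total": 250.5},
]

# No continuous columns, as in the post.
cats, conts, targets = make_batch(rows, ["Dayofweek", "Month"], [], "total")

print(len(cats), len(cats[0]))    # number of rows, number of categorical columns
print(len(conts), len(conts[0]))  # number of rows, zero continuous columns
print(targets)
```

Under that reading, the three values from `dls.one_batch()` would be the categoricals, the (empty) continuous block, and the targets, and the `size=(20, 0)` tensor would be expected rather than a bug.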

My code looks like this…

``````
...
add_datepart(df, "date")
# Drop "Elapsed" to simplify things, we already have a date column
df.drop(["Elapsed"], axis=1, inplace=True)
cat_names = [
    "Dayofweek",
    "Week",
    "Year",
    "Month",
    "Day",
    "Dayofyear",
    "Is_month_end",
    "Is_month_start",
    "Is_quarter_end",
    "Is_quarter_start",
    "Is_year_end",
    "Is_year_start",
]
cont_names = [
    col for col in df.columns if col not in ["date", "total"] + cat_names
]  # cont_names is effectively empty

# At this point cat_names has 12 columns; df has 13 columns including "total"

# train is a subset of the original df; the rows after it form the validation set
valid = list(range(len(train), len(df)))

dls = TabularDataLoaders.from_df(
    df,
    path=".",
    procs=[Categorify, FillMissing],
    cat_names=cat_names,
    cont_names=cont_names,
    y_names="total",
    bs=20,
    valid_idx=valid,
)

x, y, z = dls.one_batch()
print("x", x)
print("y", y)
print("z", z)
``````
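For readers unfamiliar with the columns in `cat_names`, here is a rough standard-library approximation of what each date-derived feature holds. It is not fastai's `add_datepart` implementation; `date_parts` is a hypothetical helper, the example date is arbitrary, and it assumes ISO week numbers and Monday as day 0.

```python
from calendar import monthrange
from datetime import date


def date_parts(d):
    """Return a dict of date-derived features for a single `datetime.date`."""
    last_day = monthrange(d.year, d.month)[1]  # number of days in d's month
    return {
        "Year": d.year,
        "Month": d.month,
        "Week": d.isocalendar()[1],          # ISO week number (assumption)
        "Day": d.day,
        "Dayofweek": d.weekday(),            # Monday == 0 (assumption)
        "Dayofyear": d.timetuple().tm_yday,
        "Is_month_end": d.day == last_day,
        "Is_month_start": d.day == 1,
        "Is_quarter_end": d.month in (3, 6, 9, 12) and d.day == last_day,
        "Is_quarter_start": d.month in (1, 4, 7, 10) and d.day == 1,
        "Is_year_end": d.month == 12 and d.day == 31,
        "Is_year_start": d.month == 1 and d.day == 1,
    }


parts = date_parts(date(2020, 3, 31))  # arbitrary example date
print(parts)
```

All twelve values are discrete, which is why treating every one of them as categorical leaves no continuous inputs once `total` is set aside as the target.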

I’d appreciate any help, thanks!

Can you show the code you used to create your `DataLoaders`?

Following the approach in chapter 9 (I was just reviewing this chapter so that’s why I’m referencing it) I would assume it’s something like the following:

``````
from fastai.tabular.all import *

procs = [Categorify, FillMissing]
dep_var = "total"  # the target column from the original post

cond = ...  # placeholder: some boolean condition which splits early dates from later dates
train_idx = np.where(cond)[0]
valid_idx = np.where(~cond)[0]
splits = (list(train_idx), list(valid_idx))
cont, cat = cont_cat_split(df, 1, dep_var=dep_var)

to = TabularPandas(df, procs, cat, cont, y_names=dep_var, splits=splits)
dls = to.dataloaders(1024)
``````

(or using TabularDataLoaders.from_df)
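The `cond` in the snippet above is left open. One way it might look, assuming a simple cutoff date, is sketched below in plain Python so that it runs without pandas or fastai. The rows and the `cutoff` value are made up for illustration.

```python
from datetime import date

# Made-up (date, total) rows; the values are placeholders, not real data.
rows = [
    (date(2020, 1, 1), 100.0),
    (date(2020, 1, 2), 120.0),
    (date(2020, 1, 3), 90.0),
    (date(2020, 1, 4), 150.0),
    (date(2020, 1, 5), 130.0),
]

# Hypothetical cutoff: rows before it are for training, the rest for validation.
cutoff = date(2020, 1, 4)

train_idx = [i for i, (d, _) in enumerate(rows) if d < cutoff]
valid_idx = [i for i, (d, _) in enumerate(rows) if d >= cutoff]
splits = (train_idx, valid_idx)

print(splits)
```

Splitting by date rather than at random keeps the validation set strictly later than the training data, which matches how a daily-profit forecast would be used.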

Also, have you tried creating a `DataLoaders` object without applying `add_datepart`? If so, were you able to get a sensible result with x and y, instead of x, y, and z, when calling `dls.show_batch()`?

Lastly, I’m assuming you can’t share the notebook you’re working with, but can you give a few rows of example fake data that populates your `DataFrame`?