Can't get the rossman_data_clean working

(Max Yazhbin) #1

I got everything running up until the following lines of code:
for df in (joined, joined_test):
    df["Promo2Since"] = pd.to_datetime(df.apply(lambda x: Week(
        x.Promo2SinceYear, x.Promo2SinceWeek).monday(), axis=1).astype(pd.datetime))
    df["Promo2Days"] = df.Date.subtract(df["Promo2Since"]).dt.days

Here is the error I got:

I am running Ubuntu 18.04 on AWS. I can autocomplete pd.datetime just fine and the documentation shows up, but it still doesn’t seem to know what datetime.datetime is. I even tried from datetime import datetime, but that didn’t work either.


Rossman_data_clean.ipynb got TypeError: dtype '<class 'datetime.datetime'>' not understood
(Peter Walkley) #2

I hit this too. I think the solution is to remove the ‘.astype(pd.datetime)’ part. I haven’t gone through lesson 6 yet to confirm, but from reading up it seems like that is redundant as to_datetime should be performing the type conversion already.
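
For anyone who wants to try the change, here is a minimal sketch of the loop without the ‘.astype(pd.datetime)’ call. It is not the notebook’s exact code: the toy `joined` frame and its values are made up, and the `promo2_since` helper is a hypothetical stand-in that uses the standard library’s `date.fromisocalendar` (Python 3.8+) in place of `Week(...).monday()`, so the example runs without the isoweek package.

```python
import datetime as dt

import pandas as pd


def promo2_since(year, week):
    # Hypothetical helper: Monday of the given ISO week, standing in for
    # isoweek's Week(year, week).monday() used in the notebook.
    return dt.date.fromisocalendar(int(year), int(week), 1)


# Toy frame standing in for `joined`; the values are invented for illustration.
joined = pd.DataFrame({
    "Date": pd.to_datetime(["2015-07-31", "2015-07-30"]),
    "Promo2SinceYear": [2010, 2011],
    "Promo2SinceWeek": [13, 14],
})

for df in (joined,):
    # No .astype(pd.datetime) here: pd.to_datetime already returns a
    # datetime64 column, so the extra cast is not needed.
    df["Promo2Since"] = pd.to_datetime(df.apply(
        lambda x: promo2_since(x.Promo2SinceYear, x.Promo2SinceWeek), axis=1))
    df["Promo2Days"] = df.Date.subtract(df["Promo2Since"]).dt.days
```

If this behaves the same way on the real data, `Promo2Since` should come out as a datetime64 column and `Promo2Days` as whole days.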



This is correct. The notebook needs updating.


(Mike HAWKINS) #4

I hit the same error and fixed it the same way. I agree with @RogerS49 that the notebook should be corrected. It made me take a long, long look at the stores.csv file, however. Maybe not a bad thing.
Who is authorized to fix something like this?



Hi Mike, since you seem to be running the Rossmann notebook, may I ask a silly question? I can’t find the train_clean data when running the following code. Maybe the directories have changed, but I poked around the directory and didn’t find it, and I didn’t find it on GitHub either. I also couldn’t find the CSVs (store, weather, etc.). Could you help? Many thanks.

path = Config().data_path()/'rossmann/'
train_df = pd.read_pickle(path/'train_clean')


(Mike HAWKINS) #6

Just to be clear, I’m running 2018 part 1, lesson 3.
In that notebook, just after the second code block in the section called ‘Create datasets’, there is a link to If you are not using that zip file, anything I say next may not apply to your situation.

I navigated to c:/users/mike01/fastai/data (yep, I’m in Windows) and created a new folder, ‘rossmann’. I moved the tgz file there and unpacked it. There are 8 csv files including ‘store’ and ‘weather’, but no ‘train_clean’.

Since train_clean is not one of the unzipped files, it must be a file created during running of one of the code cells in your notebook. When I run my notebook, it creates 2 new folders and 3 new files in the rossmann directory. But I’m doing the 2018 version and it doesn’t seem to use a ‘train_clean’ file.
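
A quick way to see which files are actually in the folder before calling read_pickle is sketched below. This is only an illustration: `missing_files` is a hypothetical helper, not part of fastai, and the folder location you pass in is whatever your own setup uses.

```python
from pathlib import Path


def missing_files(data_dir, expected):
    """Return the names in `expected` that are not present in `data_dir`."""
    data_dir = Path(data_dir)
    # A folder that does not exist is treated as containing nothing.
    present = {p.name for p in data_dir.iterdir()} if data_dir.is_dir() else set()
    return [name for name in expected if name not in present]
```

For example, `missing_files(path, ["store.csv", "weather.csv", "train_clean"])` would list `train_clean` if only the unpacked CSVs are there, which suggests the cell that writes it has not been run yet.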

I hope this is clear. If not, please tell me. I’ll try to help.


(Lorentz Kinde) #7

I also did this and it worked! I might run into issues further ahead (I haven’t checked), but it was enough to get the dataset started.


(Ajaykumaar) #9

And I downloaded the dataset but couldn’t find it. :confused: