Wiki thread: lesson 1

When you have several date columns in a dataframe, do you need to pass each of them to the add_datepart function?

If so, is there a way to iterate through the columns, find which ones have a date dtype, and convert them to codes?

I have tried this on two columns of my df, but I get an error in return:

for col in sliced_list:
    df_raw.col 
AttributeError: 'DataFrame' object has no attribute 'col'

Or else, if I try:

for col in sliced_list:
    df_raw.columns.col
AttributeError: 'Index' object has no attribute 'col'

Is there an easy way to iterate through a dataframe’s columns?
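One way to do this: df_raw.col looks up a literal attribute named "col", which is why both attempts above raise AttributeError — inside a loop you need bracket indexing, df_raw[col]. A minimal sketch (the frame and column names here are hypothetical, and the commented-out add_datepart call assumes fastai's helper of that name):

```python
import pandas as pd

# Hypothetical example frame with one datetime column and one numeric column.
df_raw = pd.DataFrame({
    'saledate': pd.to_datetime(['2021-01-01', '2021-06-15']),
    'price': [100, 200],
})

# Iterate over column names; index with df_raw[col], not df_raw.col.
for col in df_raw.columns:
    print(col, df_raw[col].dtype)

# Keep only the columns with a datetime dtype.
date_cols = [col for col in df_raw.columns
             if pd.api.types.is_datetime64_any_dtype(df_raw[col])]

# Each of these could then be expanded, e.g.:
# for col in date_cols: add_datepart(df_raw, col)
```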

Error while executing conda env update
I followed the instructions exactly, but got an error at the end of conda env update.
Here is the error:
Exception:
Traceback (most recent call last):
  File "/home/ramin/anaconda3/envs/fastai/lib/python3.6/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2869, in _dep_map
    return self.__dep_map
  File "/home/ramin/anaconda3/envs/fastai/lib/python3.6/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2663, in __getattr__
    raise AttributeError(attr)
AttributeError: _DistInfoDistribution__dep_map

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ramin/anaconda3/envs/fastai/lib/python3.6/site-packages/pip/_vendor/packaging/requirements.py", line 93, in __init__
    req = REQUIREMENT.parseString(requirement_string)
  File "/home/ramin/anaconda3/envs/fastai/lib/python3.6/site-packages/pip/_vendor/pyparsing.py", line 1632, in parseString
    raise exc
  File "/home/ramin/anaconda3/envs/fastai/lib/python3.6/site-packages/pip/_vendor/pyparsing.py", line 1622, in parseString
    loc, tokens = self._parse( instring, 0 )
  File "/home/ramin/anaconda3/envs/fastai/lib/python3.6/site-packages/pip/_vendor/pyparsing.py", line 1379, in _parseNoCache
    loc, tokens = self.parseImpl( instring, preloc, doActions )
  File "/home/ramin/anaconda3/envs/fastai/lib/python3.6/site-packages/pip/_vendor/pyparsing.py", line 3395, in parseImpl
    loc, exprtokens = e._parse( instring, loc, doActions )
  File "/home/ramin/anaconda3/envs/fastai/lib/python3.6/site-packages/pip/_vendor/pyparsing.py", line 1383, in _parseNoCache
    loc, tokens = self.parseImpl( instring, preloc, doActions )
  File "/home/ramin/anaconda3/envs/fastai/lib/python3.6/site-packages/pip/_vendor/pyparsing.py", line 3183, in parseImpl
    raise ParseException(instring, loc, self.errmsg, self)
pip._vendor.pyparsing.ParseException: Expected stringEnd (at char 33), (line:1, col:34)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ramin/anaconda3/envs/fastai/lib/python3.6/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2949, in __init__
    super(Requirement, self).__init__(requirement_string)
  File "/home/ramin/anaconda3/envs/fastai/lib/python3.6/site-packages/pip/_vendor/packaging/requirements.py", line 97, in __init__
    requirement_string[e.loc:e.loc + 8]))
pip._vendor.packaging.requirements.InvalidRequirement: Invalid requirement, parse error at "'; extra '"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ramin/anaconda3/envs/fastai/lib/python3.6/site-packages/pip/_internal/basecommand.py", line 141, in main
    status = self.run(options, args)
  File "/home/ramin/anaconda3/envs/fastai/lib/python3.6/site-packages/pip/_internal/commands/install.py", line 299, in run
    resolver.resolve(requirement_set)
  File "/home/ramin/anaconda3/envs/fastai/lib/python3.6/site-packages/pip/_internal/resolve.py", line 102, in resolve
    self._resolve_one(requirement_set, req)
  File "/home/ramin/anaconda3/envs/fastai/lib/python3.6/site-packages/pip/_internal/resolve.py", line 306, in _resolve_one
    set(req_to_install.extras) - set(dist.extras)
  File "/home/ramin/anaconda3/envs/fastai/lib/python3.6/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2826, in extras
    return [dep for dep in self._dep_map if dep]
  File "/home/ramin/anaconda3/envs/fastai/lib/python3.6/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2871, in _dep_map
    self.__dep_map = self._compute_dependencies()
  File "/home/ramin/anaconda3/envs/fastai/lib/python3.6/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2881, in _compute_dependencies
    reqs.extend(parse_requirements(req))
  File "/home/ramin/anaconda3/envs/fastai/lib/python3.6/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2942, in parse_requirements
    yield Requirement(line)
  File "/home/ramin/anaconda3/envs/fastai/lib/python3.6/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2951, in __init__
    raise RequirementParseError(str(e))
pip._vendor.pkg_resources.RequirementParseError: Invalid requirement, parse error at "'; extra '"

A lot of people have been having this issue recently. This thread might help.

Looks like skipping these instructions and just doing a pip install fastai works fine!

Where can I find the previous machine learning courses that used PyTorch and Keras? I had to look something up, and they aren’t featured on the fast.ai website. Please help ASAP.

Found the answer to my question: yes, we are in the middle of a transition, and I needed to re-configure my fastai symbolic link to point to …/…/old/fastai

See Moving fastai folder


Are we in between major changes, such that we can no longer run the ML notebooks against the fastai repo? I just did a git pull, and now I can no longer run the ML lesson 1 notebook because structured.py has disappeared. I notice it’s located on GitHub under fastai/old/fastai, so it looks like it’s on its way out and the notebook hasn’t been updated to reflect a new approach. Should I reset to the files from old/fastai, or hang tight for an updated notebook?

Did anyone see the “card” in the YouTube video? I didn’t see anything. Jeremy said there would be one around 1:28.

Which machine do you recommend to do this on?

I tried the Paperspace option recommended in lesson 1 of the DL1 course, and I get an error, with the Jupyter kernel restarting each time I call read_feather() followed by proc_df().

No luck running this on my own laptop (macOS): conda env update fails with a bunch of pip install errors.

I am also getting a Jupyter kernel crash on my paperspace machine when I try to run

df, y, nas = proc_df(df_raw, 'SalePrice')

in lesson1-rf. Any recommendations for what I should try? Thanks!

It’s also worth noting that at least for the Notebook from Lesson 1 it can be run on one’s personal machine :slight_smile:


Not on a Mac. I haven’t been able to run conda env update on macOS without errors (including using the switch -f environment-cpu.yml).

I don’t think that’s because of the notebook. You’re having issues with conda env update, which should work on macOS and other *nix systems.

In terms of the FastAI library code and training the random forest regression – you shouldn’t have issues on a modern laptop/computer.
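For a rough sense of scale: a random forest regressor of the kind lesson 1 trains fits comfortably on a laptop CPU. A minimal scikit-learn sketch on synthetic stand-in data (all values hypothetical — this is not the bulldozers dataset):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 1,000 rows, 5 features, signal in the first feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = 3 * X[:, 0] + rng.normal(scale=0.1, size=1000)

X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=0)

# A modest forest like this trains in well under a second on a laptop CPU.
m = RandomForestRegressor(n_estimators=40, n_jobs=-1, random_state=0)
m.fit(X_train, y_train)
score = m.score(X_valid, y_valid)  # R^2 on the held-out split
```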

I am also encountering the same issue. Did you manage to resolve it?

No, I’ve had no luck on my Paperspace machine keeping Jupyter from crashing on the proc_df() function in lesson1-rf.
I’ve tried conda updates and git pulls, but I don’t know how to proceed, so I’m working on crawling through it on my laptop. Hoping someone has some ideas!

Hi All,

When running the following part of the “lesson1-rf” notebook, the kernel restarts:

df, y, nas = proc_df(df_raw, 'SalePrice')

I am using a Paperspace machine for the execution. Please let me know if someone has a solution for this, as it is preventing at least three of us on the forum from making progress.

Thanks,
Avinash


Oops - thanks for the reminder :slight_smile:

I am still having the problem of the kernel restarting at the proc_df line in “lesson1-rf” on my Paperspace machine. The crash appears to be triggered by the 4th line of code in proc_df:

else: df = df.copy()

Does that ring any bells with anyone?

I’m not having any problems running the notebook on my (very slow) laptop, so I am slogging ahead with the lesson.

Thanks for any help!
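For anyone debugging this, here is roughly what proc_df does — a simplified, hypothetical approximation, not fastai’s actual implementation: split off the target, median-fill numeric NaNs while recording them, and turn categoricals into codes:

```python
import pandas as pd

def simple_proc_df(df, y_fld):
    """Simplified, hypothetical approximation of fastai's proc_df."""
    df = df.copy()                  # the step the crash reportedly hits
    y = df.pop(y_fld).values        # split off the response variable
    nas = {}                        # record which columns were filled, and with what
    for col in list(df.columns):    # snapshot the names: we add *_na columns below
        if pd.api.types.is_numeric_dtype(df[col]):
            if df[col].isnull().any():
                med = df[col].median()
                nas[col] = med
                df[col + '_na'] = df[col].isnull()
                df[col] = df[col].fillna(med)
        else:
            # Replace categorical/string columns with their integer codes.
            df[col] = df[col].astype('category').cat.codes
    return df, y, nas
```

If a plain df.copy() on the feathered frame is enough to kill the kernel, that points at the feather round-trip (memory pressure or a broken feather install) rather than proc_df’s own logic.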

Looks like it is the ‘feathered’ version of the dataframe that is causing the crash.

For now, instead of saving in feather format, I’m using a pickled version, and it seems to work:
At the end of Initial processing:
At the end of Initial processing

os.makedirs('tmp', exist_ok=True)
df_raw.to_pickle('tmp/bulldozers-pkl')

then, in pre-processing:

df_raw = pd.read_pickle('tmp/bulldozers-pkl')

Also, I reinstalled feather with:

conda install -c conda-forge feather-format

and that seems to have eliminated the problem (though there are new deprecation warnings).


Thanks for sharing the workaround. I will try this and see if this works for me.

How does one submit to Kaggle, for example in the House Prices competition?

Someone earlier answered by linking to the 3rd lesson of DL1, but it doesn’t help much for the machine learning category.

Once I have my model that predicts SalePrice, what do I have to do to get the predictions for each house and save them, along with the houses’ ids, to a CSV file?
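The usual pattern is to pair the test ids with the predictions and write a two-column CSV. A sketch with hypothetical ids and values standing in for m.predict(df_test) on a test set processed the same way as the training data:

```python
import numpy as np
import pandas as pd

# Hypothetical test ids and predictions; in practice this would be
# preds = m.predict(df_test) with the real test set's Id column.
test_ids = pd.Series([1461, 1462, 1463], name='Id')
preds = np.array([120000.0, 150000.0, 180000.0])

submission = pd.DataFrame({'Id': test_ids, 'SalePrice': preds})
submission.to_csv('submission.csv', index=False)
```

One caveat: if the model was trained on log(SalePrice), as lesson 1 does for RMSLE-style metrics, apply np.exp to the predictions before writing the file.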
