I am still having the problem of the kernel restarting at the proc_df line in “lesson1-rf” on my Paperspace machine. The crash appears to be triggered by the 4th line of code in proc_df:
else: df = df.copy()
Does that ring any bells with anyone?
I’m not having any problems running the notebook on my (very slow) laptop, so I am slogging ahead with the lesson.
How does one submit to Kaggle, for example in the House Prices competition?
Someone earlier answered by linking to the 3rd lesson of DL1, but it does not help much for the machine learning category.
Once I have my model which predicts ‘SalePrice’, what do I have to do to get the predictions for each house and save them to a CSV file along with the houses’ ids?
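A minimal sketch of the usual workflow, assuming the competition's test file has an `Id` column and that the submission should have the two columns `Id` and `SalePrice`. The small DataFrames and the two feature names are made-up stand-ins for `pd.read_csv('train.csv')` and `pd.read_csv('test.csv')`, so that the snippet runs on its own; the file name `submission.csv` is also just an illustration.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Made-up stand-ins for pd.read_csv('train.csv') and pd.read_csv('test.csv').
train = pd.DataFrame({
    'Id': [1, 2, 3, 4, 5, 6],
    'LotArea': [8450, 9600, 11250, 9550, 14260, 14115],
    'OverallQual': [7, 6, 7, 7, 8, 5],
    'SalePrice': [208500, 181500, 223500, 140000, 250000, 143000],
})
test = pd.DataFrame({
    'Id': [1461, 1462, 1463],
    'LotArea': [11622, 14267, 13830],
    'OverallQual': [5, 6, 5],
})

# The test set has to go through the same preprocessing as the training set.
features = ['LotArea', 'OverallQual']

m = RandomForestRegressor(n_estimators=10, random_state=0)
m.fit(train[features], train['SalePrice'])

# One prediction per test house, paired with that house's Id.
submission = pd.DataFrame({
    'Id': test['Id'],
    'SalePrice': m.predict(test[features]),
})

# index=False keeps the pandas row index out of the file.
submission.to_csv('submission.csv', index=False)
```

The resulting CSV can then be uploaded on the competition's submission page.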
I have added a link to a Kaggle kernel for lesson 1 to the lesson resources section, so that anyone can run lesson 1 by forking this kernel (you only need a free Kaggle account).
Couldn’t replacing the NaN values with the median introduce lookahead bias? Wouldn’t it be better to use fillna(method='pad') or to replace them with a rolling median? Or is the effect negligible?
I am getting a MemoryError when trying to fit in lesson 1. Any ideas?
I am running the whole thing on an AWS EC2 micro instance.
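To make the three options concrete, here is a small sketch on a made-up, time-ordered series. It uses `ffill()`, which does the same thing as `fillna(method='pad')`, and an expanding median (the median of everything seen so far) in place of a fixed-window rolling median; that substitution is my own choice for the illustration.

```python
import numpy as np
import pandas as pd

# Made-up series, assumed to be sorted by time, with two gaps.
s = pd.Series([10.0, np.nan, 12.0, np.nan, 40.0, 50.0])

# Global median: computed from every value, including ones later in time.
global_fill = s.fillna(s.median())

# Forward fill: carries the last observed value forward, so no future data is used.
pad_fill = s.ffill()

# Expanding median: at each row, the median of the values observed up to that row.
past_median = s.expanding(min_periods=1).median()
causal_fill = s.fillna(past_median)

print(global_fill.tolist())  # [10.0, 26.0, 12.0, 26.0, 40.0, 50.0]
print(pad_fill.tolist())     # [10.0, 10.0, 12.0, 12.0, 40.0, 50.0]
print(causal_fill.tolist())  # [10.0, 10.0, 12.0, 11.0, 40.0, 50.0]
```

The global median (26) is pulled up by the later values 40 and 50, which is the lookahead effect the question is about. Whether it matters in practice depends on how strongly the column drifts over time.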
MemoryError Traceback (most recent call last)
in
1 m = RandomForestRegressor(n_jobs=-1)
----> 2 m.fit(df, y)
3 m.score(df,y)
Solved it. On the free tier you can’t expect to train on the whole dataset with so little RAM; try fitting on 20-50k samples. Run the later parts of the code, where Jeremy prototypes on small sets.
When we use proc_df it returns x, y, nas. I read the documentation for nas, and it says: “nas: returns a dictionary of which nas it created, and the associated median.” Can someone explain what a future use of nas could be?
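A rough sketch of fitting on a random subset, using plain pandas and scikit-learn. The random data below is a made-up stand-in for the processed `df` and `y`, and the subset size `n` is kept tiny so the snippet runs quickly; on the real data it would be something in the 20-50k range mentioned above.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Made-up stand-in for the processed feature frame and target.
rng = np.random.RandomState(0)
df = pd.DataFrame(rng.rand(1000, 5), columns=list('abcde'))
y = df['a'] * 2 + rng.rand(1000) * 0.1

n = 200  # subset size; pick whatever fits in RAM
idxs = rng.permutation(len(df))[:n]  # random row positions, no repeats
df_small, y_small = df.iloc[idxs], y.iloc[idxs]

m = RandomForestRegressor(n_estimators=10, n_jobs=-1, random_state=0)
m.fit(df_small, y_small)
print(m.score(df_small, y_small))
```

Taking the same positions from both `df` and `y` keeps the features and the target aligned.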
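One use, as far as I understand it, is to apply the same medians to the test or validation set later, by passing the dictionary back in through proc_df's `na_dict` argument. The sketch below shows the idea in plain pandas; `fill_missing` is a hypothetical helper written for this post, not the fastai code, and the column names are made up.

```python
import numpy as np
import pandas as pd

def fill_missing(df, na_dict=None):
    """Hypothetical helper: fill numeric NaNs with a median and record the
    median used per column. If na_dict is given, reuse those medians."""
    df = df.copy()
    learn = na_dict is None  # no dict given: learn medians from this frame
    na_dict = {} if learn else dict(na_dict)
    for col in list(df.columns):
        if not pd.api.types.is_numeric_dtype(df[col]):
            continue
        if learn and df[col].isnull().any():
            na_dict[col] = df[col].median()
        if col in na_dict:
            df[col + '_na'] = df[col].isnull()  # flag the rows that were missing
            df[col] = df[col].fillna(na_dict[col])
    return df, na_dict

train = pd.DataFrame({'age': [1.0, np.nan, 3.0, 10.0], 'hours': [5.0, 6.0, 7.0, 8.0]})
test = pd.DataFrame({'age': [np.nan, 2.0], 'hours': [1.0, 2.0]})

train_x, nas = fill_missing(train)           # medians learned from train
test_x, _ = fill_missing(test, na_dict=nas)  # same medians reused on test

print(nas)                     # {'age': 3.0}
print(test_x['age'].tolist())  # [3.0, 2.0]
```

Reusing the dictionary means the test set is filled with the training median rather than its own, and it gets the same `_na` columns, so the model sees the same set of features at prediction time.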
Morning,
I have a quick question: when I try to launch an instance in AWS and search for Ubuntu free tier, I only see three options: “Ubuntu Server 18.04 LTS (HVM), SSD Volume Type”; “.NET Core 2.1 with Ubuntu Server 18.04 - Version 1.0”; “Ubuntu Server 16.04 LTS (HVM), SSD Volume Type”. Which one should I pick for this course? Thanks!
I am using the one below and it has been working fine so far.
Ubuntu Server 18.04 LTS (HVM), SSD Volume Type - ami-0d773a3b7bb2bb1c1
You need to select a micro instance and 30 GB of SSD storage to stay in the free tier. Also, do not select any paid services such as extra monitoring or a static IP, or you will incur costs.
By the way, keep in mind that when you run things on a free-tier micro instance you will never be able to train on the whole dataset, so you need to keep the number of samples low every time.
Thanks! BTW, has anyone used “fastai-part1v2-p2 - ami-8c4288f4” before? It is free tier as well. What is the difference compared to “Ubuntu Server 18.04 LTS”?
As Jeremy said in the lecture, get_dummies will create three columns with 1 and 0 as values, but there is no order among those categories. In the bulldozer example, the category has an order, High > Medium > Low, so we need an ordered numerical representation; train_cats does this for us, marking the values as 2, 1 and 0 respectively.
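A small pandas-only sketch of the difference. The column name `UsageBand` and the four rows are assumptions made for illustration, and the order is set explicitly here with pandas' `set_categories` rather than through any fastai helper.

```python
import pandas as pd

df = pd.DataFrame({'UsageBand': ['High', 'Low', 'Medium', 'Low']})

# One-hot encoding: one indicator column per category, no order between them.
dummies = pd.get_dummies(df['UsageBand'])
print(list(dummies.columns))  # ['High', 'Low', 'Medium']

# Ordered categorical: a single integer column whose codes follow the given order.
ordered = df['UsageBand'].astype('category').cat.set_categories(
    ['Low', 'Medium', 'High'], ordered=True)
codes = ordered.cat.codes
print(codes.tolist())  # [2, 0, 1, 0]
```

With the single ordered column, a tree can split on something like "code > 0" to separate Low from Medium and High in one step, which the three separate indicator columns do not allow.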
ImportError Traceback (most recent call last)
in
----> 1 from fastai.imports import *
2 from fastai.structured import *
3
4 from pandas_summary import DataFrameSummary
5 from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
~\Miniconda3\envs\fastai\lib\site-packages\fastai\__init__.py in
----> 1 from .basic_train import *
2 from .callback import *
3 from .callbacks import *
4 from .core import *
5 from .basic_data import *
~\Miniconda3\envs\fastai\lib\site-packages\fastai\basic_train.py in
1 "Provides basic training and validation with Learner"
----> 2 from .torch_core import *
3 from .basic_data import *
4 from .callback import *
5
~\Miniconda3\envs\fastai\lib\site-packages\fastai\torch_core.py in
1 "Utility functions to help deal with tensors"
----> 2 from .imports.torch import *
3 from .core import *
4
5 AffineMatrix = Tensor
~\Miniconda3\envs\fastai\lib\site-packages\fastai\imports\__init__.py in
1 from .core import *
----> 2 from .torch import *
~\Miniconda3\envs\fastai\lib\site-packages\fastai\imports\torch.py in
----> 1 import torch, torch.nn.functional as F
2 from torch import ByteTensor, DoubleTensor, FloatTensor, HalfTensor, LongTensor, ShortTensor, Tensor
3 from torch import nn, optim, as_tensor
4 from torch.utils.data import BatchSampler, DataLoader, Dataset, Sampler, TensorDataset
~\Miniconda3\envs\fastai\lib\site-packages\torch\__init__.py in
74 pass
75
---> 76 from torch._C import *
77
78 __all__ += [name for name in dir(_C)
ImportError: DLL load failed: The specified module could not be found.
If anybody else is facing this error, please let me know.
In the first lesson of the Intro to Machine Learning Course, the Jupyter Notebook has a variable
PATH = "data/bulldozers/"
But I did not find any directory named data in the “ml1” folder.
Do we have to download a Kaggle dataset for the course or have I missed out on anything?
Thank you