Another treat! Early access to Intro To Machine Learning videos

ldlt · January 17, 2018, 12:37am

I’m watching lesson 2. At 33:47, Jeremy uses graphviz to draw a decision tree with draw_tree(m.estimators_[0], df_trn, precision=3).

I’m using Crestle. When I run this command, I get the error shown below. Did I miss something that I needed to do to set up Crestle?

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/graphviz/backend.py in pipe(engine, format, data, quiet)
    153             stdout=subprocess.PIPE, stderr=subprocess.PIPE,
--> 154             startupinfo=STARTUPINFO)
    155     except OSError as e:

/usr/lib/python3.6/subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors)
    708                                 errread, errwrite,
--> 709                                 restore_signals, start_new_session)
    710         except:

/usr/lib/python3.6/subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, start_new_session)
   1343                             err_msg += ': ' + repr(err_filename)
-> 1344                     raise child_exception_type(errno_num, err_msg, err_filename)
   1345                 raise child_exception_type(err_msg)

FileNotFoundError: [Errno 2] No such file or directory: 'dot': 'dot'

During handling of the above exception, another exception occurred:

ExecutableNotFound                        Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/IPython/core/formatters.py in __call__(self, obj)
    343             method = get_real_method(obj, self.print_method)
    344             if method is not None:
--> 345                 return method()
    346             return None
    347         else:

/usr/local/lib/python3.6/dist-packages/graphviz/files.py in _repr_svg_(self)
    104 
    105     def _repr_svg_(self):
--> 106         return self.pipe(format='svg').decode(self._encoding)
    107 
    108     def pipe(self, format=None):

/usr/local/lib/python3.6/dist-packages/graphviz/files.py in pipe(self, format)
    123         data = text_type(self.source).encode(self._encoding)
    124 
--> 125         outs = backend.pipe(self._engine, format, data)
    126 
    127         return outs

/usr/local/lib/python3.6/dist-packages/graphviz/backend.py in pipe(engine, format, data, quiet)
    155     except OSError as e:
    156         if e.errno == errno.ENOENT:
--> 157             raise ExecutableNotFound(args)
    158         else:  # pragma: no cover
    159             raise

ExecutableNotFound: failed to execute ['dot', '-Tsvg'], make sure the Graphviz executables are on your systems' PATH

<graphviz.files.Source at 0x7eff4d873400>

cqfd · January 17, 2018, 1:28am

@ldlt I think the problem is that you need to install the Graphviz command-line program itself, not just the Python wrapper package. See https://pypi.python.org/pypi/graphviz, which links to https://www.graphviz.org/download/. I’m not very familiar with Crestle, but I think you can get access to a shell?
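
One way to confirm this from inside the notebook is to check whether a `dot` executable is visible on the PATH of the running Python process. A minimal sketch using only the standard library (the helper name `find_dot` is made up for illustration):

```python
import shutil


def find_dot(executable="dot"):
    """Return the full path of the Graphviz binary, or None if it is not on PATH."""
    return shutil.which(executable)


path = find_dot()
if path is None:
    print("dot not found: the Graphviz binaries are missing or not on PATH")
else:
    print("dot found at", path)
```

If this prints "not found" while the Python `graphviz` package imports fine, only the wrapper is installed, which would match the ExecutableNotFound error above.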

ldlt · January 17, 2018, 10:30pm

Yeah, I have access to a shell although it looks like I can’t apt-get install it. I guess I can build graphviz from source, but I’d rather not.
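
If installing the binary is not an option, a possible stopgap is a text rendering of the tree, which does not call `dot` at all. This is only a sketch: it assumes scikit-learn 0.21 or later (where `export_text` exists), and it uses the built-in diabetes dataset as a stand-in for the lesson's data.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import export_text

# Small built-in dataset standing in for the lesson's data (an assumption for this sketch).
data = load_diabetes()
m = RandomForestRegressor(n_estimators=1, max_depth=3, bootstrap=False, random_state=0)
m.fit(data.data, data.target)

# Text rendering of the first tree; no Graphviz executable is involved.
tree_text = export_text(
    m.estimators_[0],
    feature_names=list(data.feature_names),
    decimals=3,
)
print(tree_text)
```

The output is an indented list of splits rather than a picture, so it is less pretty than draw_tree, but it shows the same split features and thresholds.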

satish860 · January 18, 2018, 5:57am

Hi, I was working on the House Regression playground problem, and when I used proc_df I got the following error. Google didn’t really help much.

Not sure which feature it is referring to.

ecdrid · January 18, 2018, 7:02am

It seems your data isn’t categorical…
I also got this error when the data wasn’t categorical and I was applying proc_df.

satish860 · January 18, 2018, 7:25am

Thanks @ecdrid. This was for the test dataset, and I have to use apply_cats before sending it to proc_df. Corrected it now.
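
For anyone hitting the same thing, the idea behind applying the training categories to the test set can be sketched in plain pandas. This is an illustration of the concept, not the library's code; the column name and values are invented.

```python
import pandas as pd

# Toy data; "Neighborhood" and its values are hypothetical.
train = pd.DataFrame({"Neighborhood": ["A", "B", "C", "A"]})
test = pd.DataFrame({"Neighborhood": ["C", "A", "D"]})  # "D" never appears in training

# Turn the training column into an ordered categorical.
train["Neighborhood"] = train["Neighborhood"].astype("category").cat.as_ordered()

# Reuse the training categories for the test column so the integer codes line up.
test["Neighborhood"] = pd.Categorical(
    test["Neighborhood"],
    categories=train["Neighborhood"].cat.categories,
    ordered=True,
)

print(list(test["Neighborhood"].cat.codes))  # [2, 0, -1]
```

A value that was never seen in training ("D" here) becomes missing and gets code -1, so the same string always maps to the same code in both sets.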

ecdrid · January 18, 2018, 6:24pm

Thanks

satish860 · January 19, 2018, 3:38am

I am working on the house regression problem and am on the 5th video in the series.

One point: my test set and training set have a different number of features after proc_df because of the NA fields. How should I handle test and training datasets that have nulls in different fields?
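
One general way to keep the columns aligned is to decide on the training set which columns get a missing-value flag and which fill values to use, then apply exactly those choices to the test set. A rough pandas sketch of that idea follows; the helper names and the column names are hypothetical, and this is not the library's own implementation.

```python
import pandas as pd


def fit_na_fill(train, cols):
    """Learn fill values and the set of flagged columns from the training data only."""
    medians = {c: train[c].median() for c in cols}
    flagged = [c for c in cols if train[c].isnull().any()]
    return medians, flagged


def apply_na_fill(df, medians, flagged):
    """Add <col>_na flags for the training-time columns, then fill with training medians."""
    df = df.copy()
    for col in flagged:
        df[col + "_na"] = df[col].isnull()
    for col, median in medians.items():
        df[col] = df[col].fillna(median)
    return df


# Toy data: the nulls sit in different columns in train and test.
train = pd.DataFrame({"LotFrontage": [60.0, None, 80.0], "GarageArea": [400.0, 500.0, 600.0]})
test = pd.DataFrame({"LotFrontage": [70.0, 65.0], "GarageArea": [None, 450.0]})

medians, flagged = fit_na_fill(train, ["LotFrontage", "GarageArea"])
train_p = apply_na_fill(train, medians, flagged)
test_p = apply_na_fill(test, medians, flagged)

print(list(train_p.columns) == list(test_p.columns))  # True
```

Because the flag columns are chosen from the training data alone, both frames end up with the same features even when the test set has nulls in a column that was complete in training.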

deesoni · January 20, 2018, 6:58am

Is this resolved? I am also facing the same issue.

ecdrid · January 20, 2018, 7:01am

The issue is with the environment variables…
On Windows I had to add the location of the Graphviz executables to the system PATH, and then it worked…

So something similar probably has to be done on Linux as well?
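
On Linux the same thing can be tried from inside the notebook by extending PATH for the current process only. A small sketch, assuming the binaries already exist somewhere on disk; the directory below is a made-up placeholder.

```python
import os
import shutil


def add_to_path(directory):
    """Append a directory to PATH for the current Python process, once."""
    parts = [p for p in os.environ.get("PATH", "").split(os.pathsep) if p]
    if directory not in parts:
        parts.append(directory)
    os.environ["PATH"] = os.pathsep.join(parts)


# Hypothetical location; replace it with wherever `dot` actually lives on your machine.
add_to_path("/opt/graphviz/bin")

# Full path if dot is now visible to this process, otherwise None.
print(shutil.which("dot"))
```

This only affects the running kernel and will not help if Graphviz is not installed at all, which seems to be the case in the Crestle report above.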

ecdrid · January 21, 2018, 6:24am

@ramesh
Hi all, can someone tell me the difference between these two models, or are they equivalent and just different ways of doing the same thing?

First One

import torch
import torch.nn as nn
import torch.nn.functional as F
class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        self.l1 = nn.Linear(784, 100)
        self.l2 = nn.Linear(100, 100)
        self.l3 = nn.Linear(100, 64)
        self.l4 = nn.Linear(64, 10)

    def forward(self, x):
        x = x.view(-1, 784)  # Flatten the data (n, 1, 28, 28)-> (n, 784)
        x = F.relu(self.l1(x))
        x = F.relu(self.l2(x))
        x = F.relu(self.l3(x))
        x = F.relu(self.l4(x))
        return (x)

Second One

import torch
import torch.nn as nn
import torch.nn.functional as F
class Net(nn.Module):
    
    def __init__(self):
        
        super(Net, self).__init__()
        
        self.fc1 = nn.Linear(784, 100) 
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Linear(100, 100)
        self.relu2 = nn.ReLU()
        self.fc3 = nn.Linear(100, 64)
        self.relu3 = nn.ReLU()
        self.fc4 = nn.Linear(64, 10)  
    
    def forward(self, x):
        
        out = self.fc1(x)
        out = self.relu1(out)
        out = self.fc2(out)
        out = self.relu2(out)
        out = self.fc3(out)
        out = self.relu3(out)
        out = self.fc4(out)
        return out

My thinking

It seems that they are one and the same, but what I want to understand is what happens when we write the line model = Net().cuda(0).
Does this line call model.forward() in the backend?
Also, are model.train(), model.eval(), and model all the same?

Thanks in advance…

Also, which one is preferred (if any)?
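
A few of these points can be checked directly on the CPU. Reading the two listings, the layers match, but the first model flattens its input with view and applies ReLU to the final layer's output, while the second does neither. The sketch below uses a tiny stand-in network (not the Net classes above) to show that F.relu and nn.ReLU give identical results and that train() and eval() return the same object with only a flag changed. As far as I know, .cuda() likewise just moves parameters to the GPU and returns the module; forward() runs only when you call model(x).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
layer = nn.Linear(4, 3)
x = torch.randn(2, 4)

# The functional form and the module form of ReLU compute the same thing.
same_activation = torch.equal(F.relu(layer(x)), nn.ReLU()(layer(x)))
print(same_activation)  # True

# Tiny stand-in model; Dropout is included because it behaves differently per mode.
model = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Dropout(0.5))

# train() and eval() flip the `training` flag and return the same object.
print(model.train() is model, model.training)  # True True
print(model.eval() is model, model.training)   # True False
```

So the three expressions refer to one model; what differs is the mode, which matters for layers such as Dropout and BatchNorm.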

ecdrid · January 21, 2018, 6:28am

Found it somewhere on Stack

It’s the PyTorch equivalent of Keras’s model.summary()…

from torch.nn.modules.module import _addindent
import torch
import numpy as np
def torch_summarize(model, show_weights=True, show_parameters=True):
    """Summarizes torch model by showing trainable parameters and weights."""
    tmpstr = model.__class__.__name__ + ' (\n'
    for key, module in model._modules.items():
        # If the module is a container, recurse to summarize its children.
        if type(module) in [
            torch.nn.modules.container.Container,
            torch.nn.modules.container.Sequential
        ]:
            modstr = torch_summarize(module, show_weights, show_parameters)
        else:
            modstr = module.__repr__()
        modstr = _addindent(modstr, 2)

        params = sum([np.prod(p.size()) for p in module.parameters()])
        weights = tuple([tuple(p.size()) for p in module.parameters()])

        tmpstr += '  (' + key + '): ' + modstr 
        if show_weights:
            tmpstr += ', weights={}'.format(weights)
        if show_parameters:
            tmpstr +=  ', parameters={}'.format(params)
        tmpstr += '\n'   

    tmpstr = tmpstr + ')'
    return tmpstr

# Test
import torchvision.models as models
model = models.alexnet()
print(torch_summarize(model))

# # Output
# AlexNet (
#   (features): Sequential (
#     (0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2)), weights=((64, 3, 11, 11), (64,)), parameters=23296
#     (1): ReLU (inplace), weights=(), parameters=0
#     (2): MaxPool2d (size=(3, 3), stride=(2, 2), dilation=(1, 1)), weights=(), parameters=0
#     (3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2)), weights=((192, 64, 5, 5), (192,)), parameters=307392
#     (4): ReLU (inplace), weights=(), parameters=0
#     (5): MaxPool2d (size=(3, 3), stride=(2, 2), dilation=(1, 1)), weights=(), parameters=0
#     (6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)), weights=((384, 192, 3, 3), (384,)), parameters=663936
#     (7): ReLU (inplace), weights=(), parameters=0
#     (8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)), weights=((256, 384, 3, 3), (256,)), parameters=884992
#     (9): ReLU (inplace), weights=(), parameters=0
#     (10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)), weights=((256, 256, 3, 3), (256,)), parameters=590080
#     (11): ReLU (inplace), weights=(), parameters=0
#     (12): MaxPool2d (size=(3, 3), stride=(2, 2), dilation=(1, 1)), weights=(), parameters=0
#   ), weights=((64, 3, 11, 11), (64,), (192, 64, 5, 5), (192,), (384, 192, 3, 3), (384,), (256, 384, 3, 3), (256,), (256, 256, 3, 3), (256,)), parameters=2469696
#   (classifier): Sequential (
#     (0): Dropout (p = 0.5), weights=(), parameters=0
#     (1): Linear (9216 -> 4096), weights=((4096, 9216), (4096,)), parameters=37752832
#     (2): ReLU (inplace), weights=(), parameters=0
#     (3): Dropout (p = 0.5), weights=(), parameters=0
#     (4): Linear (4096 -> 4096), weights=((4096, 4096), (4096,)), parameters=16781312
#     (5): ReLU (inplace), weights=(), parameters=0
#     (6): Linear (4096 -> 1000), weights=((1000, 4096), (1000,)), parameters=4097000
#   ), weights=((4096, 9216), (4096,), (4096, 4096), (4096,), (1000, 4096), (1000,)), parameters=58631144
# )
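
The parameter counts in that output can be reproduced by hand: a conv layer has out_channels * in_channels * k * k weights plus one bias per output channel, and a linear layer has out * in weights plus out biases. A short check in plain Python, with layer sizes copied from the printout above (the helper names are just for illustration):

```python
def conv2d_params(in_ch, out_ch, kernel):
    """Weights (out_ch * in_ch * kernel * kernel) plus one bias per output channel."""
    return out_ch * in_ch * kernel * kernel + out_ch


def linear_params(in_features, out_features):
    """Weights (out_features * in_features) plus one bias per output feature."""
    return out_features * in_features + out_features


# Layer sizes as printed in the AlexNet summary.
features = [
    conv2d_params(3, 64, 11),
    conv2d_params(64, 192, 5),
    conv2d_params(192, 384, 3),
    conv2d_params(384, 256, 3),
    conv2d_params(256, 256, 3),
]
classifier = [
    linear_params(9216, 4096),
    linear_params(4096, 4096),
    linear_params(4096, 1000),
]

print(features[0], sum(features))  # 23296 2469696
print(sum(classifier))             # 58631144
```

Both totals match the parameters= values shown for the features and classifier blocks.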

ecdrid · January 21, 2018, 6:29pm

This work from Tyler White on Random Forests is awesome…
Have a look at those cool visuals…(just amazing)

Notebook

jeremy · January 21, 2018, 8:32pm

FYI we already have Learner.summary in fastai

himalayas · January 23, 2018, 2:00am

Have hit this same issue on Paperspace, FYI.
The Graphviz package shows as installed, and PATH shows
nda3/envs/fastai/bin:/home/paperspace/anaconda3/bin:/home/paperspace/bin:/home/paperspace/.local/bin:/home/paperspace/anaconda3/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
(fastai) paperspace@psrxsjv90:~$ which dot
Might require modifying the PATH above.

himalayas · January 23, 2018, 2:10am

conda install graphviz did the trick …pip showed an earlier version of the package

(fastai) paperspace@psrxsjv90:~$ conda install graphviz
Fetching package metadata …
Solving package specifications: .

Package plan for installation in environment /home/paperspace/anaconda3/envs/fastai:

The following NEW packages will be INSTALLED:

cairo:            1.14.8-0
graphviz:         2.38.0-5
harfbuzz:         0.9.39-2
libtool:          2.4.6-h544aabb_3
pango:            1.40.3-1
pixman:           0.34.0-hceecf20_3

The following packages will be DOWNGRADED:

dbus:             1.10.22-h3b5a359_0    --> 1.10.20-0
fontconfig:       2.12.4-h88586e7_1     --> 2.12.1-3
freetype:         2.8-hab7d2ae_1        --> 2.5.5-2
glib:             2.53.6-h5d9569c_2     --> 2.50.2-1
gst-plugins-base: 1.12.2-he3457e5_0     --> 1.8.0-0
gstreamer:        1.12.2-h4f93127_0     --> 1.8.0-0
icu:              58.2-h9c2bf20_1       --> 54.1-0
libiconv:         1.15-h63c8f33_5       --> 1.14-0
libxml2:          2.9.4-h2e8b1d7_6      --> 2.9.4-0
matplotlib:       2.1.1-py36ha26af80_0  --> 2.0.2-np113py36_0
numpy:            1.14.0-py36h3dfced4_0 --> 1.13.3-py36ha12f23b_0
pcre:             8.41-hc27e229_1       --> 8.39-1
pyqt:             5.6.0-py36h0386399_5  --> 5.6.0-py36_2
qt:               5.6.2-h974d657_12     --> 5.6.2-5

Proceed ([y]/n)? y

icu-54.1-0.tar 100% |################################################################################################| Time: 0:00:01 9.82 MB/s
libiconv-1.14- 100% |################################################################################################| Time: 0:00:00 19.82 MB/s
libtool-2.4.6- 100% |################################################################################################| Time: 0:00:00 20.11 MB/s
pixman-0.34.0- 100% |################################################################################################| Time: 0:00:00 19.72 MB/s
dbus-1.10.20-0 100% |################################################################################################| Time: 0:00:00 21.59 MB/s
libxml2-2.9.4- 100% |################################################################################################| Time: 0:00:00 15.53 MB/s
pcre-8.39-1.ta 100% |################################################################################################| Time: 0:00:00 25.47 MB/s
freetype-2.5.5 100% |################################################################################################| Time: 0:00:00 26.03 MB/s
glib-2.50.2-1. 100% |################################################################################################| Time: 0:00:00 12.68 MB/s
fontconfig-2.1 100% |################################################################################################| Time: 0:00:00 9.16 MB/s
gstreamer-1.8. 100% |################################################################################################| Time: 0:00:00 20.37 MB/s
cairo-1.14.8-0 100% |################################################################################################| Time: 0:00:00 16.87 MB/s
gst-plugins-ba 100% |################################################################################################| Time: 0:00:00 11.86 MB/s
harfbuzz-0.9.3 100% |################################################################################################| Time: 0:00:00 11.32 MB/s
qt-5.6.2-5.tar 100% |################################################################################################| Time: 0:00:05 9.02 MB/s
pango-1.40.3-1 100% |################################################################################################| Time: 0:00:00 7.99 MB/s
pyqt-5.6.0-py3 100% |################################################################################################| Time: 0:00:00 11.72 MB/s
graphviz-2.38. 100% |################################################################################################| Time: 0:00:00 14.03 MB/s
matplotlib-2.0 100% |################################################################################################| Time: 0:00:00 11.83 MB/s
(fastai) paperspace@psrxsjv90:~$ which dot
/home/paperspace/anaconda3/envs/fastai/bin/dot
(fastai) paperspace@psrxsjv90:~$

ecdrid · January 23, 2018, 10:03am

How do I interpret this dendrogram?
Link to nbs

Thanks in Advance…

(Trying out fast.ai lib on different datasets and planning to do without it also…)

Also, there is an issue with proc_df if you have missing values (NAs) in the test set…

Callan99 · January 24, 2018, 1:12pm

Hi

I’m working my way through the ML course and have hit the following issue in lesson4-mnist_sgd:
‘ModuleNotFoundError: No module named 'torchvision'’ (see attached).

I have removed fastai and git cloned it again, but the error is still triggered.

Any help greatly appreciated

Thanks

Jonathan

ecdrid · January 24, 2018, 1:14pm

pip3 install torchvision

Callan99 · January 24, 2018, 1:21pm

I tried ‘pip3 install torchvision’, but I’m still getting the same error, and the install itself reports errors.
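
One thing that may be worth ruling out here is pip3 installing into a different Python than the one the notebook kernel runs. The sketch below, run inside the notebook, prints which interpreter is in use and whether it can see torchvision; the helper name `is_installed` is made up, and this is a diagnostic guess rather than a confirmed cause.

```python
import importlib.util
import sys


def is_installed(module_name):
    """True if the interpreter running this code can import the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


print("Interpreter:", sys.executable)
print("torchvision importable:", is_installed("torchvision"))

# Command that installs into this exact interpreter rather than whichever pip3 is first on PATH.
print(sys.executable, "-m", "pip", "install", "torchvision")
```

If the interpreter path printed here is not the one pip3 belongs to, running the printed command in a shell targets the right environment.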