Another treat! Early access to Intro To Machine Learning videos

I’m watching lesson 2. At 33:47, Jeremy uses graphviz to draw a decision tree with draw_tree(m.estimators_[0], df_trn, precision=3).

I’m using Crestle. When I run this command, I get the error shown below. Did I miss something that I needed to do to set up Crestle?

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/graphviz/backend.py in pipe(engine, format, data, quiet)
    153             stdout=subprocess.PIPE, stderr=subprocess.PIPE,
--> 154             startupinfo=STARTUPINFO)
    155     except OSError as e:

/usr/lib/python3.6/subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors)
    708                                 errread, errwrite,
--> 709                                 restore_signals, start_new_session)
    710         except:

/usr/lib/python3.6/subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, start_new_session)
   1343                             err_msg += ': ' + repr(err_filename)
-> 1344                     raise child_exception_type(errno_num, err_msg, err_filename)
   1345                 raise child_exception_type(err_msg)

FileNotFoundError: [Errno 2] No such file or directory: 'dot': 'dot'

During handling of the above exception, another exception occurred:

ExecutableNotFound                        Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/IPython/core/formatters.py in __call__(self, obj)
    343             method = get_real_method(obj, self.print_method)
    344             if method is not None:
--> 345                 return method()
    346             return None
    347         else:

/usr/local/lib/python3.6/dist-packages/graphviz/files.py in _repr_svg_(self)
    104 
    105     def _repr_svg_(self):
--> 106         return self.pipe(format='svg').decode(self._encoding)
    107 
    108     def pipe(self, format=None):

/usr/local/lib/python3.6/dist-packages/graphviz/files.py in pipe(self, format)
    123         data = text_type(self.source).encode(self._encoding)
    124 
--> 125         outs = backend.pipe(self._engine, format, data)
    126 
    127         return outs

/usr/local/lib/python3.6/dist-packages/graphviz/backend.py in pipe(engine, format, data, quiet)
    155     except OSError as e:
    156         if e.errno == errno.ENOENT:
--> 157             raise ExecutableNotFound(args)
    158         else:  # pragma: no cover
    159             raise

ExecutableNotFound: failed to execute ['dot', '-Tsvg'], make sure the Graphviz executables are on your systems' PATH

<graphviz.files.Source at 0x7eff4d873400>

@ldlt I think your problem is that you need to install the Graphviz command-line program itself, not just the Python wrapper package. See https://pypi.python.org/pypi/graphviz, which links to https://www.graphviz.org/download/. I’m not very familiar with Crestle, but I think you can get access to a shell?
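
A quick way to confirm this from a notebook cell is to ask Python whether a `dot` executable is visible on the PATH. This is only a minimal sketch using the standard library; the helper name `find_dot` is made up for illustration.

```python
import shutil


def find_dot():
    """Return the full path of the Graphviz `dot` executable, or None if it is not on PATH."""
    return shutil.which("dot")


dot_path = find_dot()
if dot_path is None:
    print("dot not found: the Graphviz binaries are missing or not on PATH")
else:
    print("dot found at", dot_path)
```

If this prints "dot not found", the Python graphviz package has nothing to call, which would match the ExecutableNotFound error above.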

Yeah, I have access to a shell although it looks like I can’t apt-get install it. I guess I can build graphviz from source, but I’d rather not.

Hi, I was working on a playground House Regression problem, and when I used proc_df I got the following error. Google didn’t really help much.

Not sure which feature it was referring to.

It seems your data isn’t categorical…
I also got this error when my data wasn’t categorical and I was applying proc_df.

Thanks @ecdrid. This was for the test dataset, and I have to use apply_cats before sending it to proc_df. I’ve corrected it now.
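
For anyone hitting the same thing: the idea behind that fix, as I understand it, is that the test set has to reuse the training set's category-to-code mapping. Below is a rough pandas-only sketch of that idea. It is not fastai's actual code, and the column name and values are invented.

```python
import pandas as pd

train = pd.DataFrame({"Neighborhood": ["A", "B", "C", "B"]})
test = pd.DataFrame({"Neighborhood": ["C", "A", "D"]})  # "D" never appears in training

# Turn the training column into an ordered categorical.
train["Neighborhood"] = train["Neighborhood"].astype("category").cat.as_ordered()

# Reuse the training categories for the test column so the integer codes line up.
test["Neighborhood"] = pd.Categorical(
    test["Neighborhood"],
    categories=train["Neighborhood"].cat.categories,
    ordered=True,
)

train_codes = train["Neighborhood"].cat.codes.tolist()
test_codes = test["Neighborhood"].cat.codes.tolist()
print(train_codes)  # [0, 1, 2, 1]
print(test_codes)   # [2, 0, -1]
```

A value the training set never saw ("D" here) becomes missing and gets the code -1, so it still needs to be handled before fitting a model.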


Thanks

I was working on the house regression problem, and I am on the 5th video in the series.

One point: my test set and training set have a different number of features after proc_df because of the NA fields. How should I handle test and training datasets that have nulls in different fields?
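
One way to think about this, as a hedged sketch rather than what proc_df does internally: decide the missing-value handling on the training set only (which columns get an `_na` flag and which medians fill them), then apply exactly those decisions to the test set. The helper names `fit_na_fill` and `apply_na_fill` and the toy columns are hypothetical.

```python
import pandas as pd


def fit_na_fill(train):
    """Record the median of every numeric training column that has missing values."""
    numeric = train.select_dtypes("number")
    return {col: numeric[col].median() for col in numeric if numeric[col].isna().any()}


def apply_na_fill(df, medians):
    """Add a `<col>_na` flag and fill NaNs, but only for columns chosen on the training set."""
    out = df.copy()
    for col, median in medians.items():
        out[col + "_na"] = out[col].isna()
        out[col] = out[col].fillna(median)
    return out


train = pd.DataFrame({"LotFrontage": [60.0, None, 80.0], "GarageArea": [400.0, 500.0, 600.0]})
test = pd.DataFrame({"LotFrontage": [70.0, 65.0], "GarageArea": [None, 450.0]})

medians = fit_na_fill(train)
train_x = apply_na_fill(train, medians)
test_x = apply_na_fill(test, medians)

# GarageArea is missing only in the test set, so it gets no flag column here;
# fill it with the training median so that no NaNs are left.
test_x = test_x.fillna(train.median(numeric_only=True))

print(list(train_x.columns))
print(list(test_x.columns))
```

Both frames end up with the same columns in the same order. I believe proc_df can also return and accept an na_dict for the same purpose, but that is worth checking against the version of the library you have installed.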


Is this resolved? I am also facing the same issue.

The issue is with the environment variables…
On Windows I had to add the location of the Graphviz executables to the system PATH, and then it worked…

So something similar has to be done on Linux as well?
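
On Linux the idea is the same: the directory that holds `dot` has to be on the PATH of the process running the notebook. Here is a minimal sketch that prepends a directory for the current Python process only. The directory used is a placeholder, not a real install location, and `prepend_to_path` is a made-up helper name.

```python
import os


def prepend_to_path(directory):
    """Put `directory` at the front of PATH for this process, unless it is already listed."""
    parts = os.environ.get("PATH", "").split(os.pathsep)
    if directory not in parts:
        os.environ["PATH"] = os.pathsep.join([directory] + [p for p in parts if p])
    return os.environ["PATH"]


# Placeholder: replace with the folder that actually contains the `dot` executable.
graphviz_bin = "/path/to/graphviz/bin"
new_path = prepend_to_path(graphviz_bin)
print(new_path.split(os.pathsep)[0])
```

This change lasts only as long as the Python process; a permanent fix would go in the shell profile or the environment setup instead.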

@ramesh
Hi all, can someone tell me the difference between these two models, or are they equivalent and just different ways of doing the same thing?

First One

import torch
import torch.nn as nn
import torch.nn.functional as F
class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        self.l1 = nn.Linear(784, 100)
        self.l2 = nn.Linear(100, 100)
        self.l3 = nn.Linear(100, 64)
        self.l4 = nn.Linear(64, 10)

    def forward(self, x):
        x = x.view(-1, 784)  # Flatten the data (n, 1, 28, 28)-> (n, 784)
        x = F.relu(self.l1(x))
        x = F.relu(self.l2(x))
        x = F.relu(self.l3(x))
        x = F.relu(self.l4(x))
        return (x)

Second One

import torch
import torch.nn as nn
import torch.nn.functional as F
class Net(nn.Module):
    
    def __init__(self):
        
        super(Net, self).__init__()
        
        self.fc1 = nn.Linear(784, 100) 
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Linear(100, 100)
        self.relu2 = nn.ReLU()
        self.fc3 = nn.Linear(100, 64)
        self.relu3 = nn.ReLU()
        self.fc4 = nn.Linear(64, 10)  
    
    def forward(self, x):
        
        out = self.fc1(x)
        out = self.relu1(out)
        out = self.fc2(out)
        out = self.relu2(out)
        out = self.fc3(out)
        out = self.relu3(out)
        out = self.fc4(out)
        return out

My thinking

  • It seems they are one and the same, but what I want to understand is what happens when we write the line model = Net().cuda(0)?

  • Does this line call model.forward() in the backend?

  • Also, are model.train(), model.eval() and model all the same?

Thanks in advance…

Also, which one is preferred (if any)?
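
My understanding, offered as a sketch rather than a definitive answer: F.relu and nn.ReLU compute the same function, so the two definitions differ only in that the first flattens its input and also applies ReLU to the output of the last layer. Constructing a model (with or without .cuda(0), which only moves the parameters to the GPU) does not run forward; forward runs when you call model(x). train() and eval() switch a mode flag and return the same object. The toy class `Tiny` below is made up to check these points on the CPU.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Tiny(nn.Module):
    """Toy module that counts how many times forward() is executed."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)
        self.calls = 0

    def forward(self, x):
        self.calls += 1
        return self.fc(x)


model = Tiny()

# train() and eval() only flip the mode flag and hand back the same module.
same_object = model.train() is model and model.eval() is model

calls_before = model.calls      # building the model did not run forward()
out = model(torch.zeros(3, 4))  # calling the module is what runs forward()
calls_after = model.calls

# The functional and the module form of ReLU give identical results.
x = torch.randn(5)
relu_equal = torch.equal(F.relu(x), nn.ReLU()(x))

print(same_object, calls_before, calls_after, relu_equal, model.training)
```

The mode flag matters for layers such as dropout and batch norm; a network made only of Linear and ReLU layers behaves the same in both modes.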

Found it somewhere on Stack

It’s the PyTorch equivalent of Keras’s model.summary()…

from torch.nn.modules.module import _addindent
import torch
import numpy as np
def torch_summarize(model, show_weights=True, show_parameters=True):
    """Summarizes torch model by showing trainable parameters and weights."""
    tmpstr = model.__class__.__name__ + ' (\n'
    for key, module in model._modules.items():
        # if it contains layers let call it recursively to get params and weights
        if type(module) in [
            torch.nn.modules.container.Container,
            torch.nn.modules.container.Sequential
        ]:
            modstr = torch_summarize(module, show_weights, show_parameters)
        else:
            modstr = module.__repr__()
        modstr = _addindent(modstr, 2)

        params = sum([np.prod(p.size()) for p in module.parameters()])
        weights = tuple([tuple(p.size()) for p in module.parameters()])

        tmpstr += '  (' + key + '): ' + modstr 
        if show_weights:
            tmpstr += ', weights={}'.format(weights)
        if show_parameters:
            tmpstr +=  ', parameters={}'.format(params)
        tmpstr += '\n'   

    tmpstr = tmpstr + ')'
    return tmpstr

# Test
import torchvision.models as models
model = models.alexnet()
print(torch_summarize(model))
# # Output
# AlexNet (
#   (features): Sequential (
#     (0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2)), weights=((64, 3, 11, 11), (64,)), parameters=23296
#     (1): ReLU (inplace), weights=(), parameters=0
#     (2): MaxPool2d (size=(3, 3), stride=(2, 2), dilation=(1, 1)), weights=(), parameters=0
#     (3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2)), weights=((192, 64, 5, 5), (192,)), parameters=307392
#     (4): ReLU (inplace), weights=(), parameters=0
#     (5): MaxPool2d (size=(3, 3), stride=(2, 2), dilation=(1, 1)), weights=(), parameters=0
#     (6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)), weights=((384, 192, 3, 3), (384,)), parameters=663936
#     (7): ReLU (inplace), weights=(), parameters=0
#     (8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)), weights=((256, 384, 3, 3), (256,)), parameters=884992
#     (9): ReLU (inplace), weights=(), parameters=0
#     (10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)), weights=((256, 256, 3, 3), (256,)), parameters=590080
#     (11): ReLU (inplace), weights=(), parameters=0
#     (12): MaxPool2d (size=(3, 3), stride=(2, 2), dilation=(1, 1)), weights=(), parameters=0
#   ), weights=((64, 3, 11, 11), (64,), (192, 64, 5, 5), (192,), (384, 192, 3, 3), (384,), (256, 384, 3, 3), (256,), (256, 256, 3, 3), (256,)), parameters=2469696
#   (classifier): Sequential (
#     (0): Dropout (p = 0.5), weights=(), parameters=0
#     (1): Linear (9216 -> 4096), weights=((4096, 9216), (4096,)), parameters=37752832
#     (2): ReLU (inplace), weights=(), parameters=0
#     (3): Dropout (p = 0.5), weights=(), parameters=0
#     (4): Linear (4096 -> 4096), weights=((4096, 4096), (4096,)), parameters=16781312
#     (5): ReLU (inplace), weights=(), parameters=0
#     (6): Linear (4096 -> 1000), weights=((1000, 4096), (1000,)), parameters=4097000
#   ), weights=((4096, 9216), (4096,), (4096, 4096), (4096,), (1000, 4096), (1000,)), parameters=58631144
# )
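
If only the totals are needed, a shorter sketch is to sum numel() over the parameters. The layer sizes below are copied from the Net models earlier in the thread, rebuilt as an nn.Sequential purely for illustration.

```python
import torch.nn as nn

# Same layer sizes as the Net models posted above.
model = nn.Sequential(
    nn.Linear(784, 100), nn.ReLU(),
    nn.Linear(100, 100), nn.ReLU(),
    nn.Linear(100, 64), nn.ReLU(),
    nn.Linear(64, 10),
)

# Each Linear layer contributes in_features * out_features weights plus out_features biases.
total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(total, trainable)  # 95714 95714
```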

This work by Tyler White on Random Forests is awesome…
Have a look at those cool visuals… (just amazing)

Notebook

FYI we already have Learner.summary in fastai :)


I’ve hit this same issue on Paperspace, FYI.
The graphviz package shows as installed, and PATH shows
nda3/envs/fastai/bin:/home/paperspace/anaconda3/bin:/home/paperspace/bin:/home/paperspace/.local/bin:/home/paperspace/anaconda3/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
(fastai) paperspace@psrxsjv90:~$ which dot
Might require modifying the PATH above.

conda install graphviz did the trick… pip showed an earlier version of the package.

(fastai) paperspace@psrxsjv90:~$ conda install graphviz
Fetching package metadata …
Solving package specifications: .

Package plan for installation in environment /home/paperspace/anaconda3/envs/fastai:

The following NEW packages will be INSTALLED:

cairo:            1.14.8-0
graphviz:         2.38.0-5
harfbuzz:         0.9.39-2
libtool:          2.4.6-h544aabb_3
pango:            1.40.3-1
pixman:           0.34.0-hceecf20_3

The following packages will be DOWNGRADED:

dbus:             1.10.22-h3b5a359_0    --> 1.10.20-0
fontconfig:       2.12.4-h88586e7_1     --> 2.12.1-3
freetype:         2.8-hab7d2ae_1        --> 2.5.5-2
glib:             2.53.6-h5d9569c_2     --> 2.50.2-1
gst-plugins-base: 1.12.2-he3457e5_0     --> 1.8.0-0
gstreamer:        1.12.2-h4f93127_0     --> 1.8.0-0
icu:              58.2-h9c2bf20_1       --> 54.1-0
libiconv:         1.15-h63c8f33_5       --> 1.14-0
libxml2:          2.9.4-h2e8b1d7_6      --> 2.9.4-0
matplotlib:       2.1.1-py36ha26af80_0  --> 2.0.2-np113py36_0
numpy:            1.14.0-py36h3dfced4_0 --> 1.13.3-py36ha12f23b_0
pcre:             8.41-hc27e229_1       --> 8.39-1
pyqt:             5.6.0-py36h0386399_5  --> 5.6.0-py36_2
qt:               5.6.2-h974d657_12     --> 5.6.2-5

Proceed ([y]/n)? y

icu-54.1-0.tar 100% |################################################################################################| Time: 0:00:01 9.82 MB/s
libiconv-1.14- 100% |################################################################################################| Time: 0:00:00 19.82 MB/s
libtool-2.4.6- 100% |################################################################################################| Time: 0:00:00 20.11 MB/s
pixman-0.34.0- 100% |################################################################################################| Time: 0:00:00 19.72 MB/s
dbus-1.10.20-0 100% |################################################################################################| Time: 0:00:00 21.59 MB/s
libxml2-2.9.4- 100% |################################################################################################| Time: 0:00:00 15.53 MB/s
pcre-8.39-1.ta 100% |################################################################################################| Time: 0:00:00 25.47 MB/s
freetype-2.5.5 100% |################################################################################################| Time: 0:00:00 26.03 MB/s
glib-2.50.2-1. 100% |################################################################################################| Time: 0:00:00 12.68 MB/s
fontconfig-2.1 100% |################################################################################################| Time: 0:00:00 9.16 MB/s
gstreamer-1.8. 100% |################################################################################################| Time: 0:00:00 20.37 MB/s
cairo-1.14.8-0 100% |################################################################################################| Time: 0:00:00 16.87 MB/s
gst-plugins-ba 100% |################################################################################################| Time: 0:00:00 11.86 MB/s
harfbuzz-0.9.3 100% |################################################################################################| Time: 0:00:00 11.32 MB/s
qt-5.6.2-5.tar 100% |################################################################################################| Time: 0:00:05 9.02 MB/s
pango-1.40.3-1 100% |################################################################################################| Time: 0:00:00 7.99 MB/s
pyqt-5.6.0-py3 100% |################################################################################################| Time: 0:00:00 11.72 MB/s
graphviz-2.38. 100% |################################################################################################| Time: 0:00:00 14.03 MB/s
matplotlib-2.0 100% |################################################################################################| Time: 0:00:00 11.83 MB/s
(fastai) paperspace@psrxsjv90:~$ which dot
/home/paperspace/anaconda3/envs/fastai/bin/dot
(fastai) paperspace@psrxsjv90:~$


How do I interpret this dendrogram?
Link to nbs

Thanks in Advance…

(Trying out fast.ai lib on different datasets and planning to do without it also…)

Also, there is an issue with proc_df if you have missing values (NAs) in the test set…


Hi

I’m working my way through the ML course and have encountered the following issue in lesson4-mnist_sgd:
"ModuleNotFoundError: No module named 'torchvision'" (see attached).

I have removed fastai and git-cloned it again… but the error is still triggered.

Any help greatly appreciated

Thanks

Jonathan

pip3 install torchvision

I tried ‘pip3 install torchvision’, but I’m still getting the same error, and the install itself reports errors?
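
One thing that may be worth ruling out (a guess, since the install output isn't shown): pip3 can install into a different Python than the one the notebook kernel runs. A small check to run inside the notebook, using only the standard library; `module_available` is a made-up helper name.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the running interpreter can locate a top-level module called `name`."""
    return importlib.util.find_spec(name) is not None


# The interpreter the notebook kernel is actually using.
print(sys.executable)

# Whether that interpreter can see torchvision.
print(module_available("torchvision"))
```

If the second line prints False, installing with the interpreter printed on the first line (that path followed by -m pip install torchvision) targets the environment the notebook really uses.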