Lesson 10 Discussion & Wiki (2019)

Specifically to the error:

“Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.”

What I understood from Thomas is that PyTorch compares the graph from the previous batch with the graph generated by the current batch: if the new graph lacks a node that was in the old graph, that's when you get this error.

That's why you must repeat the calculations that place all the variables that were in the graph on the very first batch; none of them can be skipped.

I don't know enough PyTorch yet to explain why dynamically skipping a node is considered an error, so this is only a circumstantial explanation.
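
For anyone who wants to see the message itself, here is a minimal sketch (my own toy example, not the lesson's code) that triggers it by calling backward() twice on the same graph:

import torch

x = torch.ones(3, requires_grad=True)
y = (x * x).sum()
y.backward()   # the first backward pass frees the graph's intermediate buffers
y.backward()   # RuntimeError: Trying to backward through the graph a second time...

# passing retain_graph=True to the first call keeps the buffers alive:
# y.backward(retain_graph=True)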

@t-v, please kindly correct me if my explanation is incorrect or incomplete.

To get around my conflict issues in conda with the pytorch nightly build:

In the past we imported from local directories and there were no conda installs. The basic use of datasets is to load data from the net. If fastai is cloned into your local environment, or even just the datasets part, and imported from there, you can work without a conda fastai version.
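
For reference, about the only thing the notebooks need from that module is a call like this (sketching from memory, so treat the exact names as approximate):

from fastai import datasets

# downloads (if needed) and extracts the dataset, returning its local path;
# IMAGENETTE_160 is what the 08_data_block notebook uses
path = datasets.untar_data(datasets.URLs.IMAGENETTE_160)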

I took three items from the GitHub clone:

datasets.py
core.py
the imports directory with all its contents

and placed them in a fastai directory at the same level as exp.

After using your step 1 above for the conda no-fastai environment, I installed these additions:

fastprogress
jupyter
pyyaml and yaml
requests

This gave me an environment with a minimal fastai which, running 08_data_block, works fine at least until the image of a man with a TENCH.

I also installed for my own use:

pandas, pandas-summary, sklearn-pandas
scipy

I hope my memory serves me right here; in case it doesn't, here is a package list for such an environment:

asn1crypto=0.24.0=py37_0
attrs=19.1.0=py37_1
backcall=0.1.0=py37_0
blas=1.0=mkl
bleach=3.1.0=py37_0
ca-certificates=2019.1.23=0
certifi=2019.3.9=py37_0
cffi=1.12.2=py37h2e261b9_1
chardet=3.0.4=py37_1
cryptography=2.6.1=py37h1ba5d50_0
cudatoolkit=10.0.130=0
cycler=0.10.0=py37_0
dbus=1.13.6=h746ee38_0
decorator=4.4.0=py37_1
defusedxml=0.5.0=py37_1
entrypoints=0.3=py37_0
expat=2.2.6=he6710b0_0
fastprogress=0.1.21=py_0
fontconfig=2.13.0=h9420a91_0
freetype=2.9.1=h8a8886c_1
glib=2.56.2=hd408876_0
gmp=6.1.2=h6c8ec71_1
gst-plugins-base=1.14.0=hbbd80ab_1
gstreamer=1.14.0=hb453b48_1
icu=58.2=h9c2bf20_1
idna=2.8=py37_0
intel-openmp=2019.3=199
ipykernel=5.1.0=py37h39e3cac_0
ipython=7.4.0=py37h39e3cac_0
ipython_genutils=0.2.0=py37_0
ipywidgets=7.4.2=py37_0
jedi=0.13.3=py37_0
jinja2=2.10.1=py37_0
jpeg=9b=h024ee3a_2
jsonschema=3.0.1=py37_0
jupyter=1.0.0=py37_7
jupyter_client=5.2.4=py37_0
jupyter_console=6.0.0=py37_0
jupyter_core=4.4.0=py37_0
kiwisolver=1.0.1=py37hf484d3e_0
libedit=3.1.20181209=hc058e9b_0
libffi=3.2.1=hd88cf55_4
libgcc-ng=8.2.0=hdf63c60_1
libgfortran-ng=7.3.0=hdf63c60_0
libpng=1.6.36=hbc83047_0
libsodium=1.0.16=h1bed415_0
libstdcxx-ng=8.2.0=hdf63c60_1
libtiff=4.0.10=h2733197_2
libuuid=1.0.3=h1bed415_2
libxcb=1.13=h1bed415_1
libxml2=2.9.9=he19cac6_0
markupsafe=1.1.1=py37h7b6447c_0
matplotlib=3.0.3=py37h5429711_0
mistune=0.8.4=py37h7b6447c_0
mkl=2019.3=199
mkl_fft=1.0.10=py37ha843d7b_0
mkl_random=1.0.2=py37hd81dba3_0
nbconvert=5.4.1=py37_3
nbformat=4.4.0=py37_0
ncurses=6.1=he6710b0_1
ninja=1.9.0=py37hfd86e86_0
notebook=5.7.8=py37_0
numpy=1.16.2=py37h7e9f1db_0
numpy-base=1.16.2=py37hde5b4d6_0
olefile=0.46=py37_0
openssl=1.1.1b=h7b6447c_1
pandas=0.24.2=py37he6710b0_0
pandas-summary=0.0.41=py_1
pandoc=2.2.3.2=0
pandocfilters=1.4.2=py37_1
parso=0.3.4=py37_0
pcre=8.43=he6710b0_0
pexpect=4.6.0=py37_0
pickleshare=0.7.5=py37_0
pillow=5.4.1=py37h34e0f95_0
pip=19.0.3=py37_0
prometheus_client=0.6.0=py37_0
prompt_toolkit=2.0.9=py37_0
ptyprocess=0.6.0=py37_0
pycparser=2.19=py37_0
pygments=2.3.1=py37_0
pyopenssl=19.0.0=py37_0
pyparsing=2.4.0=py_0
pyqt=5.9.2=py37h05f1152_2
pyrsistent=0.14.11=py37h7b6447c_0
pysocks=1.6.8=py37_0
python=3.7.3=h0371630_0
python-dateutil=2.8.0=py37_0
pytorch=1.0.1=py3.7_cuda10.0.130_cudnn7.4.2_2
pytorch-nightly=1.1.0.dev20190413=py3.7_cuda10.0.130_cudnn7.4.2_0
pytz=2018.9=py37_0
pyyaml=5.1=py37h7b6447c_0
pyzmq=18.0.0=py37he6710b0_0
qt=5.9.7=h5867ecd_1
qtconsole=4.4.3=py37_0
readline=7.0=h7b6447c_5
requests=2.21.0=py37_0
scikit-learn=0.20.3=py37hd81dba3_0
scipy=1.2.1=py37h7c811a0_0
send2trash=1.5.0=py37_0
setuptools=41.0.0=py37_0
sip=4.19.8=py37hf484d3e_0
six=1.12.0=py37_0
sklearn-pandas=1.8.0=pypi_0
sqlite=3.27.2=h7b6447c_0
terminado=0.8.1=py37_1
testpath=0.4.2=py37_0
tk=8.6.8=hbc83047_0
torchvision=0.2.2=py_3
tornado=6.0.2=py37h7b6447c_0
traitlets=4.3.2=py37_0
urllib3=1.24.1=py37_0
wcwidth=0.1.7=py37_0
webencodings=0.5.1=py37_1
wheel=0.33.1=py37_0
widgetsnbextension=3.4.2=py37_0
xz=5.2.4=h14c3975_4
yaml=0.1.7=had09818_2
zeromq=4.3.1=he6710b0_3
zlib=1.2.11=h7b6447c_3
zstd=1.3.7=h0b5b093_0

I'm not sure what problem you're trying to solve, @RogerS49 - just install fastai in whatever way you like (conda, pip, or a local checkout) and it just works with the part 2 lessons.

Well, I got rid of my conflict issues in conda with the pytorch nightly build. This makes more sense to me, as what's in those packages and dependencies isn't really fastai, it seems, except around the data URLs. I managed to run the whole of the 08_data_block notebook; perhaps I'll run into other dependency issues I'm not aware of. Thanks for your reply.

You could always just download the source via GitHub and place it under the lessons:

git clone https://github.com/fastai/fastai_docs
git clone https://github.com/fastai/fastai
cd fastai_docs/dev_course/dl2
ln -s ../../../fastai/fastai .

so now when you load a notebook from fastai_docs/dev_course/dl2, it will use these local fastai modules, since '' (the notebook dir) is always in sys.path. This way you don't have any dependency conflicts to deal with, since you're not using any package manager here.
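
If you want to double-check which copy gets imported (a hypothetical check, not from the post):

import sys, fastai

print('' in sys.path)    # the notebook dir is searched first, as noted above
print(fastai.__file__)   # should point into the symlinked ./fastai directory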

This is more or less what you described doing above, just easier, since you don't need to go and fish out specific files from the fastai modules.

I quickly plotted layer norm vs batch norm for the sunny vs foggy day example, to double-check Jeremy's thought on why layer norm doesn't do well, and the plots confirmed it.

(Image: sunny road and foggy road before (top row) and after (bottom row) applying layer norm.)

It's really hard to tell which of the bottom two images is sunny and which is foggy.

For comparison, I did the same process for batch norm:

(Image: the same two road images after batch norm.)

Way better!
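
Here's a tiny sketch (my own toy code, with made-up tensors standing in for the photos) of why this happens: layer norm computes statistics per image, so a global brightness difference between two inputs gets normalized away, while batch-norm-style statistics are shared across the batch, so the difference survives.

import torch

# two stand-in images that differ only in overall brightness
a = torch.rand(3, 8, 8) * 0.5 + 0.5
b = a * 0.3

def layer_norm(x, eps=1e-5):
    # per-image statistics, as layer norm uses
    return (x - x.mean()) / (x.std() + eps)

# layer norm: each image is normalized on its own, so both come out
# with mean ~0; the brightness difference is gone
print(layer_norm(a).mean().item(), layer_norm(b).mean().item())

# shared statistics across the batch (simplified: real batch norm is
# per channel), so a stays brighter than b after normalization
batch = torch.stack([a, b])
m, s = batch.mean(), batch.std()
print(((a - m) / s).mean().item(), ((b - m) / s).mean().item())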

Although I can’t offer a resource, I can offer empathy. I was fairly relieved when Jeremy noted that his utility function for loading images took a week to develop. I would’ve felt like throwing in the towel if he had said he wrote it while eating breakfast one morning.

I feel the same. :frowning:

The GoF book has shaped many software engineers (me included): https://en.wikipedia.org/wiki/Design_Patterns

Go for The Swift Programming Language, a nice resource.

You can read about design patterns but in real life nobody uses them.

What's your PoV on these books, and on newer ones like Clean Code or Refactoring?

Personally, while I totally get the idea of patterns and clean code, I find many books and articles on the subject verbose and sometimes a little dogmatic; I don't necessarily agree with the details (typically, I find they make simple things complex, honestly), and they're always dry to read. Maybe it's like writing a book about salsa dancing: style matters, but you just get it on the dancefloor (never ever with a book).

:slight_smile:

The Clean Code reference looks like good housekeeping rules. Concerning Refactoring, I have not read the book; however, Martin Fowler is one of my heroes, and with a foreword by Erich Gamma (one of the authors of the GoF book) it doesn't get better.

I think that design patterns are important in the same way that we expect certain components to be standardised when building a house. It is just too much mental overhead (and often short-sighted) if everybody invents their own personal way of doing things. This is not to say that a design pattern is implemented in identical ways in every language, but the concept should transcend languages.

I know some people have this point of view, and that's fine. Personally, however, I find the exact opposite: I've found that trying to shoehorn things into a set of predefined patterns limits my thinking, and is harder for me to understand than avoiding that idea entirely.

Many design patterns (if not all, AFAIK) focus on the object-oriented programming paradigm. We are dealing with a mix of object-oriented, functional and dataflow paradigms. This makes OO patterns partially applicable, but not that useful within the bigger picture. We need a new methodology and new design patterns to emerge.

The fastai programming style gives us an interesting example and insights into what these patterns might be. Fastai offers examples of well-thought-through use of decorators, closures, partials and compose. I wish software engineering methodology researchers paid more attention to it.
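
To make that concrete, here's a small sketch (my own illustration, not actual fastai code) of the style being described: small functions glued together with partial and a compose helper instead of a class hierarchy:

from functools import partial

def compose(*funcs):
    # apply each function in turn, left to right
    def _inner(x):
        for f in funcs: x = f(x)
        return x
    return _inner

def scale(x, k): return x * k
def shift(x, b): return x + b

# build a transform pipeline out of configured pieces
pipeline = compose(partial(scale, k=10), partial(shift, b=1))
print(pipeline(3))  # (3 * 10) + 1 = 31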

I like your point about the over-emphasis on OO patterns. I guess if I found a book with coding patterns that looked beyond languages and specific paradigms, it would definitely be worth the read, and the fast.ai code is the best source I am aware of. I still suspect that programming, like speaking a language, is a skill where you can't just learn grammar and some elegant ways to express yourself and become a master.

Good point concerning functional programming. The processor pattern in fastai is a good match for that.

If you find such a book, please let me know.

I see Jeremy here being complimentary about Fowler.

So maybe I'll give his newly revised book on refactoring a closer look another time.

So I was revisiting lesson 10 and had the same thought: we have this awesome GeneralRelu, so why don't we just learn all the parameters instead of predefining them? I searched around to see if anyone on here had done it, and I couldn't find anything.

So I went ahead and implemented "LearnedRelu", which was super easy (assuming I did it right):

import torch
import torch.nn as nn

class LearnedRelu(nn.Module):
    def __init__(self, leak=0.1, sub=0.25, maxv=100):
        super().__init__()
        self.leak = nn.Parameter(torch.ones(1)*leak)
        self.sub  = nn.Parameter(torch.zeros(1)+sub)
        self.maxv = nn.Parameter(torch.ones(1)*maxv)

    def forward(self, x):
        # use the parameters as tensors: calling .item() on them would
        # detach them from the graph, so they would never get gradients
        x = torch.where(x >= 0, x, x * self.leak)  # leaky relu with a learnable slope
        x = x - self.sub                           # learnable shift
        x = torch.min(x, self.maxv)                # learnable ceiling (keeps the graph)
        return x

So far it seems to work great. I started a separate thread on the topic with a gist of my work so far here: https://forums.fast.ai/t/learning-generalrelu-params-here-is-learnedrelu/44599
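
As a quick sanity check (my own hypothetical snippet, not from the thread) that all three parameters actually receive gradients:

act = LearnedRelu()
out = act(torch.randn(8, 4)).sum()
out.backward()
print(act.leak.grad, act.sub.grad, act.maxv.grad)  # none of these should be None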