“Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.”
What I understood from Thomas is that PyTorch compares the graph from the previous batch with the graph generated by the current batch; if the new graph lacks a node that was in the old graph, that's when you get this error.
That's why you must repeat the calculations that put every variable from the very first batch's graph in place; they can't be skipped.
I don't know enough PyTorch yet to explain why dynamically skipping a node is considered an error, so this is only a circumstantial explanation.
@t-v, please kindly correct me if my explanation is incorrect or incomplete.
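For anyone who wants to poke at this, here is a minimal sketch (my own toy example, not from the notebooks) of the workaround named in the error message:

    import torch

    x = torch.ones(3, requires_grad=True)
    y = (x * 2).sum()

    y.backward(retain_graph=True)  # keep the graph's buffers for a second pass
    y.backward()                   # without retain_graph above, this raises the quoted error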
To get around my conflict issues in conda with the pytorch nightly build:
In the past we used local directories to import from, and there were no conda installs; the basic use of datasets is just to load data from the net. If fastai is cloned into your local environment (or even just the datasets part) and imported from there, you can work without a conda fastai version.
I took three items from the GitHub clone:
datasets.py
core.py
the imports directory with all its contents
and placed them in a fastai directory at the same level as exp.
After using your step 1 above for the conda no-fastai environment, I installed these additions:
fastprogress, jupyter, pyyaml, yaml, and requests.
This gave me an environment with a minimal fastai, in which 08_data_block runs fine, at least up until the image of a man with a TENCH.
I also installed for my own use:
pandas, pandas-summary, sklearn-pandas, and scipy.
I hope my memory serves me right here; apologies in case it doesn't.
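If it helps, the layout I ended up with looks roughly like this (the top-level directory name is just a placeholder):

    course/                # wherever the exp directory lives
    ├── exp/
    └── fastai/            # hand-picked from the github clone
        ├── datasets.py
        ├── core.py
        └── imports/       # the whole directory, with all its contents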
I'm not sure what problem you're trying to solve, @RogerS49 - just install fastai in whatever way you like (conda, pip, local checkout) and it just works with the part2 lessons.
Well, I got rid of my conflict issues in conda with the pytorch nightly build. This makes more sense to me, as what's in those packages and dependencies doesn't really seem to be fastai, except around the data URLs. I managed to run the whole of the 08_data_block notebook; perhaps I'll run into other dependency issues I'm not aware of. Thanks for your reply.
So now when you load a notebook from fastai_docs/dev_course/dl2, it will use these local fastai modules, since '' (the nb dir) is always in sys.path. This way you don't have any dependency conflicts to deal with, since you're not using any package manager here.
This is more or less what you suggested you did above, just easier since you don’t need to go and fish out specific files from the fastai modules.
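A quick way to confirm which copy of the modules a notebook actually picked up (a sketch; run it from the notebook's directory):

    import sys
    from fastai import datasets
    print('' in sys.path)      # True in a notebook: the nb dir is searched first
    print(datasets.__file__)   # should point at the local fastai/datasets.py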
I just quickly plotted layer norm vs batch norm for the sunny vs foggy day example, to double-check Jeremy's thought on why layer norm doesn't do well, and the plots confirmed it.
[Image: sunny road and foggy road before (top row) and after (bottom row) applying layer norm]
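The same effect is easy to see numerically; here is a toy sketch (random tensors standing in for the actual photos):

    import torch

    sunny = torch.rand(3, 32, 32) * 0.5 + 0.5   # bright, high-mean stand-in
    foggy = torch.rand(3, 32, 32) * 0.2 + 0.1   # dim, low-contrast stand-in

    def layer_norm(x, eps=1e-5):
        # layer norm computes its statistics per image, over the whole image
        return (x - x.mean()) / (x.std() + eps)

    # Both images come out with zero mean and unit std, so the overall
    # brightness difference that told sunny apart from foggy is gone.
    for img in (sunny, foggy):
        print(layer_norm(img).mean().item(), layer_norm(img).std().item())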
Although I can’t offer a resource, I can offer empathy. I was fairly relieved when Jeremy noted that his utility function for loading images took a week to develop. I would’ve felt like throwing in the towel if he had said he wrote it while eating breakfast one morning.
Personally, while I totally get the idea of patterns and clean code, I find many books and articles on the subject verbose and sometimes a little dogmatic; I don't necessarily agree with the details (honestly, I typically find they make simple things complex), and they are always dry to read. Maybe it's like writing a book about salsa dancing: style matters, but you just get it on the dancefloor (never ever with a book).
The Clean Code reference looks like good housekeeping rules.
Concerning Refactoring, I have not read the book. However, Martin Fowler is one of my heroes, and with a foreword by Erich Gamma (one of the authors of the GoF book) it doesn't get better.
I think that design patterns are important in the same way that we expect certain components to be standardised when building a house. It is just too much mental overhead (and often short-sighted) if everybody invents their own personal way of doing things. This is not to say that a design pattern is implemented in identical ways in every language, but the concept should transcend languages.
I know some people have this point of view, and that's fine. Personally, however, I find the exact opposite: I've found trying to shoehorn things into a set of predefined patterns limits my thinking and is harder for me to understand than avoiding that idea entirely.
Many design patterns (if not all, AFAIK) focus on the object-oriented programming paradigm. We are dealing with a mix of object-oriented, functional, and dataflow paradigms. This makes OO patterns partially applicable, but not that useful within the bigger picture. We need a new methodology and new design patterns to emerge.
Fastai programming style gives us an interesting example and insights into what these patterns might be. Fastai offers examples of well thought-through use of decorators, closures, partials and compose. I wish Software Engineering methodology researchers paid more attention to it.
I like your point about the over-emphasis on OO patterns. I guess if I found a book with coding patterns that look beyond languages and specific paradigms, it would definitely be worth the read; the fast.ai code is, to me, the best source I am aware of. I still suspect that programming, like speaking a language, is a skill where you can't just learn grammar and some elegant ways to express yourself to become a master.
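As a toy illustration of the partials-and-compose style mentioned above (my own sketch, not fastai's actual implementation):

    from functools import partial

    def compose(*funcs):
        # apply the functions left to right: compose(f, g)(x) == g(f(x))
        def _composed(x):
            for f in funcs: x = f(x)
            return x
        return _composed

    def scale(x, k): return x * k

    pipeline = compose(partial(scale, k=2), lambda x: x + 1)
    print(pipeline(3))   # (3 * 2) + 1 == 7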
So I was revisiting lesson 10 and had the same thought: we have this awesome GeneralRelu, why don't we just learn all the parameters instead of predefining them? So I searched around to see if anyone on here had done it, and I couldn't find anything.
So I went ahead and implemented “LearnedRelu”, which was super easy (assuming I did it right):
import torch
import torch.nn as nn

class LearnedRelu(nn.Module):
    def __init__(self, leak=0.1, sub=0.25, maxv=100):
        super().__init__()
        # register each piece as a learnable parameter
        self.leak = nn.Parameter(torch.ones(1) * leak)
        self.sub  = nn.Parameter(torch.zeros(1) + sub)
        self.maxv = nn.Parameter(torch.ones(1) * maxv)

    def forward(self, x):
        # torch.where instead of F.leaky_relu: leaky_relu wants a python
        # float, and calling .item() on a parameter detaches it from the
        # graph, so the slope would never actually be learned
        x = torch.where(x >= 0., x, x * self.leak)
        x = x - self.sub
        # torch.min keeps the ceiling differentiable w.r.t. self.maxv
        return torch.min(x, self.maxv)
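And a quick sanity check (my own snippet) that gradients actually reach all three parameters:

    act = LearnedRelu()
    out = act(torch.randn(8, 16))
    out.sum().backward()
    # all three grads are populated; maxv's stays zero unless the ceiling binds
    print(act.leak.grad, act.sub.grad, act.maxv.grad)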