Lesson 8 (2019) discussion & wiki

Also, Swift does not have macros, and hopefully never will :)

1 Like

I spent years trying to get numerical programming working well in C++. It was an absolute nightmare. You have to deal with issues like memory aliasing, and trying to convince the compiler to actually vectorize your loop, and the complexity of expression templates and C++ meta-programming (if you want things to be reasonably concise).

There’s also no good package management, terribly long compile times, complex memory management, and so many other horrors. C++17 is certainly a lot better than the old days, but I’d still rather avoid C++ if at all possible!

9 Likes

I wish it had Julia’s meta-programming though…

2 Likes

Yeah, so not really a leaky ReLU. :smiley:
I guess I was trying to ask if leaky ReLUs do better than regular ReLUs because of the normalization thing discussed.

I think the fact that they have a non-zero gradient everywhere is the more important issue.
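For example (a quick PyTorch check, with made-up values and the default slope): the negative side of a leaky ReLU keeps a small non-zero slope, so the gradient never dies completely.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-2., 3.], requires_grad=True)
F.leaky_relu(x, negative_slope=0.01).sum().backward()
print(x.grad)  # tensor([0.0100, 1.0000]) -- small but non-zero gradient for the negative input
```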

2 Likes

Hi All,

I have a question about the part where Jeremy discusses matrix multiplication with broadcasting (the cell named Matmul with broadcasting).
At the point where he mentions the shape of the expression

a[i,:].unsqueeze(-1)

he says the shape is [ar, 1]. I think the shape should instead be [ac, 1].
Could anyone please verify? Or am I missing something here?

Regards

1 Like

I think you are right!
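A quick way to check (a small sketch with made-up sizes, assuming a is [ar, ac] and b is [ac, bc] as in that cell):

```python
import torch

ar, ac, bc = 5, 3, 4                      # made-up sizes: a is [ar, ac], b is [ac, bc]
a, b = torch.randn(ar, ac), torch.randn(ac, bc)
i = 0

print(a[i, :].shape)                              # torch.Size([3])    -> [ac]
print(a[i, :].unsqueeze(-1).shape)                # torch.Size([3, 1]) -> [ac, 1], not [ar, 1]
print((a[i, :].unsqueeze(-1) * b).sum(0).shape)   # torch.Size([4])    -> one row of the output
```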

1 Like

The edited lesson video doesn’t seem to be working for me. Anyone else have problems with it?

Hi Jeremy!
The link for the edited video isn’t working. It’s not showing anything but a blank screen. It might be a temporary issue, but if it’s not, please update the link.

It’ll take ~20 mins to process.

2 Likes

Oh Thanks…!

About a=math.sqrt(5): it seems to be due to a refactor of the initialisation code… this can shed some light.
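For reference, this is just my reading of the refactored init, so treat it as a sketch rather than the official explanation: nn.Linear initialises its weight with kaiming_uniform_ and a=math.sqrt(5), and since the leaky_relu gain is sqrt(2/(1+a^2)), that works out to the same 1/sqrt(fan_in) bound as the old default uniform init.

```python
import math
import torch
from torch import nn

# Sketch of the weight init nn.Linear appears to use after the refactor.
w = torch.empty(50, 784)                     # fan_in = 784
nn.init.kaiming_uniform_(w, a=math.sqrt(5))  # gain = sqrt(2 / (1 + 5)) = sqrt(1/3)

bound = 1 / math.sqrt(784)                   # gain * sqrt(3 / fan_in) simplifies to 1/sqrt(fan_in)
print(w.std().item(), bound / math.sqrt(3))  # std of U(-bound, bound) is bound / sqrt(3)
```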

1 Like

Turns out to be more complex than that - I’ve gotten to the bottom of it now, and will be documenting what I found for next week.

2 Likes

Besides the link with the rationale for Swift (@alexandrecc, you can read there why C++ didn’t make it):

there’s a second document, https://github.com/tensorflow/swift/blob/master/docs/GraphProgramExtraction.md, describing the programming model:

Our user model fits in a single paragraph: you write normal imperative Swift code against a normal Tensor API. You can use (or build) arbitrary high level abstractions without a performance hit, so long as you stick with the static side of Swift: tuples, structs, functions, non-escaping closures, generics, and the like. If you intermix tensor operations with host code, the compiler will generate copies back and forth. Likewise, you’re welcome to use classes, existentials, and other dynamic language features but they will cause copies to/from the host. When an implicit copy of tensor data happens, the compiler will remind you about it with a compiler warning.

So the goal was to pick a language that has an expressive subset amenable to static analysis.

1 Like

Yes, my first message was from when Jeremy showed us your matrix calculus paper.

Bookish seems like a great tool. I think I’m going to use it, so thanks for writing it!

Btw, I don’t know if you saw my tweet about it, but at the beginning of that matrix calculus article you mention the ‘Theory’ category of the fastai forums, which doesn’t seem to exist anymore… Maybe you could remove that mention?

Thanks! You’re right, it is more complex than that. I just found this paper; maybe it can help:

Dying ReLU and Initialization: Theory and Numerical Examples

Sounds like maybe we need to resurrect the theory forum! Easier than changing the paper, I would say.

And the bookish repo is really not ready for external users, as I’m focused on making it work for my own stuff, but feel free to check out how I did it.

1 Like

No one was using it, so it got merged into #deep-learning.

1 Like

__call__ is the method that lets you call an object directly, passing parameters to it, so yes, it works like forward. If you are asking why we don’t just use forward: that is handled by the Module class, whose __call__ invokes forward (and stores the inputs and output so backward can use them later). All we need to do is subclass Module for our custom layers and models; there we implement forward, and there is no need to define __call__ ourselves. Hope this clears up the confusion.
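To make that concrete, here is a minimal sketch of that kind of Module base class (not necessarily exactly the lesson’s code): __call__ runs forward and stashes the inputs/output so backward can use them later.

```python
import torch

class Module():
    def __call__(self, *args):
        self.args = args                  # remember the inputs for backward
        self.out = self.forward(*args)    # __call__ dispatches to forward
        return self.out
    def forward(self, *args): raise NotImplementedError
    def backward(self): self.bwd(self.out, *self.args)

class Relu(Module):
    def forward(self, inp): return inp.clamp_min(0.)
    def bwd(self, out, inp): inp.g = (inp > 0).float() * out.g

x = torch.randn(4, 3)
relu = Relu()
out = relu(x)                 # calling the object runs forward via __call__
out.g = torch.ones_like(out)  # pretend upstream gradient
relu.backward()               # uses the stored inputs and output
print(x.g.shape)
```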

2 Likes

It exploits BLAS, I guess.
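If this is about why the built-in matmul is so much faster, a rough timing comparison (sizes are made up) shows the gap; a @ b dispatches to optimised BLAS/ATen kernels rather than running a Python loop:

```python
import timeit
import torch

a, b = torch.randn(64, 784), torch.randn(784, 10)

def matmul_broadcast(a, b):
    # broadcast version in the style of the lesson, for comparison
    c = torch.zeros(a.shape[0], b.shape[1])
    for i in range(a.shape[0]):
        c[i] = (a[i, :].unsqueeze(-1) * b).sum(0)
    return c

print(timeit.timeit(lambda: matmul_broadcast(a, b), number=100))
print(timeit.timeit(lambda: a @ b, number=100))  # orders of magnitude faster
```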