Lesson 4- Neural Networks Question

ctwardy · March 22, 2021, 9:34pm

My answer didn’t quite work for you, so next take a look at similar discussion from other writers first to see if they help:

ReLU unreasonably effective! I think this one is closest to your question. Apparently the best answer was “Handout 3B.” Good band name.
Why can’t ReLU [over]fit my sine wave? Has an insightful example/discussion.
Why ReLU again?
Are vanishing gradients good? Or… why did this work at all?
Google says Swish beats ReLU. (2017, so may not have held up?? But interesting.)

Let us know where you are after looking at a couple of those?