I have a question: why don’t we have a Pclass_3 column in the Excel table at 1:09:40 of the video?
There was a similar (or same? ig) question asked during the lecture.
pclass can take three values, i.e., 1, 2, and 3.
pclass_1’s value is either 0 or 1: 1 indicates that pclass’s value is 1, and 0 indicates that it is not.
pclass_2’s value is, again, either 0 or 1: 1 indicates that pclass’s value is 2, and 0 indicates that it is not.
For example, say pclass has a value of 3. This would mean that pclass_1 = 0 and pclass_2 = 0, which implies that a variable pclass_3, should we choose to define it, would have the value pclass_3 = 1. But as you might have noticed in this example, the values of pclass_1 and pclass_2 together are enough to tell that the value of pclass is 3. A new variable, pclass_3, would be redundant for the task at hand, and the model does just fine without it.
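To make the redundancy concrete, here is a minimal sketch using pandas. The five-row DataFrame is made-up illustration data, not the Titanic set, and the capitalised column names just follow the spreadsheet’s Pclass_N naming. It builds all three indicator columns, drops Pclass_3, and then recovers it from the other two.

```python
import pandas as pd

# Made-up passenger classes for illustration only (not the real Titanic rows).
df = pd.DataFrame({"Pclass": [1, 2, 3, 3, 1]})

# One indicator column per class: Pclass_1, Pclass_2, Pclass_3.
dummies = pd.get_dummies(df["Pclass"], prefix="Pclass", dtype=int)

# Keep only the first two indicators, as in the spreadsheet.
reduced = dummies.drop(columns="Pclass_3")

# Pclass_3 is fully determined by the other two: it is 1 exactly when both are 0.
recovered = 1 - reduced["Pclass_1"] - reduced["Pclass_2"]

print(recovered.tolist() == dummies["Pclass_3"].tolist())
```

Since the dropped column can always be rebuilt this way, it carries no extra information for the model.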
You can get the full set like this:
image_path = untar_data(URLs.MNIST)
I ended up figuring this out. Honestly, I think this question for chapter 4 of the book is premature and not very helpful, as chapter 5 basically answers it and shows a more efficient way of formatting this data. I ended up taking a small PyTorch tutorial to figure out what I was doing wrong!
Do you mind sharing how you formatted your data (a Colab or GitHub link would be fine)? I found it to be pretty straightforward because I first visualized the input shapes and output shapes for the matrix multiplication operations (different layers).
Here’s how I did it: AsquirousSpeaks - Classifying handwritten digits (THE MNIST!)
Also, please share your implementation for all the 10 digits as I’d like to see the implementations of SGD for the full dataset and if it differs from what’s there in chapter 4 (I personally didn’t do the whole SGD thing and used fastai methods).
Got it, thank you.
The “Universal Approximation Theorem” states that a neural network with 1 hidden layer can approximate any function.
If we use ReLU, are we not using only functions with positive slopes?
Probably I am misunderstanding some concepts, but I couldn’t figure it out by myself.
There was a similar question asked in the discord server here. I hope that answers it
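For what it’s worth, here is a small numerical check (a sketch of my own, not taken from the linked Discord answer). ReLU on its own never slopes downward, but the weights applied before and after it can be negative, so a network with one hidden layer can still produce decreasing pieces. The relu helper and the sample points are illustrative assumptions.

```python
import numpy as np

def relu(x):
    """Elementwise max(x, 0)."""
    return np.maximum(x, 0.0)

x = np.linspace(-2.0, 2.0, 9)

# A negative weight on the input flips the ramp: slope -1 for x < 0, flat after.
left_ramp = relu(-1.0 * x)

# A negative weight on the output turns the ramp downward: slope -1 for x > 0.
down_ramp = -1.0 * relu(x)

# Two hidden units combined give |x|, which has both a negative and a positive slope.
v_shape = relu(x) + relu(-x)

print(np.allclose(v_shape, np.abs(x)))
```

Adding more such units with different weights and biases gives more kinks, which is how the piecewise-linear approximation is built up.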
Hi everyone! I just completed training a model for the full MNIST dataset, adapting the content from chapter 4 in the book to work for multiple digits. Here’s my blog post about it Fast.ai Chapter 4: Full MNIST Challenge | by Jack Driscoll | Nov, 2023 | Medium
Feedback is welcome!
I’ve been doing a recap+quiz blogpost for the lessons.
Here’s lesson3: Giant Morons 🧠 - FastAI Lesson 3
It features a tenacious animal, brought to you by dall-e, which I generated to inspire me. Read on to find out which animal!
My plan is to feature a new tenacious animal for every lesson going forward, so that’ll be your enticement for reading future posts (if my prose doesn’t do it).
Hi, I would like some help for clarity. To better follow the book, I was converting the implicit use of parameters to explicit. It stopped working when I started using the PyTorch Linear model, as I assume it relies on certain variables implicitly. What would be the best resource to learn which variables are expected implicitly, e.g. lr and so on?
In my case, Linear.forward() will implicitly take parameters and learning data to perform the equivalent of the following, but it is not clear to me how it does that.
def linear1(xb): return xb@weights + bias
or
def linear1_explicit(xb, weights=weights, bias=bias): return xb@weights + bias
Thank you in advance!
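A minimal sketch that may help here, assuming plain PyTorch: nn.Linear keeps its parameters as attributes on the module (weight and bias), and its forward pass reads them from there rather than from global variables. The layer sizes and the random batch below are arbitrary illustration values. Note that nn.Linear stores weight with shape (out_features, in_features), hence the transpose. A learning rate such as lr is not part of the layer at all; it belongs to whatever performs the update step.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Arbitrary sizes for illustration: 4 input features, 1 output.
layer = nn.Linear(in_features=4, out_features=1)
xb = torch.randn(3, 4)

# The parameters live on the module itself.
param_names = [name for name, _ in layer.named_parameters()]

def linear1_explicit(xb, weights, bias):
    # nn.Linear stores weight as (out_features, in_features), so transpose it.
    return xb @ weights.T + bias

out_implicit = layer(xb)  # calls layer.forward(xb) under the hood
out_explicit = linear1_explicit(xb, layer.weight, layer.bias)

print(param_names, torch.allclose(out_implicit, out_explicit))
```

The PyTorch documentation page for nn.Linear lists these attributes and their shapes.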
Hi everyone! I loved lesson 3 and in particular the idea of using a simple spreadsheet to demonstrate that the core technique behind deep learning (gradient descent) is not “rocket science”… even if using a potentially complex solver as a black box to optimize the params seems to be a bit of a cheat ;).
As an exercise, I rewrote the gradient descent solution for Titanic survival predictions as a Kaggle notebook. (It’s also my first Kaggle notebook; thanks to the fastai course I’m discovering a lot of cool tech for the first time.) Feedback welcome!
Thank you for this amazing course to Jeremy and to everyone in the community, you rock!
Thank you! Signed up here just to post about the same thing. I was re-creating the quadratic example from scratch, and found that when I graphed the loss function it plotted out what looked like a sine wave. Asked ChatGPT, gave it my code, and it said that I appeared to be missing the code to zero out the gradient within the loop.
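Here is a small sketch of that effect, assuming PyTorch and a made-up one-parameter quadratic loss (w - 2)^2 rather than the quadratic from the lesson. The function name fit and its arguments are hypothetical. PyTorch adds each new gradient onto .grad, so skipping the zeroing step makes every update use the running sum of all past gradients.

```python
import torch

def fit(zero_grads, steps=50, lr=0.1):
    """Gradient descent on (w - 2)^2, optionally zeroing w.grad each step."""
    w = torch.tensor(5.0, requires_grad=True)
    losses = []
    for _ in range(steps):
        loss = (w - 2.0) ** 2
        loss.backward()              # adds the new gradient onto w.grad
        with torch.no_grad():
            w -= lr * w.grad         # update step
        if zero_grads:
            w.grad.zero_()           # reset so the next backward() starts fresh
        losses.append(loss.item())
    return losses

with_zeroing = fit(zero_grads=True)
without_zeroing = fit(zero_grads=False)

print(with_zeroing[-1], max(without_zeroing[-15:]))
```

With zeroing, the loss shrinks steadily towards 0. Without it, the accumulated gradient keeps overshooting the minimum, and the loss rises and falls instead of settling, which is consistent with the wave-like plot described above.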
In the Titanic example, Lin2 and ReLU2 seem to be using the raw data as input against the weights. However, earlier in the course, it is mentioned that, for each layer, we’re
“using the outputs of the previous layer as the inputs to the next layer”
Shouldn’t Lin1 or ReLU1 be passed as input to ReLU2? Or where does this “input to the next layer” process happen?
Or perhaps it’s a “one layer” neural network?
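One possible reading, offered as an assumption about the spreadsheet rather than a confirmed description: Lin1/ReLU1 and Lin2/ReLU2 are two units sitting side by side in the same hidden layer, each fed the raw inputs, and the “next layer” is the step that adds their outputs into a single prediction. The sketch below uses random stand-in data and weights (not the Titanic values) to show that the two views give the same numbers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data: 5 hypothetical passengers with 3 features each.
x = rng.random((5, 3))
w1 = rng.normal(size=3)   # weights for the first unit (Lin1)
w2 = rng.normal(size=3)   # weights for the second unit (Lin2)

# Column-by-column view: both units read the raw inputs.
relu1 = np.maximum(x @ w1, 0)
relu2 = np.maximum(x @ w2, 0)
pred_columns = relu1 + relu2

# Layer view: one hidden layer with two units, then an output layer that sums them.
W_hidden = np.stack([w1, w2], axis=1)      # shape (3, 2)
hidden = np.maximum(x @ W_hidden, 0)       # outputs of the hidden layer
pred_layers = hidden @ np.ones(2)          # next layer takes those outputs as inputs

print(np.allclose(pred_columns, pred_layers))
```

Under this reading, the hidden layer’s outputs do feed the next layer; that next layer is simply a sum with fixed weights of 1.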
Has anyone had success actually using the model to predict whether an image is a 3 or a 7? I tried learn.predict, similarly to the bear classifier, but I’m running into errors. Here is my simple test:
x,y = first(dataloader)
print(learn.summary())
print("X SHAPE",x.shape)
learn.predict(x)
Here is the output
256 x 30
Linear 23550 True
256 x 1
Linear 31 True
X SHAPE torch.Size([256, 784])
And then I’m getting the error
‘list’ object has no attribute ‘decode_batch’
My x and the model’s input are the same shape. Does anyone know what is causing this?
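One possible workaround, offered as a sketch rather than a diagnosis of the error: call the model directly on the batch instead of going through learn.predict. The parameter counts in the summary above (23550 and 31) match a 784-30-1 network, so the sketch below rebuilds one with untrained weights and a random batch as stand-ins. The sigmoid and the 0.5 threshold are assumptions about how the 3-vs-7 labels were set up.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Stand-in for the trained model: same layer sizes as in the summary, untrained weights.
model = nn.Sequential(nn.Linear(28 * 28, 30), nn.ReLU(), nn.Linear(30, 1))
param_counts = [sum(p.numel() for p in layer.parameters()) for layer in model]

# Stand-in for one batch of flattened images, same shape as x in the post.
x = torch.rand(256, 28 * 28)

model.eval()
with torch.no_grad():
    logits = model(x)                  # raw outputs, shape (256, 1)
    probs = torch.sigmoid(logits)      # squash to (0, 1); assumed label encoding
    is_positive = probs > 0.5          # True for the class encoded as 1

print(param_counts, logits.shape)
```

With a fastai Learner, the same idea would be to use learn.model(x) on the batch from the DataLoader.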