Fastbook Chapter 1 questionnaire solutions (wiki)

I’m making an Anki deck of answers to these questions, as well as definitions, terms, and key ideas as I come across them in the course. I could make it public if that would be helpful, but will wait to receive permission to do so. :slight_smile:

3 Likes

I think it is fine to make them available on these forums, but not publicly. They can be made public once the course is fully released in July.

1 Like

I have answered all the questions in this questionnaire! :slight_smile:

Please let me know if there are any errors. Or feel free to add more info to the solutions!

3 Likes

About point 16: what about saying that you need metrics to quantitatively measure performance, and moving the loss function to the part about updating the parameters, together with the optimizer? I think it is more relevant there (and more consistent with point 24).

I think that is the key statement here. I don’t want to edit anything without approval. Thanks.
The same goes for what follows:

About point 18: it seems you are stating that the only limits are memory and processing power, suggesting that one can ramp up the training image size more or less indefinitely. Kernel/receptive-field size is not mentioned, nor is the pretraining image size or other related factors.
I would have something to say about the loss of generalization power when a CNN sees features at scales very different from those of pretraining (relative to the kernel size), but I’m still investigating. Let’s just say it could be safer to advise not straying too far from the pretraining size.
Or maybe it could be mentioned in point 25?

In point 20, hyperparameters are mentioned before being defined in point 32. Furthermore, it might be nice to explicitly state how they differ from parameters, even though parameters have already been defined.

What do you think?

Thanks for your feedback. I just want to point out that my responses are based on what is supported by the chapter text. While you make some good points, these are not covered in the chapter text:

Based on my understanding, it is true that a metric is required to evaluate a model, but it is not needed to train the model. Typically, the metric is not used in any way during training; it is used to evaluate which model to select for final use. Anyway, that’s my understanding, so correct me if I’m wrong :slight_smile:
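To make the distinction concrete, here is roughly the chapter 1 training code (a sketch; the dataset download and exact arguments are as I recall them from the notebook). `metrics=error_rate` only controls what gets printed for the validation set after each epoch; the loss function, which fastai picks automatically from the data, is what the optimizer actually differentiates:

```python
from fastai.vision.all import *

# Oxford-IIIT Pet dataset, as used in chapter 1
path = untar_data(URLs.PETS)/'images'

def is_cat(x): return x[0].isupper()  # cat images have uppercase filenames

dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))

# error_rate is a metric: computed on the validation set and reported,
# but never differentiated or used to update the weights.
learn = cnn_learner(dls, resnet34, metrics=error_rate)

# The loss function was chosen automatically from the DataLoaders;
# it is what drives the gradient updates during training.
print(learn.loss_func)
learn.fine_tune(1)
```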

You make some good points here. Personally, I am not too sure about this, mainly because I am not sure how the adaptive pooling layer (which is what allows the classification model to handle variable image sizes) would affect these factors. I think these factors definitely matter more for pretrained models, which have learned a particular pixel scale, but it’s possible that during training the model adjusts to the different scale of the training images?

My answer for that question is based on the text in the chapter:

Why 224 pixels? This is the standard size for historical reasons (old pretrained models require this size exactly), but you can pass pretty much anything. If you increase the size, you’ll often get a model with better results (since it will be able to focus on more details) but at the price of speed and memory consumption; or vice versa if you decrease the size.
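In code terms, that trade-off is just the value passed to `Resize` in the data loading call (a sketch reusing the `path` and `is_cat` names from the chapter 1 snippet above; the specific sizes are only examples):

```python
# Larger images: often better accuracy, but slower and more GPU memory.
dls_big = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(448))

# Smaller images: faster and lighter, usually at some cost in accuracy.
dls_small = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(128))
```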

Thanks for pointing this out. I will add a note, “see point 32”, where hyperparameters are first referred to. The assumption is that the reader has already read the chapter and is now answering the questionnaire, so the reader should already know what hyperparameters are. Nevertheless, it is still helpful to point to the definition.

Thanks for reading my solutions and giving feedback!

1 Like

Here’s the Anki deck I’ve made so far for Lesson 1. I included the historical questions that seemed like trivia a DL practitioner might be assumed to know (like the question about the Mark I Perceptron), but not the questions like what the eight requirements for Parallel Distributed Processing are. I’ve also included cards for questions I had while reading Chapter 1 (what is a feed-forward neural network?), and the shortcuts for Jupyter notebooks.

7 Likes

@go_go_gadget that would be a great link to add to the lesson 1 wiki thread, if you’re open to doing that :slight_smile:

1 Like

It was a pleasure to read :slight_smile:

1 Like

Not at all. I think I misconstrued your answer to that question. My previous observation was merely motivated by the fact that the whole optimization process (one iteration thereof) starts by taking the output of the loss function and calculating its gradient. Metrics, as a means of evaluating the model’s performance during training, are there for the human practitioner (“should I stop now or go on?”).

Will do!

1 Like

Re: #16

  1. What do you need in order to train a model?

You will need an architecture for the given problem. You will need data to input to your model. You will need labels for your data to compare your model predictions to. You will need a loss function that will quantitatively measure the performance of your model. And you need a way to update the parameters of the model in order to improve its performance (this is known as an optimizer).
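As a rough illustration of how those five pieces fit together, here is a minimal PyTorch-style training loop (a sketch with placeholder tensors; the model, data, and names are made up for illustration and are not from the book):

```python
import torch
from torch import nn

# Architecture: a small model for the problem at hand (here, 10-class classification).
model = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 10))

# Data and labels: placeholder tensors standing in for real inputs and targets.
xs = torch.randn(256, 784)
ys = torch.randint(0, 10, (256,))

# Loss function: quantitatively measures how wrong the predictions are.
loss_func = nn.CrossEntropyLoss()

# Optimizer: the way the parameters get updated to improve performance.
opt = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(5):
    preds = model(xs)             # forward pass through the architecture
    loss = loss_func(preds, ys)   # compare predictions to the labels
    loss.backward()               # gradient of the loss w.r.t. the parameters
    opt.step()                    # optimizer updates the parameters
    opt.zero_grad()               # reset gradients for the next iteration
```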

Would it be appropriate to qualify that labels are needed specifically for supervised learning, as opposed to all DL models?

2 Likes

Yes, this is true. Since the distinction between supervised, semi-supervised, and unsupervised learning is not made clear in this first chapter, I will just say “for most cases”.

2 Likes

In reference to your point about #16: from my understanding, it makes sense to separate the loss function and the optimizer. Before you can optimize and update the weights, you first need to know how good or bad the current weights are. Therefore, it’s important to recognize that we need a loss function in order to train a model. Point 24 is just following up and making sure we know what a loss function is.

I was wondering if it’s OK to create blog posts answering the questions in the book?

I don’t see why not!

Don’t you think the answer to question 13 should be this:

2 Likes

Yes I do :slightly_smiling_face:

1 Like

:smiley:

Where do we set the “loss” criterion for the actual optimization method? Is it hardcoded in the “architecture”? In the code of chapter 1, only the metric is set…
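For what it’s worth, here is a sketch of how this works in fastai v2 as I understand it (treat the details as an assumption and check the docs): the loss is not hardcoded in the architecture; the Learner infers a default loss function from the DataLoaders, and you can inspect it or pass your own:

```python
from fastai.vision.all import *

# `dls` is the DataLoaders built in the chapter 1 code.
learn = cnn_learner(dls, resnet34, metrics=error_rate)

# The loss was inferred from the data, not baked into the architecture.
print(learn.loss_func)   # typically a flattened cross-entropy for classification

# It can also be set explicitly:
learn = cnn_learner(dls, resnet34,
                    loss_func=CrossEntropyLossFlat(),
                    metrics=error_rate)
```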

Hi,
I created this article from Lesson 1. It has the answers to the first set of questions. Thanks for the awesome course:

1 Like