- Welcome to lesson 2
- Reminder: the important forum topics are pinned, including an FAQ with important information and important links; follow those and use the official discussion threads
- For each lesson there will be an official updates thread; these are wikified so anyone can edit them
- watch threads => get notifications
- Overwhelming: some threads have over 1k replies; click "Summarize This Topic" to see only the most-liked replies. That is why it is important to click the like button
- Steps for returning to work: 1. get the latest notebooks, 2. get the latest Python libraries
- Share your work here: examples of what people have been doing
- Classifying sounds, turned into images, in a paper
- SOTA on DHCD (Devanagari handwritten characters) text recognition, confirmed
- Detecting metastasising cancer from point mutations by turning them into pics and using the lesson 1 approach; uses a lot of domain expertise
- Cougar or not
- Simon Willison, co-creator of Django …
- Inspiring blog post: a bird classifier; he nearly didn't start, but you can do it without being able to read the Greek letters (the math)
- I want to contribute to the library
- Cool classifiers: hairy dogs, anime hair color, buses
- Which country a satellite image is of
- Batik cloth
- Recognize incomplete buildings
- Next steps: vision, NLP, tabular data, collaborative filtering, embeddings, more vision, more NLP. It is better for learning to see things multiple times
- If you are stuck, keep going: experiment and push on. Many people watch videos 1-7 three times
- Learning theory: the whole game
- Lesson 2 download: how to create your own classifier with your own images
- Teddy bear detector
- Getting images from Google Image Search
- 4. Create `ImageDataBunch`, which also creates the validation set with a fixed random seed, so it is stable: you always get the same validation set, which lets you know whether your tweaks are improving things
- `show_batch`: some images are tricky, some are not even photos
- `create_cnn` -> train. On the learning rate finder: pick the strongest downward slope that is sticking around for a bit
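The fixed-seed validation split mentioned above can be sketched in plain Python. fastai handles this internally; `split_train_valid` here is a hypothetical illustration of why pinning the seed makes the validation set stable across runs:

```python
import random

def split_train_valid(items, valid_pct=0.2, seed=42):
    """Shuffle with a fixed seed so the validation set is identical run to run."""
    items = list(items)
    rng = random.Random(seed)   # local RNG; the seed pins the shuffle order
    rng.shuffle(items)
    n_valid = int(len(items) * valid_pct)
    return items[n_valid:], items[:n_valid]  # train, valid

files = [f"img_{i}.jpg" for i in range(10)]
train1, valid1 = split_train_valid(files)
train2, valid2 = split_train_valid(files)
assert valid1 == valid2  # same seed: same validation set, so tweaks are comparable
```

Without the fixed seed, every rerun would measure your model on a different validation set, and you could not tell whether a change in error rate came from your tweak or from the split.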
- Most parameters don't matter that much in detail
- Interpretation
- Getting a less noisy dataset by combining a human expert with the computer learner
- File deleter widget -> `interp.top_losses()`
- After having cleaned, you can retrain; the model can handle noisy data if the noise is random
- Putting the model into production
- You will run on a CPU in production
- The classes, in order, are needed at inference time
- Example: teddy bear detector… `single_from_classes`, passing the same info we trained with. Do this once when the server starts, then call `learn.predict`
- Code in production: Starlette
- Free hosting: Python Anywhere, plus a Docker-based alternative
- Cleaning with `FileDeleter` and `top_loss_paths`
- Creating applications in notebooks; find the source code with `??`, and it looks normal to GUI programmers
- ipywidgets
- Building a production webapp
- Code snippet of Production webapp
- What happens when you have problems and how to fix them; examples: learning rate and number of epochs, trying different values
- Learning rate too high -> validation loss gets really high
- Learning rate too low -> Error rate gets better really slowly, and training loss will be higher than validation loss
- Too few epochs
- Too many epochs => overfitting, though it is hard to overfit with deep learning
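The learning-rate symptoms above can be reproduced on a toy problem. A minimal sketch, minimizing (a - 3)^2 by gradient descent; the function and the learning-rate values are illustrative, not from the lesson:

```python
def descend(lr, steps=50, a=0.0, target=3.0):
    """Minimize (a - target)**2 by gradient descent; return the final loss."""
    for _ in range(steps):
        grad = 2 * (a - target)   # d/da of (a - target)**2
        a -= lr * grad
    return (a - target) ** 2

good = descend(lr=0.1)       # converges quickly
too_low = descend(lr=0.001)  # error shrinks very slowly, barely moves
too_high = descend(lr=1.1)   # each step overshoots further; loss explodes
assert good < too_low < too_high
```

With `lr=1.1` every update multiplies the distance to the minimum by 1.2, so the loss blows up, which is exactly the "validation loss gets really high" symptom; with `lr=0.001` fifty steps still leave most of the error in place.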
- `argmax`: testing it by stepping through the `learn.predict()` source code
- q: definition of how error rate is calculated; look it up with `doc()`
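As a reminder of what `argmax` does: prediction picks the class whose activation is largest. A plain-Python sketch, where the class names and activation values are made up for illustration:

```python
def argmax(xs):
    """Index of the largest value: how a vector of activations becomes a class."""
    best = 0
    for i, x in enumerate(xs):
        if x > xs[best]:
            best = i
    return best

classes = ['black', 'grizzly', 'teddy']       # hypothetical class list, in order
activations = [0.1, 0.2, 3.4]                 # hypothetical model outputs
assert classes[argmax(activations)] == 'teddy'
```

This is also why the class order matters in production: `argmax` only gives an index, and the class list maps that index back to a label.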
- q: why 3e-3 for the learning rate: it is a good default for initial fine-tuning, e.g. `learn.fit_one_cycle(4, 3e-3)` … then unfreeze and train more, taking what I did last time divided by 10 and a value from the learning rate finder
- More math, animated gif of numbers
- How do we create one of these functions?
- y = ax + b, the equation of a line
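The line as code, a tiny sketch of the simplest parameterized function we will fit:

```python
def line(a, b, x):
    """y = a*x + b: slope a, intercept b."""
    return a * x + b

xs = [0, 1, 2, 3]
ys = [line(2, 1, x) for x in xs]   # slope a=2, intercept b=1
assert ys == [1, 3, 5, 7]
```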
- Linear algebra
- Matrix multiplication animation
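What the animation shows can be written out in plain Python as a sketch of the definition; real code would just use `x@a` and let PyTorch do it without loops:

```python
def matmul(A, B):
    """Plain-Python matrix multiply: C[i][j] = sum over p of A[i][p] * B[p][j]."""
    n, k, m = len(A), len(B), len(B[0])
    assert len(A[0]) == k, "inner dimensions must match"
    return [[sum(A[i][p] * B[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]

A = [[1, 2],
     [3, 4]]
B = [[5, 6],
     [7, 8]]
assert matmul(A, B) == [[19, 22], [43, 50]]
```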
- q: when generating an image dataset, how many images do you need?
- q: what do you do if you have unbalanced classes, e.g. few teddies and many grizzlies?
- q: after unfreezing, if training loss is lower than validation loss, do you keep training unfrozen or redo everything?
- q: showing a code sample, `create_cnn` with resnet34: is it a copy of the model? pretrained?
- Why do we do all this? => PyTorch: no loops, a single line of code
- SGD
- Lesson2-sgd notebook
- `x@a` = matrix product
- Tensor: an array of a regular, structured shape, e.g. `torch.ones()`
- Indexing into it: `[:,0]` means every single row of column zero
- Functions ending in `_` don't return a new tensor but replace the contents in place
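The `[:,0]` slice has a plain-Python equivalent for nested lists, which may help build intuition; real code would index a `torch.Tensor` directly:

```python
# A 3x2 "tensor" as nested lists; in the notebook this would be a torch.Tensor.
m = [[1.0, 10.0],
     [2.0, 20.0],
     [3.0, 30.0]]

col0 = [row[0] for row in m]   # plain-Python equivalent of m[:, 0]
col1 = [row[1] for row in m]   # equivalent of m[:, 1]
assert col0 == [1.0, 2.0, 3.0]
assert col1 == [10.0, 20.0, 30.0]
```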
- Scatterplot with `plt.scatter()` in matplotlib. How would we draw a line that fits this data?
- Digesting SGD: people have the hardest time with it because we can't conceptualize a function with 50 million numbers
- Find PyTorch parameters so that the line minimizes the error between the line and the points
- Regression, loss, mse, mean squared error
- Experiment with code easier than with math
- Coming up with the line, we have to guess
- Calculating predictions and MSE with our random numbers, we get a loss
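The guess-then-measure step can be sketched in plain Python; the sample data and the a=0.5 guess are made up for illustration:

```python
def mse(preds, targets):
    """Mean squared error: the loss for the line-fitting problem."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]            # data generated by the true line y = 2x + 1
guess = [0.5 * x + 0.0 for x in xs]  # a random first guess: a=0.5, b=0
# The guess has a higher loss than the true line (whose loss is exactly 0).
assert mse(guess, ys) > mse([2 * x + 1 for x in xs], ys)
```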
- Gradient Descent and scatterplots, changing the numbers a little bit
- You don't have to move it by hand; you can calculate the derivative
- Calculating the gradient: `backward()` computes it, `.grad` holds it, and `update()` applies it
- Learning Rate
- Running update lots of times
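The full loop (predict, compute gradients, update, repeat) can be sketched without PyTorch by deriving the MSE gradients by hand; the notebook itself uses `loss.backward()` to get them automatically, and `fit_line` is a hypothetical name:

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y = a*x + b by gradient descent on MSE, with hand-derived gradients."""
    a, b = 0.0, 0.0   # starting guess for the parameters
    n = len(xs)
    for _ in range(steps):
        preds = [a * x + b for x in xs]
        # d(MSE)/da = 2/n * sum((pred - y) * x);  d(MSE)/db = 2/n * sum(pred - y)
        grad_a = 2 / n * sum((p - y) * x for p, y, x in zip(preds, ys, xs))
        grad_b = 2 / n * sum(p - y for p, y in zip(preds, ys))
        a -= lr * grad_a   # the update() step: walk against the gradient
        b -= lr * grad_b
    return a, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]   # data from the true line a=2, b=1
a, b = fit_line(xs, ys)
assert abs(a - 2) < 0.01 and abs(b - 1) < 0.01
```

Running the update many times is all "training" is at this scale; the deep-learning version is the same loop with 50 million parameters instead of two.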
- Animation with matplotlib
- Try running the SGD with high and low learning rates
- Minibatches
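Minibatching can be sketched as: shuffle the data once per epoch, then take successive slices. This is a stdlib illustration of the idea; in practice fastai's data loading does this for you:

```python
import random

def minibatches(items, batch_size, seed=0):
    """Shuffle once, then yield successive batches of batch_size items."""
    items = list(items)
    random.Random(seed).shuffle(items)
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

data = list(range(10))
batches = list(minibatches(data, batch_size=4))
assert len(batches) == 3                     # batch sizes 4, 4, 2
assert sorted(sum(batches, [])) == data      # every item appears exactly once
```

Each gradient update then uses one batch rather than the whole dataset, which is what puts the "stochastic" in stochastic gradient descent.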
- Vocab
- Recap of math
- Closing with idea of underfitting /overfitting
- Regularisation
- How and why to create a good validation set