@JanM I think you will find the next meeting in this thread too. It’ll probably be on Saturday the 21st. The meeting might be moved to 2 PM GMT. But if you come around a day earlier, all the info here should be up to date.
Hi Jan,
I updated the thread; the next meetup is this coming Saturday at 2 PM GMT and you are very welcome to join! There is also a Slack channel here.
Ho ho ho there is a meetup today at 2PM GMT!
Meeting Minutes 21-12-2019
- New Participants
- IMDB Movie Review Walkthrough by @msivanes
- MoviePoster MultiLabel Classification project by @Jan
- Tabular Data Walkthrough by @gagan
Notes
- Use a smaller sample set before diving into the full dataset
- Make sure you store the `vocab` and `fine_tuned_enc` when creating your language model.
- To go to the source code: `learn.recorder.plot_losses??`
Questions
- How to determine the layers when creating a `TabularLearner`?
Resources
I wanna join the meetup
@lesscomfortable I’m a little confused about how fitting works. Suppose I train the model for 3 epochs first and then for 2 more epochs. Is that equal to running 5 epochs in one go?
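One way to build intuition here: with plain SGD, a fixed learning rate, and nothing reset between calls, 3 epochs followed by 2 more traces exactly the same trajectory as 5 epochs in one go; with `fit_one_cycle`, however, the learning-rate schedule restarts on each call, so the two runs differ. A toy pure-Python sketch of the first case (the quadratic loss and step size are made up for illustration):

```python
# Toy gradient descent on f(w) = (w - 3)^2 with a fixed learning rate.
# Shows that 3 steps + 2 more steps equals 5 steps in one go when
# nothing (optimizer state, LR schedule) is reset between calls.

def train(w, steps, lr=0.1):
    """Run `steps` gradient-descent updates and return the new weight."""
    for _ in range(steps):
        grad = 2 * (w - 3)   # derivative of (w - 3)^2
        w = w - lr * grad
    return w

w_split = train(train(0.0, 3), 2)   # 3 epochs, then 2 more
w_whole = train(0.0, 5)             # 5 epochs in one go
print(w_split == w_whole)  # True
```

With a one-cycle schedule the picture changes, because each `fit_one_cycle` call ramps the learning rate up and back down again from scratch.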
The meetup is on now! Everyone is welcome to join!
# Meeting Minutes 04-01-2020 (Thanks @msivanes for the inputs)
## Notes
- New Participants
- Walkthrough of a notebook on Yelp Reviews by @msivanes, exploring how fine-tuning helps a language model handle out-of-vocabulary (OOV) words. Words that do not appear in the wikitext corpus and are very specific to our domain are initialized with random weights and learned as part of fine-tuning. This was based on learnings from the notebook [1] created by @pabloc. For more discussion see [2].
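The OOV handling described above can be sketched in plain Python (no fastai; `pretrained_vecs` and the example words are made up for illustration): embedding rows for words already in the pretrained vocab are copied over, while domain-specific words get small random vectors that fine-tuning then learns.

```python
import random

random.seed(42)

# Hypothetical pretrained embeddings (e.g. from wikitext pretraining).
pretrained_vecs = {"the": [0.1, 0.2], "movie": [0.3, 0.4]}
dim = 2

# Vocab of our domain corpus; "yelpy" never appears in the pretrained vocab.
domain_vocab = ["the", "movie", "yelpy"]

# Build the embedding matrix: copy pretrained rows, random-init the OOV rows.
emb = []
for word in domain_vocab:
    if word in pretrained_vecs:
        emb.append(pretrained_vecs[word])  # reuse pretrained weights
    else:
        # OOV: random init, to be learned during fine-tuning
        emb.append([random.uniform(-0.1, 0.1) for _ in range(dim)])

print(len(emb))  # one row per vocab word
```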
## Advice
- Use a smaller sample of the dataset before diving into the full dataset. This allows for faster training and quicker iteration.
## Questions
- How to override the 60,000 limit on the vocab when creating the language model?
- When we freeze a model for fine-tuning, do the layers become untrainable or the layer-groups?
## Off topic
- @gagan is trying to create a language model for Assamese (one of the low-resource languages).
- @msivanes shared Masakhane [3], a group of African researchers trying to build translation models for low-resource African languages.
## Resources
[1] Tutorial on SPAM detection using fastai ULMFiT
[2] Adding new word to ULMFit in fine-tuning step
[3] https://www.masakhane.io/
We decided to rotate the presentation of the lessons among us. As you know, our meetings are informal, so this is basically like explaining the material to friends and boosting your presentation skills in a supportive environment. For a learner, it is one of the best ways to actively engage with the material and actually learn better by explaining. So choose the lesson that you want to understand better yourself. The lesson’s recap should be short (~15-20 mins), covering the main concepts in simple language. So grab the chance and please write which lesson you would like to present. Of course, all newcomers are welcome!
I started the lessons a couple of days ago and just came across this thread! Would love to join the next session and learn from everyone. Thank you for this initiative!
Just a reminder, the meetup will be held today in ~1h
The meetup is on
There should be a reminder one day prior to the meeting!
Thanks for the feedback. If there’s an automated way to do that, let the group know.
It’s much easier for us as community members to create a personal calendar event from the information shared in the wiki. @shahnoza is volunteering their time to host this and is kind enough to provide Zoom for this study group; I try not to ask the host to do more work than needed.
Thanks for the feedback. Currently, @shahnoza does remind us in the group about the meetups. However, based on the feedback, I have now added an automated reminder to the Slack group, so we should be getting a reminder on Fridays. As @msivanes pointed out, it is also easy for members to create a personal calendar event.
Meeting Notes 11-01-2020
- Announcement about the restart of the cadence (starting Lesson 2) by @shahnoza
- Lesson 2 review by @gagan
- Classifier for pen vs pencil, followed by questions. @gagan actually timed it from data collection to inference: 23 minutes, demonstrating that fastai really is FAST AI. (@gagan++) Colab
- Conceptual framework of supervised learning (gradient, parameters, loss, model, observations, targets) by @msivanes for lesson2-sgd.
- Projects Showcase
Advice
- Stacked transfer learning: fine-tune on smaller 224px images first, then fine-tune again on the actual image size.
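That two-stage schedule can be sketched in plain Python (here `fine_tune` is a hypothetical stand-in; in practice each stage would rebuild the data at the new size and train again). The key point is that the model's weights carry over from stage to stage, so the second fine-tune starts from what the first learned:

```python
# Stacked transfer learning / progressive resizing, sketched in plain Python.
# The model's weights persist across stages; only the input size changes.

model_weights = {"trained_at": []}   # stand-in for a real model's state

def fine_tune(weights, image_size, epochs):
    """Hypothetical stand-in: fine-tune `weights` on data resized to `image_size`."""
    weights["trained_at"].append((image_size, epochs))
    return weights

model_weights = fine_tune(model_weights, 224, 4)  # stage 1: small, fast 224px images
model_weights = fine_tune(model_weights, 352, 2)  # stage 2: the actual image size
print(model_weights["trained_at"])  # [(224, 4), (352, 2)]
```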
Discussion
- Class imbalance: is it still an issue when we use transfer learning? It might matter less due to the fine-tuning. The best thing to do is to try it out, as Jeremy said.
- `num_workers`: the number of CPU cores used to speed up data loading. If you get an out-of-memory error, reduce `num_workers` or the batch size (`bs`).
Resources
- No Classification without Representation, showing that ImageNet & OpenImages data are imbalanced
@AjayStark
The top post (wiki) has all the information you need to participate in the study group & in the discussions. Let us know if you face any difficulties with anything specific.
Lesson 5
I shared the below image in our previous meetup. This is an updated version along with annotated code and notes.
Building a minimal Neural Network (Logistic Regression with no hidden layer) from scratch
Let’s walk through it step by step, referring to how we code each block from the image below.
Source: Natural Language Processing with PyTorch by Delip Rao et al.
- Predictions: `y_hat = model(x)`; here we are using our own model.
- Loss function: `loss_func(y_hat, y)`. In addition, we add `w2*wd` to it.
- Gradients: `parameter.sub_(learning_rate * gradient)`, performing an in-place subtraction on the parameters with the product of the learning rate and the gradient. Since our model has multiple parameters (weights, biases), we loop through them using PyTorch's `parameters()`.
- Extras:
  - Weight decay:
    - a) `w2`: using each parameter, we calculate the sum of squared weights: `for p in model.parameters(): w2 += (p**2).sum()`
    - b) `wd`: a constant (1e-5)
    - multiply `w2` and `wd` & add to the regular `loss_func`
- Combined:
  - We calculate the loss for each minibatch by calling `update(x, y, lr)` on them: `losses = [update(x,y,lr) for x,y in data.train_dl]`
  - `.item()` turns the loss tensor into a Python number in order to plot & see the losses visually.
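To see why adding `w2*wd` to the loss is called weight *decay*: the penalty's gradient contribution for a weight `w` is `2*wd*w`, so each update shrinks the weights slightly toward zero. A tiny pure-Python check with a single weight (the numbers are made up for illustration):

```python
wd = 1e-5
lr = 0.1
w = 2.0
data_grad = 0.5              # pretend gradient of the data loss w.r.t. w
penalty_grad = 2 * wd * w    # gradient of wd * w**2

w_new = w - lr * (data_grad + penalty_grad)
# Compared with no penalty, w has decayed by an extra lr * 2 * wd * w:
print(w - lr * data_grad - w_new)  # ~4e-06
```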
```python
def update(x, y, learning_rate):
    wd = 1e-5
    # prediction
    y_hat = model(x)
    # sum of squared weights
    w2 = 0.
    for p in model.parameters():
        w2 = w2 + (p**2).sum()
    # regular loss plus the weight-decay penalty
    loss = loss_func(y_hat, y) + w2*wd
    # compute the gradients of the parameters
    loss.backward()
    # instruct pytorch not to record these actions for the next gradient calculation
    with torch.no_grad():
        for p in model.parameters():
            # gradient step, then reset the gradient
            p.sub_(learning_rate * p.grad)
            p.grad.zero_()
    return loss.item()
```
Resources
- Many thanks to @cedric for his notes: Building a neural network from scratch
- The PyTorch tutorials also have an in-depth article, "What is torch.nn really?" (a somewhat hidden gem), by Jeremy
Feedback is welcome.