Part 1, online study group

Curious general question: how do you guys keep track of changes in your models? For instance, when you change hyperparameters or switch from one ResNet variant to another, do you save the graphs in a Word document, use software to compare different runs, or something else?

Personally, I have been trying to use Weights & Biases (wandb) to compare different models, but I am still learning how to use its features. For instance, how to grab a graph from a certain run and overlay the others to compare how things are changing.

1 Like

Highly interested. Expecting an invitation. Thanks!

You can join our open slack group. Link is in the first post of this thread. The timings of the next meetup will be updated soon. :slight_smile:

Hi hi, just a reminder: the meetup is today at 4 PM GMT (8 AM PST = 9:30 PM IST = 11 AM EST) :slight_smile: It is dedicated to NLP and based on Lesson 4. Click the Zoom link when it is time.

1 Like

The meetup is starting in ~11 minutes; the Zoom link is live!

This worked when I tried it a few months back. Not sure if it still works, but you can try this hack to automatically reconnect a Colab notebook: https://stackoverflow.com/a/58275370/5951165

2 Likes

Meeting Minutes for 01/26/2020

Presentation on Lesson 4

Questions

  • What does np.random.seed(2) do?
    This post & this one address it

ImageDataBunch creates the validation set randomly each time the code block is run. To keep that split reproducible, the notebooks call np.random.seed() (NumPy's seeding function, which fastai's random split relies on) before creating the data bunch.
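
A minimal sketch of the pattern, assuming fastai v1 and that `path` points at a folder of images (both placeholders, not from the minutes): fixing NumPy's seed right before building the data bunch makes the random split come out the same on every run.

    import numpy as np
    from fastai.vision import ImageDataBunch, get_transforms

    np.random.seed(2)  # same seed -> same random validation split every run
    data = ImageDataBunch.from_folder(path, valid_pct=0.2,  # hold out a random 20%
                                      ds_tfms=get_transforms(), size=224)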

  • How to stop Google Colab from disconnecting?
    @vijayabhaskar shared this in the previous comment

Challenges

  • Many participants shared common challenges that they face. This deserves a separate post (I will summarize them & possible options soon)

Misc

  • Organizational stuff

Resources

2 Likes

I wrote my first post using the fast_template shared by Jeremy, which uses GitHub Pages and Jekyll. I hope it can be helpful to people.

5 Likes

Hi gagan hope you’re having a wonderful day!

I found your first post informative and a real joy to read.

Cheers mrfabulous1 :smiley: :smiley:

1 Like

Thanks @mrfabulous1!! I’m glad you found it useful!

Hi people!!! As part of this study group, we are starting an algorithms meetup to hone our skills with data structures and algorithms, which can be useful for interviews as well.

Preparing for LeetCode-style coding interviews can be very challenging because the material is scattered and finding the perfect explanation for a problem can take time. A friend and I prepared for these interviews together, and I intend to cover some of the patterns we learnt (related to data structures and algorithms) that were useful to us. We both got new jobs after weeks of preparation and of iteratively figuring out how not to fail. Please note that I will just be sharing my experience and am by no means an expert (yet). I hope my experience helps others solve such coding problems and nail that interview!!!

People who are interested can join the Slack for our study group using the link in the first post of this thread. (We will be using the #coding_interview_prep channel for this specific purpose.)

3 Likes

Just a reminder, there is a meetup today, at 4PM GMT :wink: We will focus on Lesson 4!

1 Like

Hello all! First time in this meetup… just started Lesson 4 today :slight_smile:

4 Likes

Right on time @oscarb :slightly_smiling_face:

Meeting Minutes of 02/02/2020

Presentation on Lesson 4 (Tabular and Collaborative Filtering)

Presenter: @Tendo

Thanks to @Tendo for the wonderful Colab notebooks!

Questions

Tabular Data:
  • What are the heuristics or the formula for determining the size of the hidden layers for the tabular learner?

    learn = tabular_learner(data, layers=[200,100], metrics=accuracy)
    # layers=[200, 100] -> two fully connected hidden layers of 200 and 100 units

    • Forum thread for reference and possible further discussion linked below in Resources
  • In Tendo’s notebook, the total size of the training set was 3256, so if we choose rows 800-1000 as our validation set, then with those 200 samples we have a validation set that is around 6% of the training set. Is that enough?

    test = TabularList.from_df(df.iloc[800:1000].copy(), path=path, cat_names=cat_names, cont_names=cont_names)
    # df.iloc[800:1000] -> the same contiguous block of 200 rows

    • I didn’t quite gather whether we fully resolved this in the discussion
    • Also, why rows 800-1000? Can we not get a more random split by using a ratio/percentage, as in sklearn? (see the sketch after this list)
      • One reason could be that we want a contiguous validation set: much like video frames, if two adjacent, nearly identical samples end up one in training and one in validation, the model isn’t learning anything, it is cheating
      • Any other explanations? And is 6% enough?
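
As a minimal sketch, this is what that contiguous split looks like with the fastai v1 data block API, assuming the usual Lesson 4 variables (`df`, `path`, `cat_names`, `cont_names`, `procs`, `dep_var`) are already defined:

    from fastai.tabular import TabularList

    data = (TabularList.from_df(df, path=path, cat_names=cat_names,
                                cont_names=cont_names, procs=procs)
            .split_by_idx(list(range(800, 1000)))   # rows 800-999 -> validation
            .label_from_df(cols=dep_var)
            .databunch())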

Collaborative Filtering:

  • How do I decide when to use collaborative filtering vs. a tabular model?
    • A thought experiment: taking the ‘US Salary’ example from the tabular lesson, could I instead run collaborative filtering on it and come up with a salary recommendation?
    • A basic intuition is to look at it as:
      • Tabular :: supervised
      • Collaborative filtering :: unsupervised
  • What is n_factors?
    • It is the number of latent (hidden) features the model learns during training
      • For example, the model might learn that some movies are family-friendly and others are not; family-friendliness would be one such factor.
    • So, when we set up the learner, is n_factors one of the hyperparameters we choose?
      • Yes; it can affect speed and accuracy, but more experiments are needed to see how much (a minimal sketch follows this list).
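
Since n_factors is just a constructor argument, trying different values is cheap. A minimal fastai v1 sketch, assuming `data` is a CollabDataBunch as in Lesson 4; the n_factors=40, y_range, and training-schedule values are illustrative, not prescribed:

    from fastai.collab import collab_learner

    # n_factors sets the width of the user and item embeddings, i.e. how many
    # latent features the model can learn per user and per movie.
    learn = collab_learner(data, n_factors=40, y_range=[0, 5.5])
    learn.fit_one_cycle(3, 5e-3)   # try a few n_factors values and compare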

Resources

Jeremy’s tweet on Tabular

6 Likes

Awesome work @shimsan

1 Like

Thank you @shimsan!

1 Like

Just a reminder: we are having a meetup tomorrow (Sunday) at 4 PM GMT. We will focus on a projects showcase. This is the time to show off all your cool projects and get inspiration from others :slightly_smiling_face: To join, just use the same Zoom link when the time comes.

1 Like

The meetup will start in ~15 mins :partying_face: Join Zoom!

Overview of Gradient Descent

What is Gradient Descent (GD)?

  • It is an optimization algorithm for finding the minimum of a function (the loss function, in the case of neural networks).

A nice analogy for understanding GD:

  • A person stuck on a mountain, trying to get down with minimal visibility due to fog (source: Wikipedia).

Algorithm

Source: [1]
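
In code, the algorithm from [1] boils down to repeating a single update step, w <- w - learning_rate * gradient. A minimal sketch in plain Python (the example function, learning rate, and step count are illustrative):

    def gradient_descent(grad_fn, w, lr=0.1, n_steps=100):
        # Repeatedly step opposite the gradient to move toward a minimum.
        for _ in range(n_steps):
            w = w - lr * grad_fn(w)
        return w

    # Example: minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
    print(gradient_descent(lambda w: 2 * (w - 3), w=0.0))   # approaches 3.0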

Variants of Gradient Descent

Source: [2]

  • Stochastic Gradient Descent: the weights are updated using one sample at a time, so the batch size is 1; for 100 samples, the weights are updated 100 times per epoch
  • Batch Gradient Descent: the weights are updated using the whole dataset at once; for 100 samples, the weights are updated only once per epoch
  • Mini-Batch Gradient Descent: a middle ground combining the two; the dataset is split into batches of a size of our choice, with samples chosen at random (a sketch contrasting the three follows this list)
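
To make the update counts concrete, here is a small NumPy sketch of one epoch under each scheme; the toy linear-regression data, gradient, and learning rate are placeholders, not from the sources above:

    import numpy as np

    X, y = np.random.randn(100, 5), np.random.randn(100)   # 100 toy samples
    w, lr = np.zeros(5), 0.01

    def grad(w, Xb, yb):
        # Gradient of mean squared error for a linear model on one batch.
        return 2 * Xb.T @ (Xb @ w - yb) / len(yb)

    def run_epoch(w, batch_size):
        # One pass over the data; number of updates = ceil(100 / batch_size).
        idx = np.random.permutation(len(X))                 # shuffle each epoch
        for start in range(0, len(X), batch_size):
            b = idx[start:start + batch_size]
            w = w - lr * grad(w, X[b], y[b])
        return w

    w = run_epoch(w, batch_size=1)     # stochastic GD: 100 updates per epoch
    w = run_epoch(w, batch_size=100)   # batch GD: 1 update per epoch
    w = run_epoch(w, batch_size=32)    # mini-batch GD: 4 updates per epoch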

[1] https://medium.com/@divakar_239/stochastic-vs-batch-gradient-descent-8820568eada1
[2] https://suniljangirblog.wordpress.com/2018/12/13/variants-of-gradient-descent/

I hope this clarifies the different variants of gradient descent.