Lesson 6 In-Class Discussion ✅

Yes, the point is to add to the training data with the augmented images.

We usually use reflection padding, which mirrors the image at the border (Jeremy showed black padding so that we could really see what was going on).
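
For anyone who wants to see the difference concretely, here's a minimal sketch in plain PyTorch (using `F.pad` directly, which isn't exactly fastai's internals, but shows what "mirroring" means):

```python
import torch
import torch.nn.functional as F

# A tiny 1x1x3x3 "image" so the mirroring is easy to see.
x = torch.arange(9.0).reshape(1, 1, 3, 3)

# Zero padding fills the border with black (what Jeremy showed)...
zero_padded = F.pad(x, (1, 1, 1, 1), mode='constant', value=0)

# ...while reflection padding mirrors the pixels at the border.
reflect_padded = F.pad(x, (1, 1, 1, 1), mode='reflect')

print(reflect_padded[0, 0])
# tensor([[4., 3., 4., 5., 4.],
#         [1., 0., 1., 2., 1.],
#         [4., 3., 4., 5., 4.],
#         [7., 6., 7., 8., 7.],
#         [4., 3., 4., 5., 4.]])
```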

Does that mean I can use fewer images in my training set? This technique, i.e. data augmentation, helped me in one of my projects, since it meant I could fit all the images in my workspace.

Would be nice…

Do you mean in addition to specifying Float?

If you look at the notebook you’ll see we don’t actually train a model with the data subset; we use the entire dataset. Also, the goal of dropout isn’t to make the model faster, it’s to improve generalization.
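
To make that concrete: dropout randomly zeroes activations during training only, and is a no-op at inference, so it can't make a model faster. A minimal PyTorch sketch:

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(8)

drop.train()    # training mode: roughly half the activations are zeroed,
print(drop(x))  # and the survivors are scaled by 1/(1-p) = 2

drop.eval()     # eval mode: dropout is the identity function
print(drop(x))  # tensor([1., 1., 1., 1., 1., 1., 1., 1.])
```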

I hope Jeremy talks about mixup augmentation.

Me too. I think Moustapha Cisse from Facebook introduced that at the Deep Learning Indaba this year.

@sgugger
the code in rossman_data_clean specifies loading multiple files: `table_names = ['train', 'store', 'store_states', 'state_names', 'googletrend', 'weather', 'test']`

but only train, store, and test are available on the Kaggle website. Where do we get the rest?
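
For reference, the notebook reads those files with a loop roughly like this (a sketch; `path` is assumed to point at wherever you've put the CSVs):

```python
import pandas as pd
from pathlib import Path

path = Path('data/rossmann')  # assumed location of the extracted CSVs

table_names = ['train', 'store', 'store_states', 'state_names',
               'googletrend', 'weather', 'test']

# One DataFrame per CSV; low_memory=False avoids mixed-dtype warnings
# on the larger files.
tables = [pd.read_csv(path / f'{name}.csv', low_memory=False)
          for name in table_names]
train, store, store_states, state_names, googletrend, weather, test = tables
```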

Has an activation-map-style technique been tried for structured/tabular data, to understand which variables move the classification from one class to another?
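
One model-agnostic way to probe this is permutation importance: shuffle one column at a time and measure how much the validation score drops. A sketch with scikit-learn, using synthetic stand-in data rather than anything from the lesson:

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a tabular dataset.
X, y = make_classification(n_samples=1000, n_features=6, n_informative=3,
                           random_state=0)
X = pd.DataFrame(X, columns=[f'var{i}' for i in range(6)])
X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each column in turn and measure the drop in validation score:
# the bigger the drop, the more the model leans on that variable.
result = permutation_importance(model, X_valid, y_valid,
                                n_repeats=10, random_state=0)
for name, imp in sorted(zip(X.columns, result.importances_mean),
                        key=lambda t: -t[1]):
    print(f'{name}: {imp:.4f}')
```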

If you start a topic about weight norm in the advanced section, Jeremy will answer it there (it will also be covered in part 2).

Someone asked the same question in lesson 4. Check out my reply there, though I don’t think it’s the best answer.

Check out Sylvain’s post: Mixup data augmentation
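
For a quick preview before reading the post: the heart of mixup is blending pairs of inputs and their labels with a Beta-distributed coefficient. A minimal PyTorch sketch following the original paper's formulation (not fastai's exact implementation):

```python
import numpy as np
import torch
import torch.nn.functional as F

def mixup_batch(x, y, alpha=0.4):
    """Blend each example with a randomly chosen partner from the batch."""
    lam = np.random.beta(alpha, alpha)   # mixing coefficient
    perm = torch.randperm(x.size(0))     # random partner for each example
    x_mixed = lam * x + (1 - lam) * x[perm]
    return x_mixed, y, y[perm], lam

# In the training loop, the loss is blended the same way as the inputs:
# x_mixed, y_a, y_b, lam = mixup_batch(x, y)
# preds = model(x_mixed)
# loss = lam * F.cross_entropy(preds, y_a) + (1 - lam) * F.cross_entropy(preds, y_b)
```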

About part 2, do you already know when it will take place? (Sorry if that’s already been answered elsewhere, but I didn’t find it.)

Data augmentation can help fill gaps in your image database. For example, if your database has only clean images in a certain orientation, data augmentation can help your network learn to classify noisy, distorted, or rotated images.
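
For example, with fastai v1's `get_transforms` you can request exactly those kinds of variations; the parameter values here are just illustrative:

```python
from fastai.vision import get_transforms, ImageDataBunch

# Random rotations, zooms, lighting changes, and flips: each epoch the
# network sees slightly different versions of the same base images.
tfms = get_transforms(max_rotate=20., max_zoom=1.2, max_lighting=0.3)

data = ImageDataBunch.from_folder('path/to/images', ds_tfms=tfms, size=224)
```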

For making custom models, is it a good choice to switch to PyTorch?

I’ve seen discussions on Kaggle where people use translation for data augmentation on text. Say you’re doing a text classification task in English: you can use Google Translate to go English -> Spanish (or another language) -> back to English and get augmented text data.
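
A sketch of that round trip; `translate` here is a hypothetical placeholder for whatever translation service you'd actually call (Google Cloud Translation, a local NMT model, etc.):

```python
def translate(text, src, dest):
    """Hypothetical placeholder: call your translation service of choice
    and return the translated string."""
    raise NotImplementedError

def back_translate(text, pivot='es'):
    """Round-trip English -> pivot language -> English to paraphrase text."""
    pivoted = translate(text, src='en', dest=pivot)
    return translate(pivoted, src=pivot, dest='en')

# back_translate("The movie was surprisingly good.") might come back as
# "The film was surprisingly good.", giving a paraphrased extra training example.
```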

Sometimes I train a network that ends up always predicting the same class. I had a hard time finding out why, but it seems to be more or less fixed by reducing dropout.
Is it because, with the default dropout (ps=0.5), the network is not complex enough to properly use the input features in my case? Or is it due to other reasons, e.g. strong class imbalance (which is the case)?
I’m trying to build some intuition about this. Has anyone experienced this kind of issue before?
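
Strong class imbalance alone can cause that kind of collapse. Besides lowering ps, one thing worth trying is weighting the loss so the rare classes count for more; a PyTorch sketch (the weights shown are just an example):

```python
import torch
import torch.nn as nn

# Suppose class 0 has ~10x more examples than class 1: give class 1 a
# proportionally larger weight so "always predict class 0" stops being
# the easiest way to lower the loss.
class_weights = torch.tensor([1.0, 10.0])
criterion = nn.CrossEntropyLoss(weight=class_weights)

preds = torch.randn(4, 2)              # raw logits: (batch, n_classes)
targets = torch.tensor([0, 0, 0, 1])   # class indices: (batch,)
loss = criterion(preds, targets)
```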

What do you mean by “switch”? The models that fastai uses are PyTorch models.

Here is the info and dates for part 2: https://www.usfca.edu/data-institute/certificates/deep-learning-part-two
