Wiki: Lesson 7

rachel · January 2, 2018, 11:49pm

<<< Wiki: Lesson 6

Lesson resources

Video timeline

00:03:04 Review of last week’s lesson on RNNs, Part 1, what to expect in Part 2 (start date: 19/03/2018)
00:08:48 Building the RNN model with ‘self.init_hidden(bs)’ and ‘self.h’, the “back prop through time (BPTT)” approach
00:17:50 Creating mini-batches, “split in 64 equal size chunks” not “split in chunks of size 64”, questions on data augmentation and choosing a BPTT size, PyTorch QRNN
00:23:41 Using the data formats for your API, changing your data format vs creating a new dataset class, ‘data.Field()’
00:24:45 How to create Nietzsche training/validation data
00:35:43 Dealing with PyTorch not accepting a “Rank 3 Tensor”, only Rank 2 or 4, ‘F.log_softmax()’
00:44:05 Question on ‘F.tanh()’, tanh activation function, replacing the ‘RNNCell’ by ‘GRUCell’
00:47:15 Intro to GRU cell (RNNCell has gradient explosion problem - i.e. you need to use low learning rate and small BPTT)
00:53:40 Long Short Term Memory (LSTM), ‘LayerOptimizer()’, Cosine Annealing ‘CosAnneal()’
01:01:47 Pause
01:01:57 Back to Computer Vision with CIFAR 10 and ‘lesson7-cifar10.ipynb’ notebook, Why study research on CIFAR 10 vs ImageNet vs MNIST ?
01:08:54 Looking at a Fully Connected Model, based on a notebook from student ‘Kerem Turgutlu’, then a CNN model (with Excel demo)
01:21:54 Refactored the model with new class ‘ConvLayer()’ and ‘padding’
01:25:40 Using Batch Normalization (BatchNorm) to make the model more resilient, ‘BnLayer()’ and ‘ConvBnNet()’
01:36:02 Previous bug in ‘Mini net’ in ‘lesson5-movielens.ipynb’, and many questions on BatchNorm, Lesson 7 Cifar10, AI/DL researchers vs practitioners, ‘Yann LeCun’ & ‘Ali Rahimi talk at NIPS 2017’ rigor/rigueur/theory/experiment.
01:50:51 ‘Deep BatchNorm’
01:52:43 Replace the model with ResNet, class ‘ResnetLayer()’, using ‘boosting’
01:58:38 ‘Bottleneck’ layer with ‘BnLayer()’, ‘ResNet 2’ with ‘Resnet2()’, Skipping Connections.
02:02:01 ‘lesson7-CAM.ipynb’ notebook, an intro to Part #2 using ‘Dogs v Cats’.
02:08:55 Class Activation Maps (CAM) of ‘Dogs v Cats’.
02:14:27 Questions to Jeremy: “Your journey into Deep Learning” and “How to keep up with important research for practitioners”, “If you intend to come to Part 2, you are expected to master all the techniques in Part 1”, Jeremy’s advice to master Part 1 and help new students in the upcoming MOOC version to be released in January 2018.
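
The 00:17:50 distinction between “64 equal size chunks” and “chunks of size 64” is easy to mix up. Here is a minimal sketch in plain Python, using a toy token list and a batch size of 3 instead of 64. The helper names `split_into_n_chunks` and `split_into_chunks_of_size` are made up for illustration and are not fast.ai functions.

```python
def split_into_n_chunks(tokens, n_chunks):
    """Split tokens into n_chunks contiguous chunks of equal length (leftover tail dropped)."""
    chunk_len = len(tokens) // n_chunks
    return [tokens[i * chunk_len:(i + 1) * chunk_len] for i in range(n_chunks)]


def split_into_chunks_of_size(tokens, size):
    """Split tokens into consecutive chunks of `size` tokens each (leftover tail dropped)."""
    n_chunks = len(tokens) // size
    return [tokens[i * size:(i + 1) * size] for i in range(n_chunks)]


tokens = list(range(12))  # stand-in for a tokenized corpus
bs = 3                    # stand-in for the batch size of 64

by_count = split_into_n_chunks(tokens, bs)       # 3 chunks, each 4 tokens long
by_size = split_into_chunks_of_size(tokens, bs)  # 4 chunks, each 3 tokens long

print(by_count)
print(by_size)
```

In the first layout each chunk is one long contiguous stream of text, and there are exactly as many streams as the batch size, which is the layout the lesson describes.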

Borz · January 3, 2018, 12:51am

@rachel just wanted to share how accessible and ordered the forum looks right now with these wikis:

Thanks for putting these up. Even long after a lesson they’re really a big help.

hiromi · January 9, 2018, 2:35pm

In the lesson 7 video around 35m 44s, it is said that “Loss functions such as softmax are not happy receiving a rank 3 tensor”. And around 40m 14s, we talk about sending a rank 3 tensor to F.log_softmax and specifying dim to let it know which axis to do softmax over.

I checked @timlee’s note and @EricPB’s timeline and they both say something along the lines of “softmax isn’t happy to accept rank3 tensor”. But is this supposed to be the F.nll_loss function instead of F.log_softmax? My understanding is that instead of creating a custom loss function like we did in lesson 6, this time we changed the output shape to a rank 2 tensor so we could use PyTorch’s loss function as is. Is that correct?

Thank you!!

jeremy · January 27, 2018, 6:25pm

Actually in PyTorch 0.3 softmax can handle higher dimensional tensors - the dim argument is new. So it’s possible some of the info in this video is a little out of date.

But you might be right that there’s still an issue with nll_loss. Did you have a chance to look into this?
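
To make the shapes in this exchange concrete, here is a rough sketch of the two steps being discussed: a log-softmax over a named axis of a rank 3 array, followed by flattening to rank 2 before a negative log-likelihood loss. It uses NumPy stand-ins, so `log_softmax` and `nll` below are illustrative helpers and not `F.log_softmax` or `F.nll_loss`; the sizes are arbitrary, and it does not settle whether `nll_loss` itself accepts rank 3 input.

```python
import numpy as np


def log_softmax(x, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def nll(log_probs_2d, targets_1d):
    """Mean negative log-likelihood for rank 2 log-probs and rank 1 integer targets."""
    rows = np.arange(len(targets_1d))
    return -log_probs_2d[rows, targets_1d].mean()


seq_len, bs, n_classes = 4, 2, 5  # arbitrary toy sizes
rng = np.random.default_rng(0)
activations = rng.normal(size=(seq_len, bs, n_classes))   # rank 3
targets = rng.integers(0, n_classes, size=(seq_len, bs))  # rank 2

# Softmax over the class axis works at any rank once the axis is named.
log_probs = log_softmax(activations, axis=-1)

# Flatten the time and batch axes so the loss sees rank 2 inputs and rank 1 targets.
loss = nll(log_probs.reshape(-1, n_classes), targets.reshape(-1))
print(log_probs.shape, loss)
```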

Gius · February 5, 2018, 1:23am

I’m having a problem with the “accuracy(*learn.TTA())” cell (cell 41) of the lesson 7 notebook. It gives me this:

TypeError: torch.max received an invalid combination of arguments - got (numpy.ndarray, dim=int), but expected one of:

(torch.FloatTensor source)
(torch.FloatTensor source, torch.FloatTensor other)
didn’t match because some of the keywords were incorrect: dim
(torch.FloatTensor source, int dim)
(torch.FloatTensor source, int dim, bool keepdim).

If I use the new “accuracy_np(*learn.TTA())” instead, I get this:

AttributeError: ‘bool’ object has no attribute ‘mean’ .

Forgive my lack of comprehension, but was the function modified without this notebook being updated yet? Thanks!

Kjeanclaude · February 9, 2018, 1:44pm

Hi @Gius,
I think the error here is clearly due to the lack of a mean attribute.
So you should rewrite it a bit, as in previous lectures:

log_preds,y = learn.TTA()
preds = np.mean(np.exp(log_preds),0)
accuracy(preds,y)
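
Here is a small sketch of what that averaging step does, assuming (as the np.mean(..., 0) call above implies) that the first axis of the TTA output indexes the augmented passes. The numbers are made up, and `accuracy_from_probs` is a hypothetical stand-in, not fastai’s accuracy or accuracy_np.

```python
import numpy as np


def accuracy_from_probs(probs, y):
    """Fraction of rows whose highest-probability class equals the label."""
    return float((probs.argmax(axis=1) == y).mean())


# Made-up log-probabilities: 3 augmented passes, 4 images, 2 classes.
log_preds = np.log(np.array([
    [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.7, 0.3]],
    [[0.8, 0.2], [0.6, 0.4], [0.3, 0.7], [0.6, 0.4]],
    [[0.7, 0.3], [0.3, 0.7], [0.1, 0.9], [0.4, 0.6]],
]))
y = np.array([0, 1, 1, 1])

# Convert back to probabilities, then average over the augmentation axis.
preds = np.mean(np.exp(log_preds), 0)
acc = accuracy_from_probs(preds, y)
print(preds.shape)
print(acc)
```

After the mean, preds has one row per image, which is the shape an accuracy function working on NumPy arrays would expect.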

Kjeanclaude · February 9, 2018, 2:49pm

Hi,
lesson7-cifar10
I cannot find the RandomFlipXY() function in ‘fastai/transforms.py’,
and I got a ‘not found’ error when running my notebook.
So I replaced it with the RandomFlip() function, and the notebook then ran from top to bottom without any problem.
Could someone please provide more information about RandomFlipXY()?

davecazz · February 10, 2018, 3:15am

It looks like it was changed 25 days ago. They just removed the XY from the name.

Kjeanclaude · February 10, 2018, 9:35am

OK, thank you @davecazz for the clarification.
So there is nothing wrong with what I did.

davecazz · February 11, 2018, 9:22pm

One technique that can be helpful, especially as the fast.ai codebase evolves past what is illustrated in these lessons, is to use the blame and history buttons in GitHub. Blame will show you the commit associated with each line of code in the file, so you can see when it was changed and what the code was before.

History will show you all the commits for that file and what changed each time. This is also helpful for seeing what the code looked like before, in case something changed that is no longer compatible with the notebook. For the most part they are good at keeping the notebooks up to date, but the code Jeremy shows in the video can easily get out of date.

travis · February 18, 2018, 2:04pm

I am having this same problem. Did you ever find a solution?

mcintyre1994 · February 18, 2018, 4:11pm

Note that the md = LanguageModelData.from_text_files(PATH, TEXT, **FILES, bs=bs, bptt=bptt, min_freq=3) line will error (slice returned empty tensor) if any lines in the file are empty. With the way I split trn and val, I had empty lines only in the training set, so it seemed sensible to remove them, but it’d probably be better to tweak the fast.ai library to not error on them.

You can use grep -cvP '\S' trn/trn.txt to count the lines that are empty or contain only whitespace, and sed -i '/^$/d' trn/trn.txt to remove the empty ones for now.
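
The same cleanup can be sketched in Python if you would rather do it from the notebook. This is only an illustration: the helper names are hypothetical, the sample lines are made up, and unlike the sed command it also drops lines that contain only whitespace.

```python
def count_blank_lines(lines):
    """Count lines that are empty or contain only whitespace."""
    return sum(1 for line in lines if not line.strip())


def drop_blank_lines(lines):
    """Keep only lines with at least one non-whitespace character."""
    return [line for line in lines if line.strip()]


# Made-up sample standing in for the contents of trn/trn.txt.
sample = ["first line of text\n", "\n", "   \n", "second line of text\n"]

n_blank = count_blank_lines(sample)
cleaned = drop_blank_lines(sample)
print(n_blank)
print(cleaned)
```

To apply it to a real file, read the file with readlines(), pass the list through drop_blank_lines, and write the result back with writelines().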

travis · February 19, 2018, 3:55pm

Never mind. I see that @Kjeanclaude’s suggestion combined with accuracy_np fixes the problem. Thanks!

kiay123 · February 26, 2018, 8:01am

How can I apply this to an LSTM?
m.rnn.weight_hh_l0.data.copy_(torch.eye(n_hidden))

kiay123 · February 26, 2018, 12:34pm

Jeremy, could you make more tutorials about data loaders for NLP tasks? Thanks.

jmoney · February 28, 2018, 8:02pm

I’m not sure if it’s just me, but the images were not in the directories the library was expecting. After downloading and renaming the top level directory to cifar10, I ran this script in each of the test and train folders to put them into labeled directories.

declare -a classes=("plane" "automobile" "bird" "cat" "deer" "dog" "frog" "horse" "ship" "truck") && for i in "${classes[@]}"; do mkdir "$i" && mv *"$i".png "$i"; done

emilmelnikov · March 4, 2018, 2:57pm

It’s not just you, I’m also having the same issue.

This is how the dataset layout looks:

$ curl -sL http://pjreddie.com/media/files/cifar.tgz | tar -tzf- | head
cifar/
cifar/test/
cifar/test/3661_automobile.png
cifar/test/4572_bird.png
cifar/test/416_airplane.png
cifar/test/4863_automobile.png
cifar/test/9523_dog.png
cifar/test/3612_automobile.png
cifar/test/9090_bird.png
cifar/test/3443_deer.png

I wanted to make a PR, but it seemed somewhat intrusive to open one against the notebook.

Here is my Python solution:

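import os

def to_label_subdirs(path, subdirs, classes, labelfn):
    # Move each image in path/<subdir>/ into path/<subdir>/<label>/.
    # `classes` is passed by the call below but is not used in this function.
    for sd in subdirs:
        for rf in os.listdir(os.path.join(path, sd)):
            af = os.path.join(path, sd, rf)
            if not os.path.isfile(af):
                continue  # skip label directories that already exist
            lb = labelfn(rf)
            if not lb:
                continue  # skip files for which no label could be extracted
            os.renames(af, os.path.join(path, sd, lb, rf))  # creates the label directory if needed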

Then, somewhere before the definition of get_data:

to_label_subdirs(PATH, 'train test'.split(), classes, lambda f: f[f.find('_')+1 : f.find('.')])

Also, it’s a good idea to ensure that we’ve moved all images to their corresponding label directories:

!find {PATH}train {PATH}test -maxdepth 1 -type f | wc -l

The output should be 0.
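
For anyone who wants to try the idea without touching the real dataset, here is a self-contained sketch that builds a tiny fake layout in a temporary directory, moves the files into label folders, and performs the same “no loose files left” check in Python. The helper names `label_from_filename` and `count_loose_files` are hypothetical, and the two file names are borrowed from the listing above.

```python
import os
import tempfile


def label_from_filename(fname):
    """Return the text between the first '_' and the first '.', i.e. the class name."""
    return fname[fname.find('_') + 1:fname.find('.')]


def count_loose_files(path, subdirs):
    """Count regular files sitting directly inside each subdir (not in label folders)."""
    total = 0
    for sd in subdirs:
        folder = os.path.join(path, sd)
        total += sum(os.path.isfile(os.path.join(folder, f)) for f in os.listdir(folder))
    return total


with tempfile.TemporaryDirectory() as root:
    # Build a tiny fake layout: two unlabeled images in train/, none in test/.
    for sd in ('train', 'test'):
        os.makedirs(os.path.join(root, sd))
    for name in ('3661_automobile.png', '4572_bird.png'):
        open(os.path.join(root, 'train', name), 'w').close()
    loose_before = count_loose_files(root, ['train', 'test'])

    # Move each file into a folder named after its label.
    for name in os.listdir(os.path.join(root, 'train')):
        src = os.path.join(root, 'train', name)
        os.renames(src, os.path.join(root, 'train', label_from_filename(name), name))
    loose_after = count_loose_files(root, ['train', 'test'])

print(loose_before, loose_after)
```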

emilmelnikov · March 5, 2018, 10:45am

Now it’s not so nice because other threads pushed the wiki threads out of the top.

@rachel Is it possible to pin the wiki threads? They are very important for the course, but now one should search the forums to get to them. It can be quite confusing for people that are just starting the course (it certainly was confusing for me). There are links to wikis from the pages with video lectures, but I guess it would be nice to have them organized in one place.

The other option is to create an index page with links to the wikis, and pin the index page instead.

bruno16 · March 10, 2018, 5:05pm

Thanks for this trick!
Bruno

bruno16 · March 10, 2018, 7:11pm

And use accuracy_np too!
Thanks
