Wiki: Lesson 7

In the lesson 7 video, around 35m 44s, it is said that "Loss functions such as softmax are not happy receiving a rank 3 tensor". And around 40m 14s, we talk about sending a rank-3 tensor to F.log_softmax and specifying dim to let it know which axis to do softmax over.

I checked @timlee's note and @EricPB's timeline, and they both say something along the lines of "softmax isn't happy to accept a rank-3 tensor". But is this supposed to be the F.nll_loss function instead of F.log_softmax? My understanding is that instead of creating a custom loss function as we did in lesson 6, this time we changed the output shape to a rank-2 tensor so we could use PyTorch's loss function as is. Is that correct?

Thank you!!


Actually, in PyTorch 0.3 softmax can handle higher-dimensional tensors; the dim argument is new. So it's possible some of the info in this video is a little out of date.

But you might be right that there's still an issue with nll_loss. Did you have a chance to look into this?
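To make the shape question concrete, here is a minimal sketch with toy sizes (not the lesson's actual model): F.log_softmax will take a rank-3 tensor once you pass dim, but F.nll_loss expects (N, C) predictions and (N,) targets, which is why the output gets flattened to rank 2.

import torch
import torch.nn.functional as F

# Toy shapes standing in for a char-RNN output: (batch, seq_len, vocab)
bs, sl, nv = 4, 8, 30
out = torch.randn(bs, sl, nv)             # rank-3 activations from the model
targ = torch.randint(0, nv, (bs, sl))     # one target index per time step

# log_softmax accepts the rank-3 tensor as long as dim names the vocab axis
log_probs = F.log_softmax(out, dim=-1)

# nll_loss wants (N, C) predictions and (N,) targets, so flatten batch and sequence together
loss = F.nll_loss(log_probs.view(-1, nv), targ.view(-1))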

I'm having a problem with the accuracy(*learn.TTA()) call in cell 41 of the lesson 7 notebook; it gives me this:

TypeError: torch.max received an invalid combination of arguments - got (numpy.ndarray, dim=int), but expected one of:

  • (torch.FloatTensor source)
  • (torch.FloatTensor source, torch.FloatTensor other)
    didn’t match because some of the keywords were incorrect: dim
  • (torch.FloatTensor source, int dim)
  • (torch.FloatTensor source, int dim, bool keepdim).

If I use the new accuracy_np(*learn.TTA()) instead, I get this:

AttributeError: 'bool' object has no attribute 'mean'

Forgive my lack of comprehension: was the function modified without this notebook being amended yet? Thanks!


Hi @Gius,
I think the error here is due to the result lacking a mean attribute.
So you should rewrite it a bit, as in previous lectures:

log_preds,y = learn.TTA()              # per-augmentation log-probabilities and the targets
preds = np.mean(np.exp(log_preds), 0)  # average the probabilities over the TTA augmentations
accuracy(preds, y)
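If accuracy then complains about the numpy arrays (as in the torch.max error above), the numpy-based helper mentioned later in this thread can be used on them directly:

accuracy_np(preds, y)   # preds and y are the averaged TTA predictions and targets from above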

Hi,
In the lesson7-cifar10 notebook, I can't find the RandomFlipXY() function in fastai/transforms.py, and I got a 'not found' error during my notebook execution. So I replaced it with the RandomFlip() function and the notebook ran from top to bottom without any problem.
Please, could someone provide more information about RandomFlipXY()?

It looks like it was changed 25 days ago; they just removed the XY from the name.


OK, thank you @davecazz for the clarification.
So there is nothing wrong with what I did.

One technique that can be helpful, especially as the fast.ai codebase evolves past what is illustrated in these lessons, is to use the Blame and History buttons on GitHub. Blame shows you the commit associated with each line of code in the file, so you can see when a line was changed and what the code was before.

History shows you all the commits for that file and what changed each time. This is also helpful for seeing what the code looked like before, in case something changed that is no longer compatible with the notebook. For the most part they are good at keeping the notebooks up to date, but the code Jeremy shows in the video can easily get out of date.


I am having this same problem. Did you ever find a solution?

Note that the md = LanguageModelData.from_text_files(PATH, TEXT, **FILES, bs=bs, bptt=bptt, min_freq=3) line will error ("slice returned empty tensor") if any lines in the file are empty. I found that, given how I split up trn and val, I had empty lines only in the training set, so it seemed sensible to remove them; it'd probably be better to tweak the fast.ai library to not error on them.

You can use grep -cvP '\S' trn/trn.txt to count the whitespace-only lines, and sed -i '/^$/d' trn/trn.txt to remove the empty ones for now.
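If you'd rather do this from the notebook, here is a rough Python equivalent (it assumes the same trn/trn.txt path and drops whitespace-only lines):

from pathlib import Path

p = Path('trn/trn.txt')                  # same file as in the shell commands above
lines = p.read_text().splitlines()
kept = [l for l in lines if l.strip()]   # keep only lines with non-whitespace content
print(f'dropping {len(lines) - len(kept)} blank lines')
p.write_text('\n'.join(kept) + '\n')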

Never mind. I see that @Kjeanclaude’s suggestion combined with accuracy_np fixes the problem. Thanks!


How do I implement this for an LSTM?
m.rnn.weight_hh_l0.data.copy_(torch.eye(n_hidden))
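Not a definitive answer, but for what it's worth: in nn.LSTM the hidden-to-hidden matrix stacks the four gate matrices, so weight_hh_l0 has shape (4*n_hidden, n_hidden). One way to port the identity-init trick is to copy an identity block into each gate slice; this is a sketch with illustrative sizes, and whether it actually helps all four gates is a separate question.

import torch
import torch.nn as nn

n_hidden = 256                          # illustrative size only
lstm = nn.LSTM(input_size=n_hidden, hidden_size=n_hidden)

# weight_hh_l0 stacks the input/forget/cell/output gate matrices: (4*n_hidden, n_hidden)
for gate in range(4):
    rows = slice(gate * n_hidden, (gate + 1) * n_hidden)
    lstm.weight_hh_l0.data[rows].copy_(torch.eye(n_hidden))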

Jeremy, can you make a tutorial about data loaders for NLP tasks? Thanks :slight_smile:

I'm not sure if it's just me, but the images were not in the directories the library was expecting. After downloading and renaming the top-level directory to cifar10, I ran this script in each of the test and train folders to put the images into labeled directories.

declare -a classes=("plane" "automobile" "bird" "cat" "deer" "dog" "frog" "horse" "ship" "truck") && for i in ${classes[@]}; do mkdir ${i} && mv *${i}.png ${i}; done


It’s not just you, I’m also having the same issue.

This is how the dataset layout looks:

$ curl -sL http://pjreddie.com/media/files/cifar.tgz | tar -tzf- | head
cifar/
cifar/test/
cifar/test/3661_automobile.png
cifar/test/4572_bird.png
cifar/test/416_airplane.png
cifar/test/4863_automobile.png
cifar/test/9523_dog.png
cifar/test/3612_automobile.png
cifar/test/9090_bird.png
cifar/test/3443_deer.png

I wanted to make a PR, but it seemed somewhat intrusive to make a pull request for the notebook.

Here is my Python solution:

import os

def to_label_subdirs(path, subdirs, classes, labelfn):
    # Move every image in path/<subdir> into path/<subdir>/<label>/,
    # where the label is extracted from the filename by labelfn.
    for sd in subdirs:
        for rf in os.listdir(os.path.join(path, sd)):
            af = os.path.join(path, sd, rf)
            if not os.path.isfile(af):
                continue
            lb = labelfn(rf)
            if not lb:
                continue
            os.renames(af, os.path.join(path, sd, lb, rf))

Then, somewhere before the definition of get_data:

to_label_subdirs(PATH, 'train test'.split(), classes, lambda f: f[f.find('_')+1 : f.find('.')])

Also, it's a good idea to ensure that we've moved all images to their corresponding label directories:

!find {PATH}train {PATH}test -maxdepth 1 -type f | wc -l

The output should be 0.


Now it's not so convenient, because other threads have pushed the wiki threads off the top of the forum.

@rachel Is it possible to pin the wiki threads? They are very important for the course, but now one has to search the forums to get to them. It can be quite confusing for people who are just starting the course (it certainly was for me). There are links to the wikis from the pages with the video lectures, but it would be nice to have them organized in one place.

The other option is to create an index page with links to the wikis, and pin the index page instead.

Thanks for this trick!
Bruno

And use accuracy_np as well!
Thanks

Can someone explain the part at https://youtu.be/H3g26EVADgY?t=5340 ?
Why does SGD undo the normalization, while BatchNorm still works?

@jeremy Could you elaborate on that part a bit? After reading some articles, I get the intuition of why adding extra parameters could help in BatchNorm, but I still couldn't get what you mean when you say:

  1. SGD will undo it, and
  2. why adding scaling parameters addresses this "undo" issue.
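For reference, the standard batchnorm transform (from the Ioffe & Szegedy paper) is: normalize each activation using the batch statistics, then scale and shift it with learnable parameters; the γ and β below are the "scaling parameters" being asked about.

$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, \qquad y_i = \gamma \hat{x}_i + \beta$$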