@prabu Yes. Google used attention for picking out house numbers from Google Street View images, and then used digit recognition on the numbers to determine addresses.
Attention models are relatively new. Are neural networks used by apps like Google Assistant (for speech recognition), or is it some other system?
Do you know if the current work strictly uses the hidden states of the final RNN layer, or have people tried attention sums over indices from multiple layers (the way we used perceptual loss from multiple layers)?
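Just to make the question concrete, here's a tiny NumPy sketch (my own, purely illustrative; all names are made up) of what attending over states from every layer, rather than just the last one, might look like:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def multi_layer_attention(layer_states, query, W):
    # layer_states: list of (timesteps, hidden) arrays, one per RNN layer
    # query: (hidden,) decoder state; W: (hidden, hidden) learned weights
    # Stack states from every layer along the time axis, so the attention
    # sum ranges over (layer, timestep) indices instead of just timesteps.
    states = np.concatenate(layer_states, axis=0)  # (layers*timesteps, hidden)
    scores = np.tanh(states @ W) @ query           # one score per index
    weights = softmax(scores)                      # attention over all indices
    return weights @ states                        # weighted context vector
```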
@harveyslash Yes, Google uses deep learning for speech recognition. Vincent Vanhoucke, former tech lead of Google's speech recognition and now at Google Brain, even created the Udacity TensorFlow/deep learning course.
I have taken that course, but I don't think they covered exotic systems like what we do here. They mostly glossed over simpler concepts.
I did not get the "getting it to the right shape again" part. Would it be possible to explain it again?
@jeremy You showed how you figured out/tested the tensor shapes, but how did you debug the Attention class itself as a whole? Don't know if this is the right question to ask in this thread.
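For what it's worth, one generic way to sanity-check a single-input custom layer (not necessarily what was done in class; this harness is my own) is to wrap it in a tiny model and push random arrays of known shape through it:

```python
import numpy as np
from keras.layers import Input
from keras.models import Model

def check_layer(layer, input_shape):
    # Build a throwaway model containing only the layer under test.
    inp = Input(shape=input_shape)
    m = Model(inp, layer(inp))
    # Feed random data of a known shape and eyeball the output shape.
    x = np.random.rand(2, *input_shape).astype('float32')
    print(m.predict(x).shape)
```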
Are RNNs easier/cleaner in PyTorch? For example, would the Attention class have been relatively cleaner in PyTorch?
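For comparison, here's a rough sketch of what a simple additive-attention module might look like as a PyTorch `nn.Module` (my own sketch, not the course's code; names and shapes are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Attention(nn.Module):
    def __init__(self, hidden_dim):
        super().__init__()
        self.w = nn.Linear(hidden_dim * 2, hidden_dim)
        self.v = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, decoder_state, encoder_states):
        # decoder_state: (batch, hidden); encoder_states: (batch, time, hidden)
        expanded = decoder_state.unsqueeze(1).expand_as(encoder_states)
        # Score each encoder timestep against the current decoder state.
        scores = self.v(torch.tanh(self.w(
            torch.cat([expanded, encoder_states], dim=-1))))
        weights = F.softmax(scores, dim=1)            # (batch, time, 1)
        return (weights * encoder_states).sum(dim=1)  # context: (batch, hidden)
```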
I guess TimeDistributed doesn't do that? Because in that case, it would have made sense to do TimeDistributed(Lambda()) right before the RNN to apply those calculations to the input, no?
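Here's a minimal sketch of the construct I mean, with a toy per-timestep computation standing in for the real one (shapes and the doubling op are just placeholders):

```python
from keras.layers import Input, Lambda, TimeDistributed, LSTM
from keras.models import Model

inp = Input(shape=(20, 32))                               # (timesteps, features)
scaled = TimeDistributed(Lambda(lambda x: x * 2.0))(inp)  # op applied per timestep
out = LSTM(64)(scaled)                                    # RNN consumes the result
model = Model(inp, out)
```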
Any specific reason why we used tanh non-linearity as opposed to sigmoid?
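Not an answer, but for reference, tanh is just a shifted and rescaled sigmoid, so the main practical difference is the output range:

$$\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}} = 2\sigma(2x) - 1, \qquad \sigma(x) = \frac{1}{1 + e^{-x}}$$

tanh is zero-centered with range $(-1, 1)$, while sigmoid has range $(0, 1)$.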
You still have the audio popping/crackling problem from time to time.
Do you know if there are any training methods that capture the fact that "What is the population of Canada" and "What is Canada's population" are very nearly the same, and arguably identical/correct in meaning? My first instinct would be "use a vector representation instead of a binary to get your loss", but I can't think of an obviously sensible way to do that for sentences.
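The naive version of that instinct would be something like cosine similarity over averaged word vectors. A rough sketch (my own, with a hypothetical `word_vecs` dict mapping tokens to embedding vectors):

```python
import numpy as np

def sentence_vec(sentence, word_vecs):
    # Average the embeddings of the words we have vectors for.
    vecs = [word_vecs[w] for w in sentence.lower().split() if w in word_vecs]
    return np.mean(vecs, axis=0)

def similarity(s1, s2, word_vecs):
    a, b = sentence_vec(s1, word_vecs), sentence_vec(s2, word_vecs)
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# similarity("What is the population of Canada",
#            "What is Canada's population", word_vecs)  # should be near 1
```

Of course, bag-of-vectors ignores word order entirely, which is part of why there's no obviously sensible way to do this for sentences in general.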
Could we translate between Chinese and English using the method shown in class?
Can we do transfer learning on DenseNet?
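A minimal sketch of what that could look like, assuming a Keras-style pretrained DenseNet is available (the exact import path depends on your Keras/TF version, and the 10-class head is just an example):

```python
from keras.applications import DenseNet121
from keras.layers import GlobalAveragePooling2D, Dense
from keras.models import Model

base = DenseNet121(weights='imagenet', include_top=False,
                   input_shape=(224, 224, 3))
for layer in base.layers:
    layer.trainable = False               # freeze the pretrained trunk
x = GlobalAveragePooling2D()(base.output)
out = Dense(10, activation='softmax')(x)  # new head for our classes
model = Model(base.input, out)
model.compile(optimizer='adam', loss='categorical_crossentropy')
```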
Can you share the link to the densenet notebook we were going over at the end of the class? I can't find it among all the class resources.
I was super impressed and intrigued by the results Vincent got out of his style transfer implementation. Can't wait to see more!
Found it: https://github.com/fastai/courses/tree/master/deeplearning1/nbs
Edit: Oops, that is part 1.
Edit: Found this as well, but it doesn't include lesson 13: http://files.fast.ai/part2/