Lesson 12 in class

harveyslash · March 28, 2017, 3:27am

So it's possible for a badly trained network to just continue the output and fill out the entire max-length output?

cody · March 28, 2017, 3:28am

Is it possible within this dataset to separate out proper nouns from non-proper nouns, and check accuracy on each group? Since it seems like proper nouns (like McConville, or Missoula) may have more non-standard spellings if they are basically relics of former spellings of things.

kpatnaik · March 28, 2017, 3:30am

Q: Do you think trying beam search instead of replicating the last layer might give better results?

rachel · March 28, 2017, 3:30am

@harveyslash yes, that could happen

rachel · March 28, 2017, 3:33am

@cody that’s a reasonable hypothesis. you could probably use a dictionary dataset to label nouns as improper/proper in the original training set

Even · March 28, 2017, 3:50am

Won’t the weightings be heavily impacted by the padding done to the input set?

kpatnaik · March 28, 2017, 3:53am

Is “a” shared among all i-j pairs? Or do we train a separate alignment for each pair?

jeremy · March 28, 2017, 6:21pm

That was our friends from Kaiser - thanks for the offer; I’ll definitely take you up on that as more progress is made.

jeremy · March 28, 2017, 6:21pm

Oops - I forgot to discuss this! Will do next week.

jeremy · March 28, 2017, 6:22pm

We’ll discuss beam search next week.

EricPB · August 18, 2017, 7:33pm

Note: the complete collection of Part 2 video timelines is available in a single thread for keyword search.
Part 2: complete collection of video timelines

Here’s the Lesson 12 video timeline, probably the most theoretical lesson so far

Lesson 12 video timeline:

00:00:05 K-means clustering in TensorFlow

https://youtu.be/jy1w0mPCHb0?t=5s

00:06:00 ‘find_initial_centroids’, a simple heuristic

https://youtu.be/jy1w0mPCHb0?t=6m

00:12:30 A trick to make TensorFlow feel more like PyTorch
& other tips around Broadcasting, GPU tensors and co.

https://youtu.be/jy1w0mPCHb0?t=12m30s

00:24:30 Student’s question about “figuring out the number of clusters”

https://youtu.be/jy1w0mPCHb0?t=24m30s

00:26:00 “Step 1 was to copy our initial_centroids and copy them into our GPU”,
“Step 2 is to assign every point and assign them to a cluster”

https://youtu.be/jy1w0mPCHb0?t=26m

00:29:30 ‘Dynamic_partition’, one of the crazy GPU functions in TensorFlow

https://youtu.be/jy1w0mPCHb0?t=29m30s

00:37:45 Digression: “Jeremy, if you were to start a company today, what would it be?”

https://youtu.be/jy1w0mPCHb0?t=37m45s

00:40:00 Intro to next step: NLP and translation deep-dive, with CMU pronouncing dictionary
via spelling_bee_RNN.ipynb

https://youtu.be/jy1w0mPCHb0?t=40m

00:55:15 Create spelling_bee_RNN model with Keras

https://youtu.be/jy1w0mPCHb0?t=55m15s

01:17:30 Question: “Why not treat text problems the same way we do with images?”

https://youtu.be/jy1w0mPCHb0?t=1h17m30s

01:26:00 Graph for Attentional Model on Neural Translation

https://youtu.be/jy1w0mPCHb0?t=1h26m

01:32:00 Attention Models (cont.)

https://youtu.be/jy1w0mPCHb0?t=1h32m

01:37:20 Neural Machine Translation (research paper)

https://youtu.be/jy1w0mPCHb0?t=1h37m20s

01:44:00 Grammar as a Foreign Language (research paper)

https://youtu.be/jy1w0mPCHb0?t=1h44m
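
For anyone following the k-means entries above (12:30 and 26:00), here is a minimal NumPy sketch of the "assign every point to a cluster" step using broadcasting, followed by a centroid update. This is not the lesson's TensorFlow code: the function names `assign_clusters` and `update_centroids` and the toy data are assumptions made for illustration only.

```python
import numpy as np

def assign_clusters(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    # points: (n, d), centroids: (k, d).
    # Broadcasting (n, 1, d) against (1, k, d) gives all pairwise differences.
    diffs = points[:, None, :] - centroids[None, :, :]   # shape (n, k, d)
    sq_dists = (diffs ** 2).sum(axis=2)                   # shape (n, k)
    return sq_dists.argmin(axis=1)                        # shape (n,)

def update_centroids(points, labels, k):
    """Move each centroid to the mean of the points assigned to it.

    Assumes every cluster has at least one point (hypothetical simplification).
    """
    return np.stack([points[labels == j].mean(axis=0) for j in range(k)])

# Toy data: two obvious groups in 2-D.
points = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
initial_centroids = np.array([[0.0, 0.0], [10.0, 10.0]])

labels = assign_clusters(points, initial_centroids)
centroids = update_centroids(points, labels, k=2)
```

Repeating these two steps until the labels stop changing gives the basic k-means loop. Handling empty clusters and choosing the initial centroids are left out of this sketch.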