So it's possible for a badly trained network to just continue the output and fill the entire max-length output?
Is it possible within this dataset to separate out proper nouns from non-proper nouns, and check accuracy on each group? Since it seems like proper nouns (like McConville, or Missoula) may have more non-standard spellings if they are basically relics of former spellings of things.
Q: Do you think trying beam search instead of replicating the last layer might result in better results?
@harveyslash yes, that could happen
@cody that’s a reasonable hypothesis. You could probably use a dictionary dataset to label nouns as proper or common in the original training set.
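A minimal sketch of what that per-group accuracy check could look like (purely illustrative: `is_proper` just uses capitalization as a cheap proxy, and a real dictionary lookup would be more reliable):

```python
def is_proper(word):
    # Cheap proxy: treat capitalized words as proper nouns (e.g. "Missoula").
    return word[:1].isupper()

def accuracy_by_group(words, preds, targets):
    """Compute accuracy separately for proper and common nouns."""
    groups = {"proper": [0, 0], "common": [0, 0]}  # [correct, total]
    for w, p, t in zip(words, preds, targets):
        g = groups["proper" if is_proper(w) else "common"]
        g[0] += int(p == t)
        g[1] += 1
    return {k: (c / n if n else None) for k, (c, n) in groups.items()}
```

Then you'd just pass in the test words with the predicted and reference spellings and compare the two numbers.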
Won’t the weightings be heavily impacted by the padding done to the input set?
is “a” shared among all i-j pairs, or do we train a separate alignment for each pair?
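For what it’s worth, my understanding is that the alignment function is one small network whose weights are shared across all (i, j) pairs, rather than a separate network per pair. A toy sketch in plain Python (all names illustrative, random toy weights):

```python
import math
import random

random.seed(0)

def rand_vec(n):
    return [random.uniform(-0.1, 0.1) for _ in range(n)]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

class Alignment:
    """One alignment network; its weights are shared across every (i, j) pair."""
    def __init__(self, d):
        self.W_s = [rand_vec(d) for _ in range(d)]  # projects the decoder state
        self.W_h = [rand_vec(d) for _ in range(d)]  # projects the encoder state
        self.v = rand_vec(d)

    def score(self, s_i, h_j):
        # e_ij = v . tanh(W_s s_i + W_h h_j) -- same v, W_s, W_h for every i, j
        hidden = [math.tanh(a + b)
                  for a, b in zip(matvec(self.W_s, s_i), matvec(self.W_h, h_j))]
        return sum(vk * hk for vk, hk in zip(self.v, hidden))

d = 4
a = Alignment(d)  # built once ...
decoder_states = [rand_vec(d) for _ in range(3)]
encoder_states = [rand_vec(d) for _ in range(5)]
# ... and the same a.score is reused for every decoder/encoder pair:
scores = [[a.score(s, h) for h in encoder_states] for s in decoder_states]
```

So there's one set of parameters; what varies per pair is only the inputs it's applied to.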
That was our friends from Kaiser - thanks for the offer; I’ll definitely take you up on that as more progress is made.
Oops - I forgot to discuss this! Will do next week.
We’ll discuss beam search next week.
Note: the complete collection of Part 2 video timelines is available in a single thread for keyword search.
Part 2: complete collection of video timelines
Here’s the Lesson 12 video timeline, probably the most theoretical lesson so far
Lesson 12 video timeline:
00:00:05 K-means clustering in TensorFlow
00:06:00 ‘find_initial_centroids’, a simple heuristic
00:12:30 A trick to make TensorFlow feel more like Pytorch
& other tips around Broadcasting, GPU tensors and co.
00:24:30 Student’s question about “figuring out the number of clusters”
00:26:00 “Step 1 was to copy our initial_centroids and copy them into our GPU”,
“Step 2 is to assign every point and assign them to a cluster”
00:29:30 ‘Dynamic_partition’, one of the crazy GPU functions in TensorFlow
00:37:45 Digression: “Jeremy, if you were to start a company today, what would it be?”
00:40:00 Intro to next step: NLP and translation deep-dive, with CMU pronouncing dictionary
00:55:15 Create spelling_bee_RNN model with Keras
01:17:30 Question: “Why not treat text problems the same way we do with images?”
01:26:00 Graph for Attentional Model on Neural Translation
01:32:00 Attention Models (cont.)
01:37:20 Neural Machine Translation (research paper)
01:44:00 Grammar as a Foreign Language (research paper)