Lesson 12 (2019) discussion and wiki

probably using embeddings, but not sure…

Along the same lines of what Jeremy is talking about right now - is it possible to have a TDD approach when doing deep learning?

1 Like

Say more…
Linear combinations of numerics, datetimes, and one-hot encoded categorical variables seem pretty straightforward, no?
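
E.g. something like this for a batch of already-numeric features — just a rough sketch, the α value and shapes are placeholders, not what Jeremy used:

```python
import torch

def mixup_batch(x, y, alpha=0.4):
    "Mix each row of x (and its target) with a randomly paired row, mixup-style."
    lam = torch.distributions.Beta(alpha, alpha).sample()
    idx = torch.randperm(x.size(0))
    x_mix = lam * x + (1 - lam) * x[idx]
    y_mix = lam * y + (1 - lam) * y[idx]
    return x_mix, y_mix

# e.g. a batch of 32 rows with 10 continuous features
x = torch.randn(32, 10)
y = torch.randn(32, 1)
x_mix, y_mix = mixup_batch(x, y)
```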

3 Likes

Is there any work on building open source simulators/debuggers that can help debug the kinds of issues Jeremy just described, without having to spend $$$?

I don’t even know if this is possible! Just curious.

1 Like

Quote of the day (month? year?):
“Training models sucks. And deep learning is a miserable experience and you shouldn’t do it.” - Jeremy Howard

8 Likes

Reminder to upvote & ask questions here for the last 10 minutes of class:

For what it’s worth, I’ve heard Jeremy say this several times now: he’s worked a lot with TDD in the past, but in DL he seems to prefer working with notebooks, which is in itself a kind of micro-TDD, if you have the discipline to check everything as you progress, as he just mentioned.
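
In practice those “micro-TDD” checks can just be asserts sprinkled through the notebook, something like this (the model, shapes, and number of steps here are made up):

```python
import torch
import torch.nn.functional as F

model = torch.nn.Linear(10, 2)                        # stand-in for whatever you're building
xb, yb = torch.randn(64, 10), torch.randint(0, 2, (64,))

# check the forward pass produces what you expect
out = model(xb)
assert out.shape == (64, 2)
assert torch.isfinite(out).all()

# check the model can actually learn: loss on one batch should drop
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_before = F.cross_entropy(model(xb), yb).item()
for _ in range(20):
    opt.zero_grad()
    loss = F.cross_entropy(model(xb), yb)
    loss.backward()
    opt.step()
assert F.cross_entropy(model(xb), yb).item() < loss_before
```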

6 Likes

What does Jeremy mean by scientific journal? Is it a file where all code goes by date? What is the best way to keep that?

5 Likes

What does Jeremy mean when he says “lab notes”? Is it a physical paper notebook he records stuff on? Or is there some software he’s using to record experiments/results/progress?

Edit: @maxim.pechyonkin same question at the same time! :raised_hands:

2 Likes

I guess it is more about using some digital notebook, e.g. OneNote or Evernote, so you can easily dump files, attach logs, diagrams, etc.

2 Likes

I think it’s just a traditional text file with the results of the experiments copy-pasted in.

3 Likes

Not really. I just see it as combining 0.3 of the Feb 2nd and 0.7 of the Feb 15th for Feb 11th… it’s probably just the two combined images that are confusing me.

Any sort of version control system is essential, even for teams of 1. That way you can go back to any point in the history of your project. And sometimes you need to.

I use Evernote for this exact purpose. Every research project has a notebook. I keep records of changes, plans for future changes, and try to record results along the way.

3 Likes

Notion is also a fabulous tool for this.

5 Likes

What about just cloning notebooks after each experiment and keeping the .ipynb files in a version control system? I am going with this strategy and it seems to be working so far. Is there a reason why copying results into a text file would be better?

“That we’ll transfer in a future lesson” could become a meme of this course :wink:

As I understand it, it’s a linear combination of things in vector space. Images are already tensors, so that’s easy. For categorical data you would need to convert all categoricals to embeddings and whatnot.

So if you have day_of_month and month categoricals, you would need to pass them through embeddings first.
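
Roughly, the mixing would then happen on the embedding outputs rather than on the raw category indices — something like this (the embedding sizes and λ here are made up):

```python
import torch
import torch.nn as nn

emb_day   = nn.Embedding(31, 8)   # day_of_month -> 8-d vector
emb_month = nn.Embedding(12, 4)   # month        -> 4-d vector

def embed(day, month):
    "Look up and concatenate the categorical embeddings for a batch."
    return torch.cat([emb_day(day), emb_month(month)], dim=1)

day   = torch.randint(0, 31, (16,))
month = torch.randint(0, 12, (16,))

lam = 0.7                                   # would come from a Beta distribution in practice
idx = torch.randperm(day.size(0))
x = embed(day, month)
x_mix = lam * x + (1 - lam) * x[idx]        # linear combination in embedding space
```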

2 Likes

Transformer-XL has state.

Regarding applying mixup to NLP or other fields where you have categorical inputs, is it necessary to have pretrained embeddings first to do the mixups on? It doesn’t seem obvious to me how to do mixup when you’re dynamically learning the embedding weights as well. And you certainly don’t wanna do mixup on one-hot encoded inputs before embedding them.
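
One thing that seems possible (just a sketch of the idea, not something from the lesson): do the mixing after the embedding layer inside the forward pass, so the embedding weights still get gradients from the mixed examples even though they’re being learned from scratch. All the names and sizes below are placeholders:

```python
import torch
import torch.nn as nn

class MixupClassifier(nn.Module):
    "Toy model that mixes *after* the (jointly learned) embedding layer."
    def __init__(self, n_tokens, emb_dim, n_classes):
        super().__init__()
        self.emb  = nn.Embedding(n_tokens, emb_dim)   # no pretraining assumed
        self.head = nn.Linear(emb_dim, n_classes)

    def forward(self, tokens, lam=None, idx=None):
        x = self.emb(tokens).mean(dim=1)              # crude bag-of-embeddings pooling
        if lam is not None:                           # mixup on the embedded representation
            x = lam * x + (1 - lam) * x[idx]
        return self.head(x)

model  = MixupClassifier(n_tokens=1000, emb_dim=32, n_classes=2)
tokens = torch.randint(0, 1000, (8, 20))              # batch of 8 sequences, length 20
lam    = torch.distributions.Beta(0.4, 0.4).sample()
idx    = torch.randperm(tokens.size(0))
logits = model(tokens, lam, idx)                      # targets get mixed with the same lam/idx
```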

2 Likes