What foundational topics would you like to hear more about?

How to maintain backward compatibility for your existing models in production when you’re forced to upgrade fastai to get new features or bug fixes. For me, it’s been a nightmare having to retrain and redeploy production models: they hit various errors, I check the forums, and the answer is to upgrade fastai. However, after I upgrade, the old learner I saved fails to load correctly and sometimes I have to change the syntax of my code. Then a few days later I encounter a new error, check the forums, find I need to update fastai again, and end up having to retrain my models yet again, ad infinitum.

I’d like to learn a more effective way to go about this.

I like that analogy :slight_smile:
When I hear “cross entropy”, my brain goes back to entropy vs. enthalpy in physics classes, and that doesn’t make a whole lot of sense. Thank you for making it easier to remember!

1 Like

That does sound very annoying! You shouldn’t need to retrain your models - since we’re not doing things that need your weights to change. They’re still just plain pytorch models. Please do let me know if this happens again so we can better understand the issue and figure out how to deal with it.

3 Likes

There is a YouTube video that explains entropy. Maybe it’s helpful.

1 Like

I’m really trying to understand the source code by using the debugger and stepping through things, but I often struggle to follow it completely.

For instance, I’ve spent the last few nights trying to figure out how exactly the pct_start argument in OneCycleScheduler affects the shape of the learning rate curve. I know that it controls the number of iterations where the learning rate is increasing, but I want to know exactly how it does that. As another example, I’m trying to figure out what transforms are applied to the response when tfm_y = True. I’m having trouble hooking into the right point in the library to even start the debugging exploration.
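My mental model of pct_start, written out as code. This is an illustrative sketch of the general 1cycle idea (warm up for the first pct_start fraction of iterations, then anneal back down), not fastai’s actual implementation, and the div_factor name is just my placeholder for the start/peak ratio:

```python
import math

def one_cycle_lr(it, total_its, lr_max, pct_start=0.3, div_factor=25.0):
    """Sketch of a 1cycle schedule: cosine warm-up for the first
    pct_start fraction of iterations, cosine anneal for the rest."""
    warmup_its = int(total_its * pct_start)
    lr_start = lr_max / div_factor
    if it < warmup_its:
        # cosine ramp from lr_start up to lr_max
        pct = it / max(1, warmup_its)
        return lr_start + (lr_max - lr_start) * (1 - math.cos(math.pi * pct)) / 2
    # cosine anneal from lr_max back down toward zero
    pct = (it - warmup_its) / max(1, total_its - warmup_its)
    return lr_max * (1 + math.cos(math.pi * pct)) / 2

# pct_start=0.3 means the LR peaks 30% of the way through training
print(one_cycle_lr(30, 100, lr_max=0.1, pct_start=0.3))  # 0.1 at the peak
```

So pct_start only moves the location of the peak; the shapes on either side stay the same.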

So in summary, I would be interested in knowing some tricks to better trace the fastai source code to get answers myself that aren’t necessarily in the documentation.
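One generic trick that helps me: use the inspect module to find exactly where a callable is defined, then set a breakpoint there or step into it with pdb. A stdlib function stands in for a fastai callable below, but the same approach works for anything importable:

```python
import inspect
import json  # json.dumps stands in for any fastai function you want to trace

def find_source(obj):
    """Return where a function/class is defined so you can set a breakpoint there."""
    return inspect.getsourcefile(obj), inspect.getsourcelines(obj)[1]

path, lineno = find_source(json.dumps)
print(f"json.dumps is defined at {path}:{lineno}")

# Then open that file at that line, or break straight into the call at runtime:
# import pdb; pdb.runcall(json.dumps, {"a": 1})
```

In a notebook, `json.dumps??` gets you the same source listing with less typing.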

I also echo the requests for understanding callbacks better. I know they’re very powerful for customizing the fitting process, but I have trouble understanding how they are managed by the CallbackHandler and when they actually get executed.
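The pattern clicked for me once I wrote a toy version. This is a stripped-down sketch of the general handler-dispatches-to-callbacks idea, not fastai’s actual CallbackHandler API (the event names and fake loss are mine):

```python
class Callback:
    def on_epoch_begin(self, epoch): pass
    def on_batch_end(self, batch, loss): pass

class PrintLoss(Callback):
    def on_batch_end(self, batch, loss):
        print(f"batch {batch}: loss={loss:.3f}")

class CallbackHandler:
    """Owns the list of callbacks; the training loop just fires named events."""
    def __init__(self, callbacks):
        self.callbacks = callbacks
    def __call__(self, event, *args):
        for cb in self.callbacks:
            getattr(cb, event)(*args)

def fit(n_epochs, n_batches, handler):
    for epoch in range(n_epochs):
        handler("on_epoch_begin", epoch)
        for batch in range(n_batches):
            loss = 1.0 / (1 + epoch * n_batches + batch)  # fake decreasing loss
            handler("on_batch_end", batch, loss)

fit(1, 2, CallbackHandler([PrintLoss()]))
```

The key point is that callbacks are executed only at the named hook points inside fit, in the order they appear in the handler’s list.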

1 Like

When I go to Kaggle many of the competitions are looking for the ROC curve.
I’ve seen videos and demos but I still don’t quite grasp the way to do it.

Is there any way to explain the best way to generate the ROC curve?
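For what it’s worth, building the curve by hand is what made it click for me. Each predicted score becomes a threshold; at each threshold you record the false positive rate and true positive rate. In practice you’d use sklearn.metrics.roc_curve, but here’s a minimal pure-Python sketch of what it computes:

```python
def roc_points(y_true, y_score):
    """Sweep a threshold over the scores, collecting (FPR, TPR) points."""
    # sort by score, descending: each prediction becomes a threshold
    order = sorted(range(len(y_score)), key=lambda i: -y_score[i])
    pos = sum(y_true)
    neg = len(y_true) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points  # plot these; AUC is the area under them

y_true  = [1, 1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
print(roc_points(y_true, y_score))
```

Plotting those (FPR, TPR) pairs gives the ROC curve, and a perfect ranker would reach (0, 1) before any false positives appear.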

Not a great solution but I keep two Anaconda environments for fastai. One is a “bleeding edge” install that I keep updated for the Part 2 course. The other is for projects I’m working on that’s a few versions behind.

This guy also has an excellent way of explaining it and how it’s related to Shannon’s information theory: https://www.youtube.com/watch?v=9r7FIXEAGvs
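The information-theory framing is easy to check numerically: cross entropy is the expected “surprise” -Σ p·log(q), so it drops when the predicted distribution q matches the true one p. A tiny sketch (the numbers here are just made-up example distributions):

```python
import math

def cross_entropy(p, q):
    """H(p, q) = -sum(p_i * log(q_i)) over classes where p_i > 0."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

true = [0.0, 1.0, 0.0]           # one-hot "true" label
good = [0.1, 0.8, 0.1]           # confident, correct prediction
bad  = [0.8, 0.1, 0.1]           # confident, wrong prediction
print(cross_entropy(true, good))  # -log(0.8) ≈ 0.223
print(cross_entropy(true, bad))   # -log(0.1) ≈ 2.303
```

With a one-hot target it reduces to -log of the probability assigned to the correct class, which is exactly the negative log likelihood loss used in classification.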

@hiromi: one problem with the loss functions you are considering is that they don’t have the first property that you mentioned – i.e. that the derivative w.r.t y vanishes for y = y_hat.

It doesn’t have to be zero there. It just needs to be a smaller number when they’re closer.
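A quick numeric illustration with squared error, which happens to have both properties: the gradient dL/dŷ = 2(ŷ - y) shrinks smoothly as the prediction approaches the target, and is zero exactly at ŷ = y (though, per the point above, that last part isn’t required):

```python
def grad_mse(y_hat, y):
    """Derivative of (y_hat - y)**2 with respect to y_hat."""
    return 2 * (y_hat - y)

y = 1.0
for y_hat in (3.0, 2.0, 1.5, 1.0):
    print(y_hat, grad_mse(y_hat, y))  # gradient: 4.0, 2.0, 1.0, 0.0
```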

I really appreciated the *args and **kwargs explanation in lesson 10 :slight_smile:
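For anyone who missed that part, the pattern in a nutshell: *args collects extra positional arguments into a tuple, **kwargs collects extra keyword arguments into a dict, and the same stars unpack them again when passing through:

```python
def show(*args, **kwargs):
    return args, kwargs

args, kwargs = show(1, 2, lr=0.1)
print(args)    # (1, 2)
print(kwargs)  # {'lr': 0.1}

def wrapper(*args, **kwargs):
    # pass everything through untouched -- the pattern libraries use
    return show(*args, **kwargs)
```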

I just found this fast.ai Wiki: Deep Learning Glossary - super helpful!

2 Likes

I was looking for a fast.ai wiki Deep Learning Glossary that students can edit and add terms as they are introduced in class. (Thus my excitement when I found the page linked in my last post.)

But now I’m confused, because I can’t login and it was migrated to this ML Cheatsheet page, which was last updated in 2017.

Can we have a fast.ai Deep Learning Glossary as an editable Topic in the part 2 section of the forum?

3 Likes

Yes! If you create one then I will wikify it and link to it from the official resources once it’s fleshed out a bit.

1 Like

Since there has been a lot of discussion around s/w engineering, I thought I’d ask this:
what are the best practices to follow in structuring a project? In terms of

  1. Repository structure
  2. Maintaining package dependencies
  3. Using relative paths vs. absolute paths
    etc

I found this link https://docs.python-guide.org/writing/structure/ useful.

1 Like

That error seemed to be happening a lot when plotting the learning rate graphs: errors in loading the state dict. At least from searching the forums here, there doesn’t seem to be much consensus on fixing this when it randomly occurs, other than upgrading fastai. And when I upgrade fastai, the old models I’ve saved using learner.export fail to load. Maybe you could talk in greater detail about alternative ways of saving a model, say, using PyTorch weights. It seems manually loading and exporting PyTorch weights is non-trivial, and I don’t recall it being discussed in part 1 (but please point me there if it was).

Not sure if it’s the best solution, but as a result of this I rarely plot the learning rate curves anymore, especially since it sometimes takes even longer than fitting a single cycle. I start my training with an LR of 0.1 and monitor the first epoch closely. If the error blows up, or at least doesn’t go down, I interrupt training and decrease the LR by factors of 10 until it does.

If you see it happen again please let me know - I’m not sure what that’s referring to and would like to make sure it doesn’t happen again!

It’s pretty trivial - have a look at the code we use in fastai for saving. The only thing we add to PyTorch is (optionally) saving optimizer state IIRC.
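For anyone reading along, the plain-PyTorch route looks roughly like this. A sketch only: the architecture here is a stand-in (the real one is whatever your Learner wraps), and you must recreate the same architecture in code before loading the weights back:

```python
import torch
import torch.nn as nn

# Stand-in model -- in practice this is learner.model
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

# Save only the weights, decoupled from any fastai version
torch.save(model.state_dict(), "model_weights.pth")

# Later (possibly after a fastai upgrade): rebuild the same architecture
# in code, then load the saved weights into it.
model2 = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model2.load_state_dict(torch.load("model_weights.pth"))
model2.eval()
```

Since the file contains only tensors keyed by layer name, it survives library upgrades as long as your model-building code produces the same architecture.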

The loading state dict error seems to be happening a lot with smaller datasets. I can’t share the confidential data that’s causing it but I’m gonna try to reproduce it on my own datasets instead and update you on it.

1 Like

I saw this medium post a while ago which did a pragmatic job of explaining it, it might help: https://medium.com/@alexabate/i-did-something-boring-so-you-dont-have-to-9140ca46c84d

1 Like

Another topic suggestion: more material on object detection, specifically recognizing text in images.

I’m surprised there are relatively few working examples of this out there on, say, GitHub, compared to the thousands of cat vs. dog classifiers and face generators, even though this is a very pressing problem in the business world. Maybe the problem is orders of magnitude more difficult?

Yes, I’ve looked at solutions like Tesseract (works great on actual documents but poorly on, say, signs and billboards), AWS, Azure, etc., but they’re all pretty mediocre right now.

3 Likes

I usually refer people to Google’s ML Kit when they want to do OCR on mobile devices, because it works quite well.

To see what’s involved in building your own OCR system using machine learning, check out this blog post from Dropbox: https://blogs.dropbox.com/tech/2017/04/creating-a-modern-ocr-pipeline-using-computer-vision-and-deep-learning/

1 Like