Part 2 blogs

I’ve been trying to check out Algorithmia hosting for a while… thanks!

1 Like

I’ve tried to explain everything I learned about Leslie Smith’s 1cycle policy and to provide an example of super-convergence in this blog post, accompanied by this notebook.
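In a nutshell, the schedule ramps the learning rate linearly from a low value up to a maximum and then back down over one cycle (the full policy also cycles momentum in the opposite direction and ends with a short annihilation phase at a much lower rate, which I leave out here). A rough sketch, with illustrative names rather than the fastai API:

```python
def one_cycle_lr(step, total_steps, lr_max, div_factor=25.0):
    """Piecewise-linear 1cycle schedule (annihilation phase omitted).

    All names here are illustrative, not taken from the fastai API.
    """
    lr_min = lr_max / div_factor
    half = total_steps // 2
    if step < half:                 # first half: ramp up
        frac = step / half
    else:                           # second half: ramp back down
        frac = 1 - (step - half) / (total_steps - half)
    return lr_min + frac * (lr_max - lr_min)

# e.g. inspect the schedule over 1000 steps
lrs = [one_cycle_lr(s, 1000, lr_max=1.0) for s in range(1000)]
```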

I’d love to hear your thoughts!

11 Likes

That’s so great. The only suggestion I have is to explicitly show the key results from the notebook in the blog post (e.g. without super-convergence, 93% on CIFAR-10 takes (x) epochs, but with 1cycle etc. it takes (y) epochs).

Great suggestion, thanks!

Hi everyone,

this is not really a blog post. Rather, it’s a notebook where I’m putting an idea out there that I’ve had for a few months now. I’m working with weather and climate models, and we always have the problem that we have dozens of tuning parameters, which are usually hand-tuned. I’ve been wondering whether it’s possible to write a dynamical model in PyTorch and use auto-differentiation to learn the parameters.

So I wrote the famous Lorenz chaos model in PyTorch, and the parameter learning works!
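To make the idea concrete, here is a rough sketch (with made-up names, not the notebook’s code): generate a “true” trajectory with the classic parameter values, then recover sigma, rho and beta by gradient descent through the unrolled integration:

```python
import torch

def lorenz_step(state, sigma, rho, beta, dt=0.01):
    """One explicit-Euler step of the Lorenz-63 system."""
    x, y, z = state
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return torch.stack([x + dt * dx, y + dt * dy, z + dt * dz])

def rollout(state, sigma, rho, beta, n_steps):
    """Unroll the integration so autograd can differentiate through it."""
    states = []
    for _ in range(n_steps):
        state = lorenz_step(state, sigma, rho, beta)
        states.append(state)
    return torch.stack(states)

x0 = torch.tensor([1.0, 1.0, 1.0])
with torch.no_grad():  # "observed" trajectory from the classic parameters
    target = rollout(x0, torch.tensor(10.0), torch.tensor(28.0),
                     torch.tensor(8.0 / 3.0), n_steps=50)

# Learnable parameters, deliberately initialized wrong
sigma = torch.tensor(7.0, requires_grad=True)
rho = torch.tensor(20.0, requires_grad=True)
beta = torch.tensor(2.0, requires_grad=True)
opt = torch.optim.Adam([sigma, rho, beta], lr=0.05)

for _ in range(500):
    opt.zero_grad()
    pred = rollout(x0, sigma, rho, beta, n_steps=50)
    loss = ((pred - target) ** 2).mean()
    loss.backward()  # gradients flow through the whole unrolled model
    opt.step()
```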

I’m not sure whether this idea is obvious or whether it’s actually a cool one. So if any of you have experience with dynamical systems, feel free to have a look. I’d love any kind of feedback! In any case, here it is:

5 Likes

I’ve been working on a blog post about semantic segmentation and would love to hear your feedback. Please feel free to leave comments with highlights. Here is the edit link: https://medium.com/p/d8d6f6005066/edit

Cheers

Kerem

5 Likes

I liked it and read it in one sitting, thanks. I just commented on some minor misspellings and link errors.

Still a draft? Not published yet? I wanted to clap :stuck_out_tongue:

I saw your comments, thank you so much for helping :slight_smile:

Haha soon :slight_smile: After working with @parrt on rfpimp I learned that we should be more meticulous and patient.

1 Like

My first attempt at a blog post. I would love to get some feedback on this. Should I dive more into the details on the technical side? My target audience here is somebody who may not know much about machine learning but is interested in the technology.

3 Likes

The post is up now; it’s specifically about U-Net. Thanks, everyone, for the feedback!

2 Likes

Yes, I think a lot more technical detail would be a good idea, especially since people need to understand the constraints of such a system. Congrats on your progress so far!

When you talk about the technical details, do you think I should cover the different pieces (embedding size, dropout, etc.), or just make sure people understand the limitations of using something like this? This type of system, at least as it currently stands, isn’t going to generate meaningful SQL for you, but it can make a much better prediction of the next word you are typing in your SQL query, which can make a SQL developer much more efficient.

I might start a new post describing how to build the model, aimed at a more technical reader, and keep this one less technical but still give people a good idea of what kinds of things are possible. My goal for post number two would be to give somebody enough information to build the same thing that I have built (with the help of everybody on here, especially Hiromi) from the description. The technical post will be much more of a challenge for me, but I will learn more too, since I really have to know what I’m talking about if I’m trying to share the information with other people.

Mainly conceptually what’s going on.

1 Like

Read your blogs and saw your notebooks @sgugger - they’re all awesome and I learnt a lot. Thank you!

4 Likes

Hi all! I’m working on a blog post about Stochastic Weight Averaging, a new training method I recently added to the fast.ai library, and I wanted to get some feedback here before publishing and sharing with the rest of the world: https://medium.com/@hortonhearsafoo/adding-a-cutting-edge-deep-learning-training-technique-to-the-fast-ai-library-2cd1dba90a49
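The core idea, roughly (a hypothetical minimal sketch with made-up names, not the actual code I added to the library): keep a second copy of the model and, once averaging starts, fold the current weights into a running average at the end of each epoch:

```python
import copy

def update_swa(swa_model, model, n_averaged):
    # Running average: swa <- (swa * n + w) / (n + 1)
    for swa_p, p in zip(swa_model.parameters(), model.parameters()):
        swa_p.data += (p.data - swa_p.data) / (n_averaged + 1)

swa_model = copy.deepcopy(model)   # holds the averaged weights
n_averaged = 0
for epoch in range(n_epochs):      # model, n_epochs, swa_start, and
    train_one_epoch(model)         # train_one_epoch are assumed to exist
    if epoch >= swa_start:
        update_swa(swa_model, model, n_averaged)
        n_averaged += 1

# Before evaluating swa_model, its batch-norm running statistics need to
# be recomputed with a forward pass over the training data.
```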

Appreciate any comments!

7 Likes

I think it’s great! The only minor suggestions I have are:

  • In the introduction, include a clear summary of the impact of SWA that you observed, e.g. something like (making the numbers up here):
    • “I saw a decrease in error from 5% to 4.5%, a relative improvement of 10%.”
    • “I was able to replicate the SGD error in 15% less time using SWA.”

The second of these may require running more experiments to find the number of epochs and hyperparameters that minimize the time to reach a specific accuracy, but I think it would be a very useful result to show (assuming it is actually faster!).

The only other suggestion: at the end, where you say your numbers are “higher”, say “better” instead, since higher numbers are often worse (e.g. when talking about loss) and you don’t want people to misunderstand! You might also want to include your hypotheses as to why they’re better (reflection padding?).

Nice job. I thought it was at a good technical level for me. One thing: I didn’t see a link to the actual paper, which would be nice to have in the “The paper” section. That way, if I want to refer back to the paper, I can go straight there from your article. Small detail, but just something I noticed.

2 Likes

Good catch, I added a link to the paper

1 Like