About the fastai dev category

jeremy · July 19, 2018, 5:09pm

This category is for discussion of development of fastai.

Here are some ways that you can learn a lot about the library, whilst also contributing to the community:

Document something that is currently undocumented.
Add an example of use to the docs for something that doesn’t currently have an example of use. We’d like everything in the docs to include an actual piece of working code demonstrating it.

Note for new contributors

It can be tempting to jump into a new project by questioning stylistic decisions that have been made, such as naming, formatting, and so forth. This is especially so for Python programmers coming to this project, which is unusual in following a number of conventions that are common in other programming communities, but not in Python. However, please don’t do this, for (amongst others) the following reasons:

Contributing to Parkinson’s law of triviality has negative consequences for a project. Let’s focus on deep learning!
It’s exhausting to repeat the same discussion over and over again, especially when it’s been well documented already
You’re likely to get a warmer welcome from the community if you start out by contributing something that’s been requested on the forum, since you’ll be solving someone’s current problem
If you start out by just telling us your point of view, rather than studying the background behind the decisions that have been made, you’re unlikely to be contributing anything new or useful
I’ve been writing code for nearly 40 years now, across dozens of languages, and other folks involved have quite a bit of experience too - the approaches used are based on significant experience and research. Whilst there’s always room for improvement, it’s much more likely you’ll be making a positive contribution if you spend a few weeks studying and working within the current framework before suggesting wholesale changes.

wgpubs · July 20, 2018, 5:47pm

Suggestion: Name the environment something like fastai-v1 so as to not affect folks using the current framework.

I’m glad to submit a PR so lmk.

-Wayde

wdhorton · July 21, 2018, 5:53pm

I’d like to help out! And I’m comfortable with the prerequisites you mentioned. I took a look at the commit history and I see that you and Sylvain are working on transforms/augmentation. Do you need more help in that area, or is there something else that you’re looking for someone to take on?

lesscomfortable · July 22, 2018, 2:40am

Same here, except I’m not too comfortable with OO and functional programming in Python (I know the basics but haven’t spent many hours coding with them). However, I see this as a great opportunity to learn. Please let me know where I can help!

maw501 · July 23, 2018, 8:27pm

Hello @jeremy,

Sounds great…any idea on a ballpark completion date?

Context: it will take me perhaps ~200 hours (~10 weeks) to get up to speed, given I haven’t done course 2 yet and am not familiar with the fastai internals. But after that, if you aren’t done, I’ll have a lot more time.

Thanks,

Mark

jeremy · July 24, 2018, 1:27am

Plan is to try to release by mid October. @wdhorton @lesscomfortable start reading through the notebooks from 00, 00b, etc onwards, and see if you can get them working and you understand what’s going on. Then in a few days we should be ready to get some help!

nickl · July 24, 2018, 3:43am

This is interesting.

Can I suggest thinking carefully and defining the goals, in particular the relationship with PyTorch?

I’ve gone reasonably deep into the library (have done one PR, and another one currently being reviewed for the ULMFiT stuff), and I’ve been confused about why some things are in PyTorch and some are in Fastai. I’m sure I’m not the only one.

From the FastAI course point of view I found it pretty challenging. I understand the FastAI library pretty well, but not PyTorch, and when I want to do something that is outside the library I don’t know how to do it.

I didn’t find the same issue with the 2017 course with Keras. I could jump straight into Keras and do what I needed to do.

GeeferUK · July 24, 2018, 4:24pm

Just some feedback regarding the Fastai Abbreviation Guide (abbr.md).

Is using a trailing underscore to denote internal properties and methods not potentially a source of confusion? There is a fairly common Python convention that a leading underscore is used for this purpose, and in addition PyTorch uses a trailing underscore to indicate an ‘in-place’ operation. Since fastai is built on top of PyTorch, there could be an argument for not ‘overloading’ the convention with a different purpose in fastai.
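
To make the two conventions concrete, here is a minimal sketch in plain Python. The `Vector` class and its method names are hypothetical, made up purely for illustration; they are not part of fastai or PyTorch. The only thing taken from PyTorch is the naming pattern, where a trailing underscore (as in `Tensor.add_`) marks an in-place operation.

```python
class Vector:
    """Hypothetical class, used only to illustrate the two naming conventions."""

    def __init__(self, values):
        # Leading underscore: common Python signal for "internal, not public API".
        self._values = list(values)

    def scaled(self, k):
        # No trailing underscore: returns a new object and leaves self unchanged.
        return Vector(v * k for v in self._values)

    def scale_(self, k):
        # Trailing underscore (PyTorch-style naming): modifies self in place.
        self._values = [v * k for v in self._values]
        return self

    def tolist(self):
        # Public accessor that returns a copy of the internal data.
        return list(self._values)


v = Vector([1, 2, 3])
w = v.scaled(2)               # new object; v is untouched
before = v.tolist()
v.scale_(10)                  # v itself is changed
after = v.tolist()
```

If a trailing underscore also meant "internal", a reader could not tell from the name alone whether `scale_` is private or simply mutating, which is the ambiguity raised above.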

Clearly not the end of the world but I thought I would raise it at this early stage.

jeremy · July 24, 2018, 6:11pm

Yes we’ll be using leading underscore now.

jeremy · July 24, 2018, 6:12pm

Yup we’re working on that explicitly. Check out the fastai_v1/001a notebook, where we show exactly what’s in torch.nn and why. Then the next notebooks will gradually introduce the additions made by fastai and show why.

nickl · July 24, 2018, 11:33pm

This is perfect.

karanchahal · July 25, 2018, 5:14pm

I’d love to help out. I am familiar with OO concepts and writing clean, testable code.
I’m not too familiar with the fastai library however.

Can we start by outlining a basic roadmap of features to deliver?
We can divide work that way and hopefully get more stuff done. We can use the Projects feature of GitHub to write issues, assign them, track them, and so on. I have used them in my personal projects and they’re great for collaborating with teams. Your personal “Jira”. This kanban board would be great for newcomers to join in.

Also, we can use the GitHub wiki pages to update content/documentation.

Maybe this is an example of how we could go about it?

Can we have a list of issues on the repo, so we can start work on something and start contributing?
Sorry for the long post. Building well-tested, production-grade stuff really excites me.

MicPie · July 25, 2018, 7:43pm

This looks amazing for learning the fastai library with PyTorch! Thank you!

I was going through it and made some annotations for myself and shared it on GitHub. Maybe this is also interesting for others.

Best regards
Michael

jeremy · July 25, 2018, 8:00pm

I’m afraid not - we’re doing research as we go and the development path is highly dependent on that. So this won’t look at all like the usual enterprise software development project you’re used to. We’ll post here when there are specific tasks we need help with.

jeremy · July 25, 2018, 8:01pm

That’s a great idea. You might find it even more helpful if you used markdown cells instead of comments for much of your notes, so you can add formatting, links, etc.

jsa169 · July 26, 2018, 3:31am

I’m very excited about this initiative and I happen to have a lot of time on my hands. I’m currently part-time at my day job as a software engineer, which I did just to do a deep dive into deep learning (and I totally do not regret it!). I’m on part 2 lesson 11 right now, so I’m pretty far along and I’ve dug quite a bit into the existing code base.

What I’m probably most useful for: I’ve been in the business for a decade as a software engineer on big (enterprisey) code bases, and I’m big on making code readable, well tested, correct, and easy to reason about. So if you have some research code that you want to turn into something solid, I can help. I’ll try to keep tabs on this forum page, but ping me with a direct message if I don’t respond in a timely fashion.

xyz · July 29, 2018, 6:46am

Hi,
since fastai is doing some plotting, I just wanted to mention a library I really love right now:
holoviews.org, which is the main part of the pyviz.org project (SciPy tutorial).
I think that would simplify some things (one could change the style of a visualization after creation of an object) and allow cool stuff (live updating curves during training for example). Just some points about it:

built on top of matplotlib and bokeh (plotly is experimental)
unifying the plotting language, it’s relatively easy to change backends
easier interactivity
rather describe your data than your plot

So I just wanted to mention it and maybe get some opinions.
Thanks

wdhorton · July 29, 2018, 2:55pm

@sgugger what are the specs of the machine you used for the Cifar10-comparison-pipelines notebook? I’m getting slower times on my home rig (30 minutes for the “Standard DawnBench result with one GPU” run vs. the 22min47sec it says in the notebook), trying to figure out what might be the reason.

sgugger · July 29, 2018, 3:25pm

All times reported are on a p3 instance.

wdhorton · July 29, 2018, 11:49pm

Makes me feel better. I’m not lucky enough to have a Tesla V100 at home.