I’ve jotted down a draft of a fastai style guide. I’ve endeavored to make it somewhat educational and interesting, not just a bland list of requirements, so hopefully you like it!
I’d love to hear any feedback you have. (Feedback on the guide, that is, not on the particular coding style choices I’ve made - it’s fine for you to have different coding style preferences, but that’s not a particularly constructive thing to tell me about.)
Also, if you’re aware of any good research papers looking into the impacts of different coding styles, I’d love to hear about them.
Please don’t link to this on social media just yet - I’d like to iterate a bit based on your feedback first…
Thank you Jeremy. My feedback is related to Symbol naming. I was rereading fastai code after a pause we had between part1 and part2 and I found it quite difficult to recall what shortcuts mean. When you work for some time with lectures and code everything is fine. But when you get back to code after a pause, it might be a problem.
A glossary for frequently used abbreviations would be very helpful.
Great read, it’s definitely thought-provoking! It broadened my horizons, and I’m finding it useful, especially the use of abbreviations for domain-specific terms, which by now I agree with. I personally still disagree with some of the readability suggestions. I guess the reason is that when there is a standard out there, and all the code that many of us read most of the day is formatted according to that standard, our eyes and minds become accustomed to it and are able to ignore the grammatical noise and infer meaning at a glance.
I’m currently thinking the gold is in a balance of both approaches - I find your style to be a bit on the extreme side, whilst also realizing my own overly expressive coding standards (influenced heavily by Robert “Uncle Bob” C. Martin) have a lot to gain from it.
Thanks for the link to the paper.
My usual struggle with notation style is mostly my own lack of fluent domain knowledge, to instantly expand abbreviations in-head.
Great initiative with the documentation for non-obvious abbreviations.
I hear you @sermakarevich , happens to me every time I get back from being ‘away’ for some days.
Yes this is suggested in the style guide
Understood. However I’m not really looking for feedback on the style choices I made (I’m not going to be changing them now, unless there’s some very compelling research that pops up that clearly shows I’ve made a significant error based on my project goals); I’m looking for feedback on how I’ve written the guide, eg:
- Have I forgotten to document something?
- Is any of the explanation unclear?
- Are there places that don’t currently have a code example that need one?
Sorry, I realize now my original post would have been more clear with this additional information…
I just wanted to say that this style guide is simply amazing. It speaks to the reader so kindly and one can learn so much from it. Planning to read the referenced materials over the coming days.
The fastai codebase and Jeremy’s commentary on why some things are the way they are have definitely been an eye opener to me. This is a completely new and fascinating world that someone brought up on PEP8 and its equivalents would never imagine existed. Especially as such voices are hardly heard in the programming mainstream and the style guides are rarely questioned in any reasonable way.
Yesterday I started watching a video I came across through a reply to Jeremy’s tweet:
on the design thinking that went into building a compiler and on the synergies between the design and the style it was coded in. Watching the gentleman share his thoughts and present the code in notepad was amazing.
Here is the video if anyone might be interested
(BTW I do not know much about compilers, but still found the discussion fascinating and educational)
Now I see. Sorry Jeremy, I read your message the wrong way. I had no intent to give non-constructive feedback.
I never thought that for a second
Oh yes, sorry, I’ve apparently missed this item:
If you find the abbreviations in a module non-obvious, feel free to add a list of them to the module’s markdown file in this docs folder (create one if needed) to help you (and the next coder) get oriented.
When submitting a PR on a notebook, don’t re-run the whole thing such that the diff ends up with changes for every bit of meta-data.
Could you expand this section a little bit? Maybe even give it a header like “Submitting diffs for Jupyter Notebooks”. How to put Jupyter Notebooks into git has always puzzled me a bit.
I think the best way is to back up the notebook, debug it until you get it working, then copy and paste the code bits you’ve changed into the backup and make the diff on that, ignoring all the variables that might have been changed through debugging.
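For anyone who wants to automate part of this, here is a minimal sketch of a different approach: stripping the volatile bits (outputs, execution counts, per-cell metadata) from the notebook JSON before diffing. It assumes the notebook uses the version 4 format, where these live under each cell's outputs, execution_count and metadata keys. The function names strip_notebook and strip_notebook_file are made up for illustration; this is not something the style guide prescribes.

```python
import json


def strip_notebook(nb):
    """Return a copy of a notebook dict without outputs, execution counts,
    or per-cell metadata, so that only source changes show up in a diff."""
    cleaned = json.loads(json.dumps(nb))  # cheap deep copy of JSON-compatible data
    for cell in cleaned.get("cells", []):
        cell["metadata"] = {}
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


def strip_notebook_file(src_path, dst_path):
    """Read a notebook from src_path and write the stripped copy to dst_path."""
    with open(src_path, encoding="utf-8") as f:
        nb = json.load(f)
    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(strip_notebook(nb), f, indent=1, sort_keys=True)
        f.write("\n")
```

Running strip_notebook_file on both the old and the new version and diffing the two results should leave only the cells whose source actually changed.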
Don’t use an automatic linter. No automatic tool can lay out your code with the care and domain understanding that you can. And it’ll break all the care and domain understanding that previous contributors have used in that file!
I think you mean autoformatters instead of linters. Linters will annoy you with warnings but will not change the code layout, unless you change it yourself to suppress these warnings. Autoformatters will change the layout of the code. Confusingly, in Python, PEP 8 refers to the standard library formatting style guide, pep8 is a linter package, and autopep8 is a formatter.
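To make the distinction concrete, here is a toy sketch (not how pep8 or autopep8 are actually implemented, and both function names are invented for this example): the linter only reports problems and returns the source untouched, while the formatter returns rewritten source.

```python
def lint(source, max_len=79):
    """Toy linter: report problems, never modify the source."""
    warnings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if len(line) > max_len:
            warnings.append(f"line {lineno}: longer than {max_len} characters")
        if line != line.rstrip():
            warnings.append(f"line {lineno}: trailing whitespace")
    return warnings


def autoformat(source):
    """Toy formatter: rewrite the source (here it only strips trailing whitespace)."""
    return "\n".join(line.rstrip() for line in source.splitlines()) + "\n"
```

The second kind of tool is the one that can undo a hand-crafted layout, which is what the guide's warning is really about.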
Thank you Jeremy for preparing this guide. It is really insightful and useful to better understand the code.
Great guide. Not what my teachers in college would recommend, but it makes more sense and gives the coder less of a headache. Also, AFAIR Google’s Python style guide prescribes 4 spaces too. Could you link to the Google style guide you are referring to? Seconding automatic formatter instead of automatic linter.
Thank you for the correction!
You’re right - the public guide does. Apparently the internal guide is 2 spaces https://developers.google.com/edu/python/introduction?csw=1
I have no research that I know of, but there is a “motif” that you want your whole code base to be agnostic to the programmer, in other words you are unable to tell who coded which part… though for that part I have a paper that says the coding style of specific programmers survives a lot… https://blog.acolyer.org/2018/03/16/when-coding-style-survives-compilation-de-anonymizing-programmers-from-executable-binaries/ (I wonder if we can trace back who is the real satoshi :P).
Thank you very much for the style guide. It definitely changes the way I think about coding for the rest of my life.
There is one thing not in the style guide yet that I hope you will address, which is the import strategy. I see that fastai exposes all the common libraries in the imports.py file, so we can use from .imports import * to easily import everything. It is a great approach for building a Jupyter notebook, since autocompletion can suggest all the common symbols for us. However, some of the fastai modules also import everything that way, such as model.py or metrics.py. This import strategy introduces unnecessary dependencies for fastai modules. When one wants to use fastai as a library, one has to install all the packages that imports.py depends on (which is basically everything), even if the modules being used do not need these packages. For example, when I do structural prediction, I probably do not want all the image or text packages to be installed. The imports.py file also configures matplotlib, Jupyter, and other packages. This can be trouble when deploying a fast.ai model. When deploying a model, we probably do not need Jupyter or matplotlib, but we still have to install them just to make the imports.py file run.
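One possible way to loosen this, sketched below under my own assumptions (this is not how fastai is organized, and optional_import, plot_losses and mean_loss are hypothetical names), is for a library module to import heavy packages only inside the functions that need them, so that the rest of the module still works in a minimal deployment.

```python
import importlib


def optional_import(name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def plot_losses(losses):
    """Plot a list of losses; matplotlib is looked up only when this is called."""
    plt = optional_import("matplotlib.pyplot")
    if plt is None:
        raise RuntimeError("plotting requires matplotlib; install it to use plot_losses")
    plt.plot(losses)
    return plt.gcf()


def mean_loss(losses):
    """Needs no third-party package, so it works without matplotlib installed."""
    return sum(losses) / len(losses)
```

The notebook-facing imports.py could stay exactly as it is for interactive use; the idea is only that library modules would not have to depend on it.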
This is probably too tiny a point, but I think it helps when imports are arranged in some order. The rule about putting imports on the same line could be extended to also keep them in alphabetical order. (Or not.)
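As a small illustration of what I mean (sort_import_line is a made-up helper, and it only handles the plain one-line import form):

```python
def sort_import_line(line):
    """Sort the module names of a one-line `import a, b, c` statement alphabetically."""
    keyword, _, names = line.strip().partition(" ")
    if keyword != "import" or not names:
        return line  # leave anything else untouched, e.g. `from x import y`
    modules = sorted((m.strip() for m in names.split(",")), key=str.lower)
    return "import " + ", ".join(modules)
```

So a line such as import torch, os, numpy, math would become import math, numpy, os, torch.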