Use `pip install -e .` when developing fastai_v1

jeremy · September 23, 2018, 1:32am

OK, now that we’re not using dev_nb any more, we need to modify modules in fastai directly. Let’s not create symlinks like we did in 0.7; instead we’ll use a proper “editable install”. To do so, simply run (from the fastai_v1 base dir):

pip install -e .

You should remove your old fastai 0.7 first, or create a new environment.

This command installs a link in your site-packages directory to your cloned git repo, so edits to your fastai_v1 code will be usable anywhere in your environment. Let me know if you have any questions or issues.
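
One way to confirm the editable install took effect is to ask Python where it would load the package from. The sketch below is only an illustration: `module_location` is a hypothetical helper name, and it relies on the standard library's `importlib.util.find_spec`. With an editable install, the reported path would be expected to point into your cloned repo rather than at a copied package under site-packages.

```python
import importlib.util
from typing import Optional


def module_location(name: str) -> Optional[str]:
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)  # looks the module up without importing it
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Prints None when fastai is not installed in the current environment.
    print(module_location("fastai"))
```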

TheShadow29 · September 23, 2018, 6:14pm

So if I understand correctly: if I had added the previous fastai repository to my $PYTHONPATH variable, I should remove that, then run `pip install -e .` in the fastai_v1 cloned repository, perhaps in a new environment. This will allow me to do `import fastai` from any directory. Is this correct?

Edit: Just tested it. This is correct. Also if one wishes to use the old fastai in parallel with the new one, it is easy to handle by creating a new environment.
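
To check for a leftover path entry like the one described above, a small script can list the PYTHONPATH directories that still expose a `fastai` directory. This is a minimal sketch under assumptions: `stale_pythonpath_entries` is a hypothetical helper, and it only looks for a directory with the package's name, which is a rough heuristic.

```python
import os
from typing import List, Optional


def stale_pythonpath_entries(package: str, pythonpath: Optional[str] = None) -> List[str]:
    """List PYTHONPATH directories that contain a directory named `package`."""
    if pythonpath is None:
        pythonpath = os.environ.get("PYTHONPATH", "")
    # PYTHONPATH uses the platform path separator (":" on Unix, ";" on Windows).
    entries = [p for p in pythonpath.split(os.pathsep) if p]
    return [p for p in entries if os.path.isdir(os.path.join(p, package))]


if __name__ == "__main__":
    for entry in stale_pythonpath_entries("fastai"):
        print("PYTHONPATH still exposes a fastai directory:", entry)
```

Any entry it prints is a candidate for removal from the variable before relying on the editable install.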

stas · September 24, 2018, 1:59am

in setup.py:

url='https://github.com/fastai/fastai',

should this be fastai_v1?

jeremy · September 24, 2018, 2:56am

I’m not sure. Probably not, but I’m quite undecided. I think we should probably rename the old repo - but we’ll need to update forum threads etc if we do. But if we don’t, then we’ll forever have a crappy repo name…

stas · September 24, 2018, 3:55am

+1 to make fastai/fastai the main repo.

Now is a good time since any forks will need to be re-made anyway.

On the forums you can probably rewrite those links easily, first doing fastai -> fastai_v0, and then fastai_v1 -> fastai.

The only problem is any other links on the internet. Too bad GitHub doesn’t provide .htaccess-like mod_rewrite support.

Let’s do it!

Update: it looks like I was wrong and GitHub does support redirects, but I’m not sure how they will handle redirects in this case, since we are re-using the old name (fastai -> fastai_v0, fastai_v1 -> fastai). It may create a big mess. We shall see.
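
The two-step link rewrite suggested above depends on the order of the passes. Here is a rough sketch of it in Python, assuming the links are plain `github.com/fastai/...` URLs in text; `rewrite_links` and the two pattern names are hypothetical, and `fastai_v0` is just the name proposed in this post.

```python
import re

# Match the old repo URL only when "fastai" is the complete repo name, so that
# "fastai_v1" (or any longer name) is left alone by the first pass.
OLD_REPO = re.compile(r"github\.com/fastai/fastai(?![\w-])")
NEW_REPO = re.compile(r"github\.com/fastai/fastai_v1(?![\w-])")


def rewrite_links(text: str) -> str:
    """Apply the two renames in the proposed order."""
    text = OLD_REPO.sub("github.com/fastai/fastai_v0", text)  # fastai -> fastai_v0
    text = NEW_REPO.sub("github.com/fastai/fastai", text)     # fastai_v1 -> fastai
    return text
```

Running the passes in the opposite order would first turn fastai_v1 links into fastai links and then rewrite all of them to fastai_v0, which is why the old name has to be moved out of the way first.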

jeremy · September 24, 2018, 4:56am

I may have a better idea: a fastai_pytorch repo.

That way we can later have fastai_js, fastai_swift, and so forth!

stas · September 24, 2018, 5:36am

I’m not sure I’m understanding this suggestion, Jeremy. What’s “editable install”?

Are you suggesting that developers re-run the install every time any fastai .py file is modified?
Or are you suggesting editing the .py files inside site-packages? But that’s not under git…

A symlink is the only way we can keep developing and immediately run the updated code, if we are talking about the notebooks under the git repo. Unless of course you mean something else.

jeremy · September 24, 2018, 3:38pm

If you just do what’s shown in the top post, it’ll all work nicely for you.

stas · September 24, 2018, 3:56pm

Thank you, Jeremy.

This won’t work for me as I use branches. So I will stick to symlinks.

jeremy · September 24, 2018, 3:58pm

Not sure what you mean - what problems are you having with editable install with branches?

stas · September 24, 2018, 4:00pm

If you use several versions of the fastai code base, how can you tell python which one to choose using the editable install?

jeremy · September 24, 2018, 4:01pm

If you use the regular approach for switching branches in git, it’ll link up fine.

However, if you have multiple branches in separate directories instead, then yes, you’ll need a symlink or similar.
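
For the separate-directories case, one possible reading of "or similar" is to put the desired checkout at the front of `sys.path` before importing. This is a sketch of that idea, not something prescribed in this thread; `use_checkout` is a hypothetical helper and the path in the comment is a placeholder.

```python
import os
import sys


def use_checkout(repo_dir: str) -> str:
    """Put `repo_dir` first on sys.path so later imports resolve to that checkout."""
    repo_dir = os.path.abspath(repo_dir)
    # Drop any earlier occurrence so the directory ends up first, exactly once.
    sys.path[:] = [p for p in sys.path if os.path.abspath(p) != repo_dir]
    sys.path.insert(0, repo_dir)
    return repo_dir


# Hypothetical usage, before the first `import fastai` in the process:
# use_checkout("/path/to/one/fastai/checkout")
```

This only affects imports made after the call; a module that has already been imported stays cached in `sys.modules` for the rest of the process.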

stas · September 24, 2018, 4:03pm

This.

stas · September 25, 2018, 4:17am

I’ve been sitting on it, and somehow it doesn’t sit well with me.

Are you thinking of having the namespace match the package name? Won’t it be odd to have:

import fastai

but when users go searching for the fastai package, they won’t find one? Is fastai_pytorch the same thing, or perhaps something else?

I’m sure there are plenty of examples where this is the case (package and module names don’t match), but it just doesn’t feel good to me, since as a user of such modules I always find these mismatches cumbersome. Of course, this is just my opinion.

My vote would be to stick with fastai for both the package and the namespace, despite the potentially large number of 404 links that would result from renaming what is now v0.

Too bad internet browsers haven’t deployed waybackmachine.org for 404 links at their core. If they had, we could have just asked archive.org to archive all the current https://github.com/fastai/fastai pages so that users could still find the data after the rename. I use a Firefox plugin that does that for me automatically and it’s super handy (except not all data is archived), so I make a point of telling archive.org to make a snapshot when I find something worth finding in the future.

Just thinking aloud…

jeremy · September 25, 2018, 4:19am

After thinking about it more, I don’t think this is really acceptable. I don’t think it matters whether repo name and package name match. Indeed, to support multiple languages and/or backends it’s inevitable, unless the package names redundantly include the language name, which we certainly wouldn’t want!

stas · September 25, 2018, 4:42am

I don’t completely follow the logic of the last sentence. The way I understood it, you’re proposing:

repo name          package name   module name
---------------------------------------------
fastai_pytorch     fastai         fastai
fastai_tensorflow  fastai         fastai
fastai_cntk        fastai         fastai

If yes, I can’t see how that would work.

Unless you meant something else in the last sentence. For me, visual examples are always better than words.
And perhaps let’s stick to just the Python front-end for now, leaving other languages for later.

jeremy · September 25, 2018, 4:44am

That’s true. I was thinking of fastai_swift and fastai_js. I’m not quite sure how different backends within the same language will work, or if they would have different repos. I’d guess they’ll share a repo, because I’d hope they’d share a lot of code. We’ll see…

stas · September 25, 2018, 4:47am

If the foundation of the current fastai incarnation is torch, how do you imagine using tensorflow in parallel, for example - unless you add an abstraction layer, which would probably make things very complex internally, no?

p.s. And please let me know at any point if you feel this discussion is not very useful, I’m happy to stop the brainstorming at any time. I’m also hoping that other developers would share their insights.

jeremy · September 25, 2018, 5:39am

I’m definitely not going to break links to the previous material. So solutions within that constraint are welcome. I think fastai_pytorch is fine, but I’m not wedded to it.

jeremy · September 25, 2018, 5:44am

I was rather hoping to steal or port pytorch’s DataLoader, and that most if not all of the tensor ops will work equally well with either backend (they are both numpy compatible slicing and operations), which would cover quite a bit of stuff already. But I don’t think we’ll know whether that will work in practice until if/when we try it.
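
To make the slicing point concrete, here is a tiny batching helper written only against `len()` and slice indexing. It is an illustration of backend-agnostic code, not pytorch’s DataLoader and not anything from fastai; `batches` is a hypothetical name, and plain lists stand in for tensors to keep the example dependency-free.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def batches(data: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `data`, relying only on len() and slicing."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(data), batch_size):
        # The final slice may be shorter than batch_size.
        yield data[start:start + batch_size]


if __name__ == "__main__":
    for batch in batches(list(range(5)), 2):
        print(batch)
```

In principle the same function should accept any container that supports `len()` and numpy-style slicing along its first dimension, but as the post says, whether that holds up across real backends is something only trying it would show.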