For those who run their own AI box, or want to

Hi.

I have an old GTX 1070 that I used for a year or so to mine crypto back in 2017. But what I always wanted was to train at least one machine learning model :grinning:

At the beginning of 2019 I followed three or four fastai lessons, but it was impossible for me to set up the library on Windows. I don’t think WSL2 existed yet, so I simply switched to Colab.

Now, while watching lesson 2, I followed the instructions and the installation seemed to work.

  • Although I already had a conda installation, mamba created another (base) environment:
    $ mamba env list
    [screenshot]

  • Listing the installed packages with $ mamba list, I can see fastai 2.6.3 and pytorch 1.11.0.

  • I can see the GPU:
    $ nvidia-smi
    [screenshot]

  • If I call python in the terminal, I can import torch and fastai.

  • But then, when running jupyter notebook, I cannot import torch or fastai
    ModuleNotFoundError: No module named 'torch'
    ModuleNotFoundError: No module named 'fastai'

  • The terminal looks like this: [screenshot]

I know I can continue with Colab or Kaggle or any other online option, but I’m asking for help because it seems that I’m close to finally training a model locally, am I?

Thanks a lot

1 Like

Have you activated the environment in which you installed fastai?

I.e. if you installed fastai in an environment called fastai2022, you should activate it with conda activate fastai2022. Then you can call the python interpreter and test it.
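
For example, assuming the env is named fastai2022 as above, a quick sanity check looks like this:

$ conda activate fastai2022
$ python -c "import torch, fastai; print(torch.__version__, fastai.__version__)"

If both imports succeed there, the environment itself is fine.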

If you installed fastai in the base env, that’s generally a bad idea. Please create another environment for fastai. If that still doesn’t work, remove anaconda/miniconda altogether, then reinstall it, and leave the base env untouched.

You are :wink:

2 Likes

[not for complete beginners]
I want to mention another, much less known (compared to Docker) project,
which simplified my life when I wanted the benefits of an isolated setup like Docker's while keeping things more flexible: GitHub - apptainer/singularity: Singularity has been EOL'ed, see Apptainer. It was recently rebranded to Apptainer (Introduction to Apptainer — Apptainer User Guide main documentation), but I started using it while it was still “Singularity”.

There have been posts about it before (Fastai2 in Singularity Container), but it has also evolved a lot since 2020, so I think it's worth mentioning again.

It was born as an HPC container solution and will take a couple of days of learning curve (if you're already familiar with Docker), but it has a few nice properties which I found useful for playing with DL setups, including fastai:

  • Truly a single container file (if you want it to be one file, of course), which makes copying and organizing different setups easier for me.
  • Better support for GPUs and other ASICs (it sits “closer” to PCI devices than Docker).
  • You can work inside a container as if inside a VM (interactive changes via persistent overlays) and/or use definition files (it can import Docker containers as well).
  • Neat HPC features, MPI support, etc.
  • You can actually encrypt your container.

I find it convenient when I'm still in experimenting mode (i.e. not ready to finalize a full definition file) but want to move containers easily between boxes or cloud instances, or roll back quickly if an environment gets messed up.
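
A minimal sketch of that workflow (the fastdotai/fastai image name here is just an example; any Docker image works):

$ apptainer pull fastai.sif docker://fastdotai/fastai:latest   # import a Docker image as a single .sif file
$ apptainer shell --nv fastai.sif                              # --nv exposes the host NVIDIA driver/GPUs

From there you can work interactively, and the whole setup stays one file you can copy between machines.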

3 Likes

That’s a fine option - but I don’t agree that installing in base env is a bad idea. In fact, I always install everything in the base env!

4 Likes

Very interesting - I’ve never heard of this before. Might you consider writing a post about what this is, and why it’s interesting? I’m sure many people would find that helpful (I know I would!)

5 Likes

Hello, I am trying to set up FastAI on my Linux workstation using the fastchan channel. Installation went smoothly but something went wrong when I tried to import fastai in Jupyter notebook.

As soon as I import fastai using from fastai.vision.all import *, my notebook kernel dies. I tried importing fastai via ipython and I received a Segmentation fault (core dumped) error.

nvidia-smi worked okay, so I do not think there’s a problem with CUDA.

Do you know what I can do to fix this?

Thank you in advance.

It’s a bit tricky to debug without additional information. If you are in a hurry, just install docker and the nvidia container toolkit, and then install fastai inside an NGC container image (using the same installation instructions; I suggest miniconda and mamba).
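
Roughly like this (the image tag is just an example; pick a current one from the NGC catalog):

$ docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:22.04-py3

Then, inside the container, follow the usual conda/mamba installation steps.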

1 Like

Hi @mdmanurung , I ended up doing something similar to what @balnazzar has recommended, except I used a container published by Paperspace. This container comes preinstalled with pytorch/fastai/fastbook, but the fastai version is 2.6.0, so I upgraded it to ‘latest’, which currently is 2.6.3.

You’ll need to install:

  1. docker
  2. nvidia container toolkit
  3. paperspace container
  4. And you’ll need to play around with the configuration a bit to map your fastbook folder inside your container (in my setup, /devel/fastbook, which I git cloned from its repo, is mapped as /notebooks inside the container; see the sketch below).

This can be a faster way if you're OK with docker containers. I'm no expert, but I learned enough to get one going, because it saves me the trouble of trying to install everything directly on my Linux machine, which I always had issues with.
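
A rough sketch of the run command (the image name and tag are placeholders; substitute whichever Paperspace image you pulled):

$ docker run --gpus all -it --rm -p 8888:8888 \
    -v /devel/fastbook:/notebooks \
    <paperspace-image>:<tag>

The -v flag is what maps the host fastbook clone to /notebooks inside the container.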

P.S. I downloaded their April 25 rc2 image, but they have newer images available (which I'm assuming have the latest version of fastai etc.).

1 Like

Thank you for your suggestion. For future reference, what additional information would be helpful to include? I am a beginner so I am not sure what to provide other than what I wrote.

This is not a beginner question - setting up on your own workstation is an advanced topic. So I’ve moved this to the thread set up for this advanced discussion.

2 Likes

Definitely!

But you are definitely not a beginner :smiley:

I was giving that piece of advice in order to stay in line with conda’s docs:

When you begin using conda, you already have a default environment named base. You don’t want to put programs into your base environment, though. Create separate environments to keep your programs isolated from each other.

I don't know exactly why they recommend that, but one reason could be that base's /bin is always in the search path (so that you can call conda and other basic stuff from any env); if one installs other programs into base, it gets hard to tell which one is being called (tip for beginners: use which).
Another issue could be that different pieces of software bring in different versions of the same package, etc.
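
For instance (the paths are illustrative):

$ which python        # no env active: ~/miniconda3/bin/python, i.e. base
$ conda activate fastai2022
$ which python        # now ~/miniconda3/envs/fastai2022/bin/python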

2 Likes

Interesting stuff. This goes onto my list. Thanks for sharing!

I tried to do some experimentation with Shark, and it’s still rough. It requires torch-mlir, which in turn requires a nightly build of torch, currently at version 1.12. However, it also requires functorch, which requires PyTorch 1.11. The in-development version of functorch (0.2.0) will be aligned with version 1.12 of PyTorch, but using the main branch of functorch did not work for me.

I’m sure we could find a previous version of torch-mlir that works with PyTorch 1.11, and then Shark should work fine. However, model training requires some adaptation, and model inference does too. I have no idea if those adaptations could be automated for all the models in the fastai library, but this chart makes me doubt it. Also, note there are still no mentions of Metal compatibility in the shark-metal column.

My goal was to set up an environment on my M1 laptop for fastai learning and experimentation, which would be very convenient. But this still looks like something that would require a lot of effort and manual tinkering, completely defeating the purpose. Fortunately I have a Linux box that I can access from anywhere, so that's what I'll keep doing until things mature a bit.

I have a hunch that PyTorch support could be announced during WWDC in June. If so, it would be amazing if it supported both the GPU and the Neural Engine. In my experience, the Neural Engine is much faster than the GPU, at least for inference.

1 Like

Thanks a lot for your help.
I did what you said.

  • Uninstalled/reinstalled miniconda.

  • Installed mamba.

  • Created a new environment:

  • $ mamba create -n fastaiv3 python=3.9.10

  • Activated the environment.

  • $ mamba install -c fastchan fastai

  • $ mamba install -c fastchan nbdev

  • In the terminal I can import torch and fastai (now it shows Python 3.9.12).

  • But from the Jupyter notebook, the same failure, even though I checked that I'm in the same environment as in the terminal.

Isn’t that weird?

Maybe I should go with Docker. :person_shrugging:t2:

1 Like

Just out of curiosity, what does your sys.path look like in Jupyter? And since it seems you'll probably be throwing this install away anyway, are you able to do a “!pip install torch” from inside the Jupyter notebook that cannot find your installed modules?
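
To check, you could run something like this in a notebook cell:

import sys
print(sys.executable)   # which Python binary the kernel is using
print(sys.path)         # where it looks for modules

If sys.executable points outside your conda env, the notebook is running a different Python than your terminal.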

1 Like

Hi, thank you.

Here's what you proposed: [screenshot]

1 Like

How did you launch your Jupyter server? Did you activate your environment before doing so?
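
I.e., something like this (using the env name from your earlier post):

$ conda activate fastaiv3
$ jupyter notebook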

1 Like

That’s the important thing. You are almost there.

Activate the environment in which you can successfully import torch and fastai. Then type:

which jupyter

and report the result back (it has to be the jupyter from that specific environment, otherwise it won’t see torch/fastai).
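
Assuming the env is called fastaiv3, the output should look something like this (path illustrative):

$ which jupyter
/home/<user>/miniconda3/envs/fastaiv3/bin/jupyter

If it points at base or a system path instead, install jupyter into the env (e.g. mamba install jupyter with the env active) and relaunch.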

1 Like

I find it odd that your sys.path inside Jupyter doesn’t seem to know anything about your conda env.

If you look at the PATH variable inside your WSL2 install, you may have a situation where you're hitting the Python 3.8 install before you ever get to the conda/Python 3.9 env.

If that's the case, switching those two around might help (or just get rid of any Jupyter install in your base WSL2). It seems you have Python 3.8 installed in WSL2, and then in the conda env you have 3.9 installed.

What I find weird is that conda is supposed to isolate you from your local env, so when you activate your conda env it should show you the Python 3.9 version, but it doesn't.
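
One way to check the ordering (output will vary):

$ echo $PATH | tr ':' '\n'   # the env's bin dir (.../envs/fastaiv3/bin) should appear before /usr/bin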

1 Like