Problem / inconsistency with fastchan

Hi there,

Let me first say how excited I am to be here; I hope I can make some sensible contributions at some point. For now, I want to point out an annoying issue, in the hope that a maintainer of the fastchan conda channel can do something about it.
The issue: by default, installing pytorch (or fastai) via fastchan gives me the CPU-only build of pytorch, while the pytorch channel gives me the CUDA build, as expected.
Now for the long version:

So, here is the situation: on my laptop I have installed and use the NVIDIA driver on Ubuntu 20.04 (focal); the same thing happens on my other machine running 22.04 (jammy). Now, when I install fastai as recommended, namely using

mamba install -c fastchan fastai

I unfortunately get the CPU-only build of pytorch. Sure, I can remedy this by explicitly specifying the CUDA build I want after looking at the files in the fastchan repo, but that should not be necessary. And it isn't necessary with the pytorch channel, which installs the appropriate build. To narrow this down, please compare the following two commands:

mamba install -c fastchan pytorch

which wants to do the following:

  Package               Version  Build                Channel                  Size

  + cffi                 1.15.1  py310h74dc2b5_0      pkgs/main/linux-64      416kB
  + intel-openmp       2022.1.0  h9e868ea_3769        pkgs/main/linux-64        5MB
  + libblas               3.9.0  14_linux64_openblas  fastchan/linux-64        13kB
  + libcblas              3.9.0  14_linux64_openblas  fastchan/linux-64        13kB
  + libgfortran-ng       12.2.0  h69a702a_18          fastchan/linux-64        22kB
  + libgfortran5         12.2.0  h337968e_18          fastchan/linux-64         2MB
  + liblapack             3.9.0  14_linux64_openblas  fastchan/linux-64        13kB
  + libopenblas          0.3.20  h043d6bf_1           pkgs/main/linux-64        6MB
  + libprotobuf          3.20.1  h4ff587b_0           pkgs/main/linux-64     Cached
  + mkl                2022.1.0  hc2b9512_224         pkgs/main/linux-64      136MB
  + ninja                1.11.0  h924138e_0           fastchan/linux-64         3MB
  + numpy                1.23.4  py310h53a5b5f_0      fastchan/linux-64         7MB
  + pycparser              2.21  pyhd8ed1ab_0         fastchan/noarch         103kB
  + python_abi             3.10  2_cp310              fastchan/linux-64         4kB
  + pytorch              1.12.1  cpu_py310h75c9ab6_0  fastchan/linux-64        65MB
  + sleef                 3.5.1  h9b69904_2           fastchan/linux-64         2MB
  + typing_extensions     4.4.0  pyha770c72_0         fastchan/noarch          30kB


  Install: 17 packages

  Total download: 226MB
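Incidentally, the explicit pin mentioned above looks roughly like the sketch below; the 1.12.1 version and the `*cuda*` build-string glob are assumptions based on the listing, so check the files on fastchan first and adjust accordingly:

```shell
# Hedged sketch: pin pytorch to a CUDA build via a conda build-string
# glob. The version and the glob are assumptions, not verified against
# what fastchan actually ships.
spec='pytorch=1.12.1=*cuda*'
echo "mamba install -c fastchan \"$spec\""
```

conda/mamba match specifications have the form name=version=build, and the build field accepts glob patterns, which is why a pattern like *cuda* can select a CUDA build when the channel provides one.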

On the other hand, if I say

mamba install -c pytorch pytorch

I get what I would like to get:

  Package               Version  Build                         Channel                  Size

  + blas                    1.0  mkl                           pkgs/main/linux-64        6kB
  + cudatoolkit          11.3.1  h2bc3f7f_2                    pkgs/main/linux-64     Cached
  + intel-openmp       2022.1.0  h9e868ea_3769                 pkgs/main/linux-64        5MB
  + mkl                2022.1.0  hc2b9512_224                  pkgs/main/linux-64      136MB
  + pytorch              1.12.1  py3.10_cuda11.3_cudnn8.3.2_0  pytorch/linux-64          1GB
  + pytorch-mutex           1.0  cuda                          pytorch/noarch         Cached
  + typing_extensions     4.3.0  py310h06a4308_0               pkgs/main/linux-64       43kB


  Install: 7 packages

  Total download: 1GB
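The tell-tale difference between the two transactions is the pytorch build string: fastchan resolves cpu_py310h75c9ab6_0, while the pytorch channel resolves py3.10_cuda11.3_cudnn8.3.2_0. A small sketch of telling the two apart mechanically, with the build strings hardcoded from the tables above:

```shell
# Classify the two pytorch builds from the transactions above by
# whether the conda build string mentions cuda or cpu.
for build in "cpu_py310h75c9ab6_0" "py3.10_cuda11.3_cudnn8.3.2_0"; do
  case "$build" in
    *cuda*) echo "$build -> CUDA build" ;;
    *cpu*)  echo "$build -> CPU-only build" ;;
    *)      echo "$build -> unknown" ;;
  esac
done
```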

So, maybe a fastchan maintainer could have a look at this issue; then again, maybe it isn't one in the first place, because there is some intention behind it. Personally, I believe that people with a GPU will want to see it used for model training, so that should be the default behavior.

Thanks to everyone who made it here and who is willing to think about this!


Generally I install PyTorch separately, using the current instructions on the PyTorch site, to ensure I get the latest version that matches my card.

(Sometimes I do that before the fastai install, sometimes after, if there are problems.)

It would be nice if it worked reliably via mamba and fastchan. (mamba used to fail at accepting the CUDA license on my machine, whereas conda was fine.)

But it is also nice to know that it is easy to work around when it doesn't, by installing PyTorch separately.

Hi Allen,

thanks for the reply; that would have been my way of doing things as well, now that I know the pytorch channel does what I expect and want.

I just thought I'd mention this issue, since it somehow defies logic to install the CPU-only build when a CUDA build could be installed on the system. Using the pytorch channel installs the CUDA build (so it is apparently possible to check for it), while fastchan doesn't.


Hello Reinhold

I have just realized that I get different results depending on whether I run conda or mamba.

Here is the detail:
Starting from a completely new environment created with mamba create -n fastai_v2 python=3.9

If I run conda install -c fastchan fastai

here are some of the packages to be installed:

  cudatoolkit        fastchan/linux-64::cudatoolkit-11.6.0-hecad31d_10 
  fastai             fastchan/noarch::fastai-2.7.10-py_0 
  pytorch            fastchan/linux-64::pytorch-1.12.1-py3.9_cuda11.6_cudnn8.3.2_0 

But if I run mamba install -c fastchan fastai,
here are some of the packages to be installed:

  + fastai                    2.7.10  py_0                fastchan/noarch        Cached
  + pytorch                   1.13.1  py3.9_cpu_0         fastchan/linux-64      Cached

Long story short: conda installs the GPU build and mamba the CPU build.

And I have no idea why.
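One quick way to see which build each tool resolved is to look at the build column of the transaction. A sketch with the two lines above hardcoded (in practice you would pipe mamba list pytorch into the same awk):

```shell
# Flag CPU-only vs CUDA pytorch builds from mamba-list-style lines.
# The two sample lines are hardcoded from the conda/mamba runs above.
result=$(printf '%s\n' \
  "pytorch 1.12.1 py3.9_cuda11.6_cudnn8.3.2_0 fastchan" \
  "pytorch 1.13.1 py3.9_cpu_0 fastchan" |
  awk '{ tag = ($3 ~ /cuda/) ? "CUDA" : "CPU-only"; print $2, "->", tag }')
echo "$result"
```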

It is a different case, because you compared the pytorch and fastchan channels, but maybe it could help in understanding the issue.

I restarted my setup by following Live Coding session 1.
Installing Mambaforge solved everything for me.
