Platform: Salamander ✅

@jannster3000 each notebook has its own hidden terminal session running IPython - selecting an environment in one session won’t change it in the others, they’re totally independent. By default fastai is not installed in the base environment. show_install() is provided by the fastai library, right? I’m not sure why it’s coming up with a different version to conda list. You could take a look at the source code on GitHub to figure out what it’s doing if you fancy [link].

This will give you the fastai version:

import fastai
fastai.show_install(0)

I’m having issues updating fastai to the latest release on a K80 instance. Specifically I’m trying to get the latest version going to use ImageFileList but I can’t seem to get it working.

So far I have:
Run git pull on the course v3 notebooks
Run git pull on the fastai folder
Run conda update fastai with the fastai environment activated

version.py in the fastai folder shows version 1.0.20.dev0 installed, but show_install still returns 1.0.11 as the notebook version.
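
A mismatch like this usually comes down to which copy of the package the notebook kernel actually imports. A minimal sketch for checking that from inside the notebook, assuming nothing about fastai itself (describe_module is a hypothetical helper, not part of fastai or conda):

```python
import importlib
import sys


def describe_module(name):
    """Report which interpreter is running and where `name` was imported from."""
    module = importlib.import_module(name)
    return {
        "python": sys.executable,                         # interpreter behind this kernel
        "location": getattr(module, "__file__", None),    # file the module was loaded from
        "version": getattr(module, "__version__", None),  # None if the module exposes no version
    }


if __name__ == "__main__":
    try:
        print(describe_module("fastai"))
    except ImportError:
        print("fastai is not importable from", sys.executable)
```

If the reported location is not the folder that was updated with git pull or conda, the kernel is probably running in a different environment from the one that was upgraded.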

Also twice today I have had my server crash on me, giving an Invalid response: 503 Service Unavailable error. This has not happened to me before with Salamander. Unsure if there is any relation.

Is there something I’m forgetting to do?

@KarlH wrt the 503 error, it sounds like something funky might be going on with Jupyter Notebook; I’m not sure how the commands you mentioned would break it though. To take a closer look you could try connecting to your server, running tmux attach-session -t jupyter and reading the output; you may also want to try reinstalling Jupyter. FYI, here’s the command Salamander uses to start Jupyter (runs both Jupyter Notebook & Jupyter Lab in the same session):

sudo su ubuntu <<EOF
cd ~
tmux start-server
tmux new-session -d -s jupyter
tmux send-keys -t jupyter 'jupyter lab --NotebookApp.token={{token}}' C-m
EOF

The commands you ran look perfect. I suspect your notebook may have cached an older version of fastai; putting %load_ext autoreload and %autoreload 2 at the top of your notebook should be sufficient, but maybe restart the kernel for good luck?

Update on this. The solution was to run conda install -c fastai fastai instead of conda update fastai.

I’ve been stuck on status “Offline - connecting server to internet” for more than 12 hours and it seems to be chewing through my credits. Have removed my payment information just in case. Any quick fixes? @ashtonsix

@Nick, when servers get stuck you need to let me know & I will manually unstick your server (sorry!) - of course I’ll refund all wrongful charges.

I’ve fixed 90% of the errors related to servers getting stuck & am working on a solution to recover from unknown errors too - quick fixes are tricky as there’s a risk of data loss if you make any mistakes.

I was happily using Salamander last night but today when I start it up and click on the Jupyter link it just does the 60-second countdown then closes the tab. The instance has been up for several minutes and I can SSH in without issue. I’ve restarted the instance a few times but no dice. What’s going on?

EDIT: the problem was that I changed my default shell to fish. Changing back to bash and restarting fixed the issue.

@tamlyn to start Jupyter, the system ssh’s in as the user “salamander”. Perhaps you can change the shell for just the “ubuntu” user?

I haven’t managed to register my AWS coupon on Salamander yet, despite trying many times, at different times of the day, etc. I always see the response shown below.

Is that expected? If so, I am happy to keep trying, but am wondering…

I was trying to install OpenCV in my fastai environment using conda but am getting an error that there is no space left on the device.

I don’t think I have used the 40 GB on my server. Can someone tell me how I can fix this? Also, how can I see the storage left on the server? Thanks!
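
For the second question, one way to see the remaining storage from a notebook cell is Python’s standard shutil.disk_usage. A minimal sketch, assuming the data lives on the filesystem mounted at "/" (free_space_gib is a hypothetical helper name):

```python
import shutil


def free_space_gib(path="/"):
    """Return (total, used, free) in GiB for the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3  # bytes per GiB
    return usage.total / gib, usage.used / gib, usage.free / gib


if __name__ == "__main__":
    total, used, free = free_space_gib("/")
    print(f"total {total:.1f} GiB, used {used:.1f} GiB, free {free:.1f} GiB")
```

Passing a different path (for example the home directory) reports on whichever disk holds that path, which helps if the server has more than one volume.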

I’m having some issues running notebooks overnight. Currently I’m trying to train a language model which takes about 12 hours.

One issue is that when my display turns off, the jupyter notebook stops updating, which poses problems for knowing how well the model is fitting. It also looks like queued notebook cells don’t run after a long training period.

Last night I set the model to train. This morning I saw GPU usage drop from 100% to 0% at around the expected time. I assume this means training is complete (I don’t know for sure since the notebook stopped updating around the 4th epoch, but the GPU has been idle for some time now). However the queued cell for saving the model has not run. Trying to run learn.save again doesn’t do anything. The whole notebook seems stuck and I’m not able to save the model.

Are there tricks for working with jupyter notebooks in this sort of situation?

Hi @ashtonsix I have been facing issues with the platform over the last few weeks - as soon as I click on ‘Jupyter Notebook’, an additional tab with a 60-second timer pops up and closes automatically after 60 seconds. Can you please suggest any quick fixes - I was charged this month and it would be really great if I’m able to use the remaining compute credits.

I am having some difficulty installing fastai and pytorch in the same environment. When I install torch in the fastai env, I get torch cannot load container_abcs. I read this is because it is installed in some other directory or something. When I install fastai in the torch env, functions simply don’t work, as in ‘create_cnn not found’.

I’m not sure about the version mismatches if any. Is there a clean way to have both in an environment?

Apologies for the long delay replying. I was writing a book.

@sandmann @ricknta:

If you’d still like to redeem your AWS coupons on Salamander, I spent the last 3 days rewriting salamander.ai/redeem-aws-coupon. This form has to access AWS via a headless web browser, because AWS doesn’t provide programmatic controls for adding coupons, which makes building reliable software tricky.

The first version didn’t work very well: when AWS automatically logs you out, you may arrive on 1 of 3 slightly different login forms; the first version could only understand one of those variations and got stuck. This new version is much better at logging in, parsing errors, and doing things quickly / reliably; but to test it “all the way through” I need more valid coupons. It should work now, but I cannot 100% confirm. Give it a go, and if it doesn’t work, feel free to email your coupons to me and I’ll add them manually (also, the test will help me make the form more reliable).

@KarlH:

wrt Jupyter Notebook disconnecting: after refreshing the page, the code keeps running & updating variables but you won’t see any visual updates. Details here.
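
Since the kernel keeps running while the page stops updating, one workaround is to write progress to a file on the server and read it back later. A minimal sketch, assuming you can call a function once per epoch from your own training loop (log_progress and read_progress are hypothetical helpers, not part of fastai or Jupyter):

```python
import json
import time


def log_progress(path, epoch, **metrics):
    """Append one JSON line per epoch so progress survives a browser disconnect."""
    record = {"time": time.strftime("%Y-%m-%d %H:%M:%S"), "epoch": epoch, **metrics}
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record


def read_progress(path):
    """Read back every logged record, oldest first."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]
```

The log file can then be checked over SSH or from a fresh notebook, independently of whether the original browser tab is still receiving output.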

I just created a support page on Salamander with answers to common questions like this one btw.

@Soumanta:

I’m going to address the Jupyter quicklaunch reliability soon; in the meantime you can ssh into your server and run jupyter notebook manually. Copy the token and add it to the end of your server’s IP copied from salamander.ai (the one printed in the console is wrong, that’s the server’s private IP).
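
Putting the two pieces together might look like the sketch below. It assumes Jupyter is listening on its default port 8888 over plain http, which may not match your server; jupyter_url is a hypothetical helper and the IP and token in the usage line are placeholders:

```python
from urllib.parse import urlencode


def jupyter_url(public_ip, token, port=8888):
    """Build a Jupyter URL from the server's public IP and the token Jupyter printed.

    Assumption: plain http on port 8888 (Jupyter's default); adjust if yours differs.
    """
    return f"http://{public_ip}:{port}/?{urlencode({'token': token})}"


if __name__ == "__main__":
    # Placeholder values: use the public IP shown on salamander.ai and your own token.
    print(jupyter_url("203.0.113.10", "abc123"))
```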

@bluesky314:

We sorted the disk space issue out over email, right? wrt package installation I’m not totally sure what container_abcs is & haven’t looked at the package versions or environments for a few weeks. If you’d like to set things up differently from the default, you can create a new server without any of the software options selected: that way you’ll have a blank canvas and it may be easier to configure?

Hey,
I’m a media-design student from Germany. I came across this model in a text that mentioned the Salamander server service. I basically want to run this code on a personal dataset. The problem is, as I said, I’m not a computer scientist or a proper coder, and my understanding of this is fairly limited (I’ve had some experience fiddling around with Python models for MIDI generation, but I’ve no formal training in any of this) and I’ve 0 experience with services like Salamander and Jupyter.
I just wanted to ask if this is really doable, expensive (as a student I’ve some credits from AWS, I don’t know how much; I’ve already set up an account on Salamander and forwarded a request to GitHub) or too complex for a novice.
If there is somebody kind enough to help me get my head around this I would really appreciate it; even pointing to documentation or tutorials is helpful!
Thanks

@droeg it’s totally doable! but this thread isn’t the place to get help with training your model. https://course.fast.ai is the best place to get started

Hi @ashtonsix, I wanted to stop my Salamander server running but it has been marked as “Offline - moving storage” for some hours with all the buttons disabled. I could not see a direct help link on the site, don’t know if there is a better way to ask you to look at this for me, thanks. My compute credits are still dropping.

No reply here or to a direct message and my compute credits are still running down so I have removed my credit card, created a new account and transferred my credits to that new account. This seemed to be the only way to stop this constant charging.

Still no response so I deleted my account.