Timisoara study group - fast.ai Live

Totally worth setting up GCP!
Advantages:

  • training can run about 10x faster (on a P100 VM at least)
  • after the initial setup everything works out of the box (in Colab you have to rerun the whole package setup every time).

Also, the $300 in credits should go a long way (more than enough to finish this course, and maybe enough to do the advanced one too).

I expect problems in future lessons on colab, but I do intend to try and run at least the basic notebooks in colab too.


BTW, there’s an official Colab notebook for lesson 1 too (see the “Lesson 1 notebook for Google Colab” entry in https://forums.fast.ai/t/lesson-1-official-resources-and-updates/27936 ).

It does seem slow, and for resnet34 I had to set bs=12 instead of the bs=16 they used, because I got a ‘bus error’ (i.e. not enough GPU memory). For resnet50 they set bs=10, which works.

It feels to me that using resnet is a cheap trick. Your training is not any smarter or faster; you are just building upon a huge pre-trained NN, which is awe-inspiring but which you have no way to control.

The point of using a big pretrained net is that you get to a very good baseline really fast. That is the state-of-the-art, recommended approach for most vision problems today. From there it’s your choice whether you dig deeper to understand what’s going on and make your models better. OTOH you can use the powerful model you’ve already trained for your problem to train smaller models much better (this knowledge distillation technique should be covered in future lessons, but you can start looking into it from this paper: https://arxiv.org/abs/1503.02531 ).
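To make the distillation idea a bit more concrete, here is a minimal pure-Python sketch of the loss for a single example: the student is trained to match the teacher's temperature-softened outputs as well as the true label. The function names and the `temperature` and `alpha` values are my own assumptions for illustration, not something taken from fastai or from the course.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with a temperature; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs):
    """Cross-entropy between a target distribution and a predicted one."""
    return -sum(t * math.log(p) for t, p in zip(target_probs, predicted_probs))


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=4.0, alpha=0.9):
    """Weighted sum of a soft-target loss and the usual hard-label loss.

    temperature and alpha are hypothetical values chosen for illustration.
    """
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    soft_loss = cross_entropy(soft_teacher, soft_student)
    hard_loss = -math.log(softmax(student_logits)[true_label])
    # The soft term is scaled by temperature**2 so that its gradients stay
    # comparable in magnitude to those of the hard-label term.
    return alpha * temperature ** 2 * soft_loss + (1 - alpha) * hard_loss


teacher = [5.0, 1.0, 0.0]
agreeing_student = [5.0, 1.0, 0.0]
disagreeing_student = [0.0, 1.0, 5.0]
print(distillation_loss(agreeing_student, teacher, true_label=0))
print(distillation_loss(disagreeing_student, teacher, true_label=0))
```

A student that agrees with the teacher gets the lower loss of the two. In a real setup the logits would come from the big pretrained net and the small net, and the loss would be averaged over a batch.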

If you look at the actual implementation of the training procedure, you’ll see there’s quite a bit of “smart” in the backend (e.g. https://arxiv.org/abs/1803.09820 ).
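As one example of that “smart”: as far as I can tell, the paper linked above is the one that describes the 1cycle policy, where the learning rate ramps up to a peak and then anneals back down over a single run. Below is a simplified, piecewise-linear sketch of such a schedule. The function name and the `div_factor` and `warmup_frac` defaults are assumptions of mine, and the actual fastai implementation may differ in the details.

```python
def one_cycle_lr(step, total_steps, max_lr, div_factor=25.0, warmup_frac=0.3):
    """Learning rate for a 0-based training step under a linear one-cycle schedule.

    div_factor and warmup_frac are hypothetical defaults chosen for illustration.
    """
    if not 0 <= step < total_steps:
        raise ValueError("step must be in [0, total_steps)")
    min_lr = max_lr / div_factor
    warmup_steps = max(1, int(total_steps * warmup_frac))
    if step < warmup_steps:
        # Warm-up phase: ramp linearly from min_lr up to max_lr.
        progress = step / warmup_steps
        return min_lr + (max_lr - min_lr) * progress
    # Annealing phase: ramp linearly from max_lr back down towards min_lr.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return max_lr - (max_lr - min_lr) * progress


schedule = [one_cycle_lr(s, total_steps=100, max_lr=0.01) for s in range(100)]
print(schedule[0], max(schedule), schedule[-1])
```

The schedule starts low, peaks after the warm-up fraction of the run, and ends close to where it started.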

You can get a better feel for how good the framework is if you try to match the accuracy obtained in fastai using another framework. Then compare how much time it took you to write all the code you need and how long the training takes on similar hardware.

I played around with a Kaggle competition last night, and using code based on the Lesson 1 notebook I got to around 96% accuracy in 30-40 min (downloading the dataset + coding + training). Will share once I clean it up. I have much more experience with Keras & TF, but it would’ve taken me at least as much time just to write the data preparation code. Maybe I will try to match the result in Keras during the weekend, just for comparison.

I have little experience in this area, but I dislike the fact that there is also a similarly named fastapi Python library.

It’s easy to code a ‘happy path’ into your API, and then the examples look really short and powerful. Of course, once you need to change anything you have to dig down, and then you see whether the API is really well thought out or just veneer.

Knowledge distillation sounds cool to me as it also hides the implementation detail. It’s a form of obfuscation. I remember reading that some Google guys showed you could make an equivalent single-layer NN for a multi-layer NN.

“It’s easy to Tweet when your engine does all the work.” (this comes from the chess grandmaster Sokolov)

I found the article you suggested ( https://arxiv.org/abs/1503.02531 ) very interesting. I was wondering whether you have already tried shrinking a model this way for deployment on a small device. I often have to load deep learning solutions onto very small devices with only 4 or 8 cores and a limited amount of memory and bandwidth, but I never attempted this approach…:upside_down_face:

I just tried running ConvLearner on the “3”/“7” images dataset and Google Colab was about 7 times faster than my CPU :slight_smile: (I actually have Linux in VirtualBox and I only gave it 4 of the 8 i7 cores I have, but still).

So while Colab might be slow, it sure beats the CPU.

It can work but it depends on how complex your problem is vs how big the network. No rule of thumb for it, you just have to try it :smiley:

For those of you who were involved in the discussion we had on 26.10 at Cowork, the article

The Building Blocks of Interpretability

has some powerful interactive visualizations that might help you better understand what and how a network ‘sees’.

Hope it helps :slight_smile:


Feel free to check out the other articles on distill.pub. They are all of great quality.

I made a mindmap with what are considered to be the most useful rules of syntax for regular expressions.

If anybody needs the mindmap, I will share it with you somewhere.
You can modify its content as you like by installing Xmind (Zen or 8). I made this one in Zen, and I recommend it as it is much faster to work with.


Regular expressions get complicated really fast. Note that ImageDataBunch.from_name_func allows you to set a label_func= argument which is a function you can define to label each image. I find it easier for me as instead of grouping with regexp I can just do str.rfind(".") and then extract substrings manually in Python.
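Here is a small sketch of what such a labelling function could look like, next to a regexp that does the same job. The file name is a made-up example in the style of `breed_number.jpg`, and `label_from_name` and `label_with_regex` are hypothetical helper names, not part of fastai.

```python
import re
from pathlib import Path


def label_from_name(file_path):
    """Return the label part of a name like 'great_pyrenees_173.jpg'."""
    name = Path(file_path).name
    dot = name.rfind(".")
    stem = name if dot == -1 else name[:dot]  # drop the extension, if any
    underscore = stem.rfind("_")
    # Drop the trailing '_<number>' part, if there is one.
    return stem if underscore == -1 else stem[:underscore]


# The same extraction written as a regular expression, for comparison.
LABEL_PATTERN = re.compile(r"([^/]+)_\d+\.jpg$")


def label_with_regex(file_path):
    match = LABEL_PATTERN.search(str(file_path))
    return match.group(1) if match else None


example = "images/great_pyrenees_173.jpg"  # hypothetical file name
print(label_from_name(example))
print(label_with_regex(example))
```

A function like `label_from_name` is what you would pass as the `label_func=` argument. It is longer than the regexp, but each step is easy to check in isolation.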

For those who had errors on Friday during the Google Cloud setup (either because of gcloud-cli or other packages), below is an alternative way to do the entire setup, from creating an instance to accessing and running the Jupyter notebook in the browser:

As far as I remember Fedora distros were plagued with errors and even one Ubuntu had trouble.

Hey Teo,
I tried the tutorial you sent but I ran into problems: I can’t select the service field “until I upgrade my free trial account” - see attached.
PS: I’m on Fedora 27.
Later edit: I upgraded the account. Now I have to wait 1 day for Google’s reply. Thanks for the tutorial!

[Colab] Watch out for this one: ConvLearner is now called create_cnn


Lesson 1 is fully mind-mapped

Enjoy :slight_smile:


Has anybody had the following issue when trying to run the instance?


Is anybody else also having problems connecting to the GCP instance?

Today I kept getting the following error (after the gcloud compute ssh command), and only deleting and re-creating the instance solved the problem for a short time (after stopping the instance, the problem came back).

ssh: connect to host port 22: Resource temporarily unavailable
ERROR: (gcloud.compute.ssh) [/usr/bin/ssh] exited with return code [255].

Never mind, I think I have fixed it somehow, but do not know exactly what solution solved the problem.

As I had to keep recreating instances, I’ve put together a bash script with command aliases, in case it helps anyone:

export IMAGE_FAMILY="pytorch-1-0-cu92-experimental"
export ZONE="europe-west4-b"
export INSTANCE_NAME="fastai-standard-compute"
export INSTANCE_TYPE="n1-highmem-8"

alias gnew='gcloud compute instances create $INSTANCE_NAME \
        --zone=$ZONE \
        --image-family=$IMAGE_FAMILY \
        --image-project=deeplearning-platform-release \
        --maintenance-policy=TERMINATE \
        --accelerator="type=nvidia-tesla-p4,count=1" \
        --machine-type=$INSTANCE_TYPE \
        --boot-disk-size=200GB \
        --metadata="install-nvidia-driver=True"'

alias gdel='gcloud compute instances delete $INSTANCE_NAME --zone=$ZONE'

alias gopen='gcloud compute instances start $INSTANCE_NAME --zone=$ZONE \
        && sleep 5 && printf "\n------Update fast.ai course repo------\n" \
        && gcloud compute ssh jupyter@$INSTANCE_NAME --zone=$ZONE \
                --command="cd tutorials/fastai/course-v3 \
                        && git checkout . && git pull \
                        && printf \"\n--------Update fastai library---------\n\" \
                        && sudo /opt/anaconda3/bin/conda install -c fastai fastai" \
        && gcloud compute ssh jupyter@$INSTANCE_NAME \
                --zone=$ZONE -- -L 8080:localhost:8080'

alias gstop='gcloud compute instances stop $INSTANCE_NAME --zone=$ZONE'

The content above can simply be saved as a new file called .bash_aliases in your home directory, and it will be loaded automatically at log-in time by .bashrc under Ubuntu.

These 4 commands may be enough for working solely from terminal (i.e. independent of the GCP webapp).
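If you would rather not rely on shell aliases, the same settings can be assembled into the create command from Python. This is only a sketch: it reuses the values and flags from the aliases above, the `CONFIG` and `build_create_command` names are my own, and it only prints the command instead of running it.

```python
import shlex

# Same values as the exported variables in the aliases above.
CONFIG = {
    "image_family": "pytorch-1-0-cu92-experimental",
    "zone": "europe-west4-b",
    "instance_name": "fastai-standard-compute",
    "instance_type": "n1-highmem-8",
}


def build_create_command(cfg):
    """Build the argument list for the instance creation command (hypothetical helper)."""
    return [
        "gcloud", "compute", "instances", "create", cfg["instance_name"],
        f"--zone={cfg['zone']}",
        f"--image-family={cfg['image_family']}",
        "--image-project=deeplearning-platform-release",
        "--maintenance-policy=TERMINATE",
        "--accelerator=type=nvidia-tesla-p4,count=1",
        f"--machine-type={cfg['instance_type']}",
        "--boot-disk-size=200GB",
        "--metadata=install-nvidia-driver=True",
    ]


command = build_create_command(CONFIG)
# Print a copy-pasteable command line; pass `command` to subprocess.run to execute it.
print(shlex.join(command))
```

Keeping the arguments as a list avoids the quoting problems that are easy to run into with long multi-line aliases.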

Hi, I have just launched a discussion thread on fastai study groups to gather feedback from organizers and participants to identify best practices and avoid some gaps.

More information in this post. Thank you if you can take a few minutes to participate in the discussion :slight_smile: