In NYC they get 50-100 really amazing folks from major financial and Internet organizations to come together and work on interesting data sets with interesting government and non-profit organizations. I’ve participated in two of these events and found them very useful personally, and I feel I have helped do some good for the non-profit organizations.
The Taproot Foundation also provides pro bono opportunities. They aren’t specifically related to data science or machine learning, but I can imagine they would find folks with skills in these areas useful.
The neural net is basically trying to improve its weights for better performance based on your metrics… An epoch is basically the number of times your net has seen the whole dataset…
Not sure about randomisation of the dataset, but the data is split according to your batch size… The batch size is how many images the net sees in one go… like a pipeline processing them simultaneously…
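To make the epoch/batch relationship concrete, here is a minimal sketch (the function name and numbers are my own, not from the course): one epoch is the dataset split into ceil(N / batch_size) mini-batches, and the net takes one weight-update step per mini-batch.

```python
import math

def iterations_per_epoch(n_images, batch_size):
    """Number of mini-batch steps needed for the net to see the whole dataset once."""
    return math.ceil(n_images / batch_size)

# 1000 images with a batch size of 64 -> 16 weight updates per epoch
print(iterations_per_epoch(1000, 64))  # 16
```

So "3 epochs" just means the loop above runs three times over the (usually reshuffled) dataset.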
I couldn’t recognise the last plots. (It seems the images are zoomed in to the pixel level?)
Just one question…
Does the order of images in the dataset matter (provided the classes have an equal distribution in terms of counts)?
What is the difference between the two variables sz and bs? bs, I understand, is the batch size. What is sz, and how does it affect the model? How does bs affect the model?
sz determines the dimensions (height x width) of your input image. A smaller image helps speed up the training process: the number of convolution operations is significantly reduced with a smaller input image size, and since most of the network is conv layers, you can see a significant boost in training performance. In the course, Jeremy suggests starting with a small sz parameter, training quickly to reasonable weight values, then increasing the sz parameter (in powers of 2) up to the original dimensions of the image.
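A rough back-of-the-envelope sketch of why sz matters so much (my own helper, assuming conv-layer cost scales with the output spatial area): halving the input size roughly quarters the work per conv layer.

```python
def relative_conv_cost(sz, base_sz=224):
    """Approximate cost of a conv layer at input size sz, relative to base_sz,
    assuming cost scales with spatial area (height x width)."""
    return (sz / base_sz) ** 2

# Training at sz=112 instead of 224 is roughly 4x less conv work per image
print(relative_conv_cost(112))  # 0.25
```

This is why the start-small-then-grow schedule pays off: the cheap early epochs at small sz get the weights into a reasonable region before the expensive full-size epochs.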
The bs parameter specifies the number of images considered in each iteration (mini-batch). You want as large a batch size as possible so that gradient updates are more accurate, so the rule of thumb many follow is to use the largest batch size that fits in GPU memory. Having said that, a smaller bs is not a bad approach either: updates happen more often within one epoch, so there may be a chance to train faster.
One last question… when you reduce sz, does it compress (resize) the images in the batch/epoch, or does it only take the images that already fit the sz dimensions? Thanks for your help so far.
Hi - I have set up a GCP VM instance for the course. I am trying to run the image classifier with my own data set (which Jeremy talks about around minute 30), for which I have downloaded some images from Google. How do I transfer these files from my local directory to the GCP instance?
Just finished the code for the first week’s lesson.
Regarding the cycle_mult parameter in the fit function: does it increase the number of iterations step by step until the best accuracy has been achieved for the given number of cycles?
I have also started to participate in the iceberg classification challenge. People have said that it’s a very difficult challenge. Is that so?
Thanks, finally got it to work. Apparently if in the destination I give instancename:~/destination the files don’t show up, but if I give instancename:/full destination path then it works.
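For anyone else hitting this, a minimal sketch of the transfer command (the instance name and paths here are placeholders, not from the thread) using gcloud's built-in scp wrapper, with an absolute destination path to avoid the ~/ issue described above:

```shell
# Copy a local image folder to the GCP instance recursively.
# "my-fastai-instance" and the paths are hypothetical; substitute your own.
gcloud compute scp --recurse ./my_images my-fastai-instance:/home/jupyter/data/my_images
```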
I have built my own machine running Ubuntu 16.04.
I was wondering about the best way to set it up on the software side so that I can work on the v2 of the course.
Would the paperspace setup work for my machine?