@txuninho - I have not had CUDA memory errors when using ai_utilities; moreover, I don't believe it uses the GPUs. If lowering your batch size doesn't help, try restarting the Jupyter server. If that doesn't help, please send me sample code and the error output.
I am using Google Colab and wanted to know whether we can run this script in a Colab notebook. If you used Anaconda on a local machine, how would you train on a large number of images without a GPU?
Item no.: 1 --> Item name = baseball game
Evaluating…
Looks like we cannot locate the path the 'chromedriver' (use the '--chromedriver' argument to specify the path to the executable.) or google chrome browser is not installed on your machine (exception: Message: 'chromedriver' executable may have wrong permissions. Please see https://sites.google.com/a/chromium.org/chromedriver/home)
UnsatisfiableError: The following specifications were found to be in conflict:
icrawler -> python[version='>=3.6,<3.7.0a0']
python=3.7
Use "conda search --info" to see the dependencies for each package.
This happened when I attempted conda install -c hellock icrawler.
Hello! I just started the 2019 course and finished watching the Lesson 1 video. I’m excited to try to build my own dataset and train the deep learning classifier on this dataset.
One question I couldn't find an answer to in a cursory search of this forum: are there guidelines for the size of the dataset?
Not sure if you ever resolved this.
I finally figured out that conda install -c hellock icrawler was trying to install into the base conda environment, which was Python 3.7, while fastai was using 3.6.
So if your fastai conda environment is named, say, myFastai,
then you have to use
conda install --name myFastai -c hellock icrawler
I created a clone and installed it there.
conda create --clone fastai-3.6 --prefix $CONDAFI_PATH/fastai3.6
conda activate $CONDAFI_PATH/fastai3.6
conda install --prefix $CONDAFI_PATH/fastai3.6 -c hellock icrawler
pip install python-magic
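(python-magic identifies files by their magic-byte signatures via libmagic, which is handy here for weeding out corrupt or non-image downloads. As a rough, dependency-free sketch of the same idea — the names and the small signature table below are my own, not part of python-magic:)

```python
# Rough stand-in for the python-magic check used after downloading:
# identify image files by their leading magic bytes, so broken
# downloads (HTML error pages, truncated files) can be filtered out.
SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}

def image_type(path):
    """Return 'jpeg', 'png', or 'gif' if the file starts with a known
    image signature, otherwise None (likely not a usable image)."""
    with open(path, "rb") as f:
        head = f.read(8)
    for sig, kind in SIGNATURES.items():
        if head.startswith(sig):
            return kind
    return None
```

Real python-magic covers far more formats; this just illustrates the kind of check it performs.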
This makes me nervous; anything on Facebook actually makes me nervous, hah. But on that note, be mindful of permissions when getting images from Facebook. You can read here - https://developers.facebook.com/docs/graph-api/reference/photo/#permissions. For example, you can only retrieve a user's photos if they granted your app that permission when they authenticated.
I was able to download the file to my laptop but cannot access it from the Jupyter notebook on GCP (I guess if I ran a web server on my laptop and shared the dataset that way it would likely work, but I wonder if there is another way).
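(The web-server idea does work: serve the folder from the laptop with `python -m http.server 8000` and pull the file from the GCP notebook. A minimal fetch helper as a sketch — the IP and filename below are placeholders, not real values:)

```python
import os
import urllib.request

def fetch(url, dest):
    """Download a file from a URL to a local path and return the path."""
    urllib.request.urlretrieve(url, dest)
    return dest

# On the laptop, from inside the dataset folder:
#   python -m http.server 8000
# Then in the GCP notebook (placeholder IP and filename):
#   fetch("http://<laptop-ip>:8000/dataset.zip", "dataset.zip")
```

Alternatives like `gcloud compute scp` avoid exposing the laptop over HTTP, but the helper above needs nothing beyond the standard library.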
From what I have heard/read so far, it depends. The example Jeremy gave in the lesson 1 video of baseball vs. cricket has 30 images, IIRC. I guess if there are more categories, you will need more images (so every category has at least several); if the risk of overfitting is higher, e.g. images with the same label share the same visual features by accident, more images will probably help.
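(When experimenting with dataset size, it helps to know exactly how many images each category ended up with after downloading and cleaning. A small sketch, assuming the usual layout of one subfolder per class:)

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif"}

def count_per_class(data_dir):
    """Map each class subfolder name to its number of image files."""
    counts = {}
    for sub in Path(data_dir).iterdir():
        if sub.is_dir():
            counts[sub.name] = sum(
                1 for f in sub.iterdir()
                if f.suffix.lower() in IMAGE_SUFFIXES)
    return counts
```

A quick look at the resulting dict shows whether any category is badly underrepresented before you train.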
Hi @lindyrock, I am not able to download more than 100 images at a time even after downloading the chromedriver. I am using Google Colab as my Jupyter environment.
Feel free to use and alter it; as you'll see, the keywords and time_ranges are hardcoded and looped over. I'll be adding more helper scripts to that repository as I go along in the 2019 course, so star/watch the repository (or contribute if the inspiration strikes you).
This is designed to run on a Linux VM (I run mine on GCP, but it should work anywhere that has chromedriver and Python running in a virtualenv).
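(For reference, the overall shape of such a script is just a nested loop over keywords and time ranges, with each pair handed to the downloader; splitting one query across several time ranges is a common way to collect more than a single batch per keyword. A stripped-down sketch — the keyword and date values are illustrative, and `download_fn` stands in for the actual google_images_download call:)

```python
# Illustrative only: loop every keyword over every time range and pass
# each pair to a downloader callback.
KEYWORDS = ["baseball game", "cricket game"]
TIME_RANGES = [
    '{"time_min":"01/01/2018","time_max":"06/30/2018"}',
    '{"time_min":"07/01/2018","time_max":"12/31/2018"}',
]

def crawl_all(download_fn):
    """Call download_fn(keyword, time_range) for every combination
    and collect the results."""
    return [download_fn(kw, tr) for kw in KEYWORDS for tr in TIME_RANGES]
```

Keeping the downloader behind a callback makes the loop trivial to test without hitting the network.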
Lindy/All:
Working on a p2.xlarge instance on AWS EC2, and stuck on 'chromedriver'.
I've installed (a) Google Images Download and (b) chromedriver
following your steps, including
moving 'chromedriver' to /usr/bin/chromedriver.
Unfortunately, the 'chromedriver' path can't be located:
Item no.: 1 --> Item name = shadow on highway and freeway
Evaluating…
Looks like we cannot locate the path the 'chromedriver' (use the '--chromedriver' argument to specify the path to the executable.) or google chrome browser is not installed on your machine (exception: Message: unknown error: cannot find Chrome binary (Driver info: chromedriver=2.37.544315 (730aa6a5fdba159ac9f4c1e8cbc59bf1b5ce12b7),platform=Linux 4.4.0-1099-aws x86_64))
I get a bunch of "command not found" messages, like:
Downloaded google-chrome-stable_current_x86_64.rpm
bash: line 70: rpm: command not found
Installing the required font dependencies.
bash: line 76: yum: command not found
bash: line 100: repoquery: command not found
http://: Invalid host name.
Extracting glibc…
bash: line 107: rpm2cpio: command not found
bash: line 100: repoquery: command not found
http://: Invalid host name.
Extracting util-linux…
bash: line 107: rpm2cpio: command not found
bash: line 100: repoquery: command not found
http://: Invalid host name.
Extracting libmount…
bash: line 107: rpm2cpio: command not found
bash: line 100: repoquery: command not found
http://: Invalid host name.
Extracting libblkid…
bash: line 107: rpm2cpio: command not found
bash: line 100: repoquery: command not found
http://: Invalid host name.
Extracting libuuid…
bash: line 107: rpm2cpio: command not found
bash: line 100: repoquery: command not found
http://: Invalid host name.
Extracting libselinux…
bash: line 107: rpm2cpio: command not found
bash: line 100: repoquery: command not found
http://: Invalid host name.
Extracting pcre…
bash: line 107: rpm2cpio: command not found
Finding dependency for ldd.sh
bash: line 176: repoquery: command not found
Finding dependency for ldd.sh
I decided to classify mushrooms and found this website, where observers can upload photos of mushrooms along with the species they believe the mushrooms belong to (and maybe more/less specific denominations, e.g. infraspecific name/stirp). The community also contributes by stating their opinions based on the photos.
Anyway, the maintainers explicitly ask that you not scrape the website but instead drop them an email. I did so and got a reply within about 5 minutes; 30 minutes later I had access to 10⁶ mushroom images. Great people.