I can't find the search_images_ddg API in the notebook. Is this API available in the fastbook chapter 2 notebook?
Resolved: if you are using the template on Paperspace, the fastbook version that ships with the template is 0.0.14. The new image search API, search_images_ddg, is only available in later versions, so you need to upgrade fastbook. You can print your fastbook version using:
import fastbook
fastbook.__version__
0.0.18
The latest version as of 21st Aug is 0.0.18, and search_images_ddg is available in this version.
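In case it helps, here is a minimal sketch of checking whether an installed version is older than the one reported to work above. The helper names (parse_version, needs_upgrade) are made up for illustration, and using 0.0.18 as the threshold is an assumption based only on this thread: it is the version confirmed to have search_images_ddg, not necessarily the first one that does.

```python
def parse_version(version):
    """Turn a dotted version string such as '0.0.18' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def needs_upgrade(installed, minimum="0.0.18"):
    """Return True if `installed` is older than `minimum`.

    The default minimum is an assumption taken from this thread, not from
    the fastbook changelog.
    """
    return parse_version(installed) < parse_version(minimum)


# 0.0.14 is the version reported for the Paperspace template
print(needs_upgrade("0.0.14"))
```

In a notebook you would pass fastbook.__version__ as the first argument. If it reports that an upgrade is needed, running pip install -U fastbook in a cell and restarting the kernel should pick up the newer version.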
Here is an alternative to Bing image search that uses the DuckDuckGo API.
from pathlib import Path
import warnings

import requests
from duckduckgo_images_api import search

# Silence the warning that requests emits because of verify=False below
warnings.filterwarnings('ignore', message='Unverified HTTPS request')

def ddgo_download_images(search_term=None, dest=None, max_results=None):
    # Collect the image URLs returned for the search term
    search_results = [r['image'] for r in search(search_term, max_results=max_results)['results']]
    # Name the files <search_term>_<index>.jpg
    image_output_name = [search_term.replace(' ', '_') + '_' + str(idx) + '.jpg' for idx in range(len(search_results))]
    for url, image_name in zip(search_results, image_output_name):
        try:
            # Skipping SSLCertVerificationError
            r = requests.get(url, verify=False)
            with open(dest/image_name, 'wb') as outfile:
                outfile.write(r.content)
        except Exception:
            # Ignore images that fail to download and move on to the next one
            pass

path = Path('penguins')
search_terms = ['emperor penguin', 'king penguin']
if not path.exists():
    path.mkdir()

for search_term in search_terms:
    dest = (path/search_term)
    dest.mkdir(exist_ok=True)
    ddgo_download_images(search_term=search_term, dest=dest, max_results=10)
Just wanted to reply to direct people to a solution that solved the issue as of Sept 18, 2021. I was using the default Cognitive Search; make sure you follow step #2 provided by PedroSousa and search for Bing Search v7.
Cheers
Thank you so much! It worked!
Had the same problem, so I made a web scraper for myself and others with this in mind.
No cards, signups or anything needed. Run the .py file from terminal.
If you don’t have Chrome, chromedriver and Selenium installed, there is a how-to (again, terminal commands to copy and paste) on the GitHub page.
Thanks! Generating a key to access Bing Image Search by following PedroSousa's instructions worked for me on 19/10/21.
It took me a bit of looking around until I found it, so hopefully that will help someone.
Wow, this helped me a lot, thank you!
Hi @thatgeeman
Thanks for the google search alternative and the detailed explanation on how to set it up.
However, I am getting a '404 Client Error: Not Found for url'.
When I tried the URL https://googleapis.com/customsearch/v1, it indeed no longer exists, and I ended up at https://developers.google.com/custom-search/v1 instead.
Did you face a similar issue?
I found a solution to my problem.
In fact, the search URL changed, and instead of
https://googleapis.com/customsearch/v1
I used
https://customsearch.googleapis.com/customsearch/v1
I am really confused, because I do agree with you about the documentation mentioned. But when I tried it on Paperspace with the URL specified (https://www.googleapis.com/customsearch/v1?), it gave me a 404 Not Found error.
That’s when I found this Google page that lets you use the API without code, and you can see in the automatically generated code that it uses https://customsearch.googleapis.com
Maybe I am doing something wrong here without realising it.
Does your code still work without encountering errors?
Yes, I tried with both the URIs on Kaggle and both return the same information in the same format.
After a bit of slow reading I see here, here and here that they differentiate between the two URIs and their best practices. In brief, https://customsearch.googleapis.com/customsearch/v1 works better with “Google-provided client libraries” since it uses gRPC transcoding syntax, while https://www.googleapis.com/customsearch/v1 is for conventional RESTful requests. The query parameters remain the same. My understanding may be wrong (I'm not an expert in this jargon), but this is what I was able to gather.
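To make the "same query parameters, different base URI" point concrete, here is a rough sketch that only builds the request URL for either endpoint and sends nothing. The function name build_search_url and the ENDPOINTS dictionary are made up for illustration, and YOUR_API_KEY / YOUR_ENGINE_ID are placeholders you would replace with your own credentials.

```python
from urllib.parse import urlencode

# The two base URIs discussed in this thread
ENDPOINTS = {
    "rest": "https://www.googleapis.com/customsearch/v1",
    "client_libraries": "https://customsearch.googleapis.com/customsearch/v1",
}


def build_search_url(query, api_key, engine_id, endpoint="rest", num=10):
    """Build an image-search request URL for the chosen endpoint.

    Only the base URI differs between endpoints; the query string is identical.
    """
    params = {
        "key": api_key,         # your API key (placeholder in the example below)
        "cx": engine_id,        # your search engine ID (placeholder below)
        "q": query,             # the search term
        "searchType": "image",  # restrict results to images
        "num": num,             # results per request
    }
    return f"{ENDPOINTS[endpoint]}?{urlencode(params)}"


for name in ENDPOINTS:
    print(build_search_url("teddy bear", "YOUR_API_KEY", "YOUR_ENGINE_ID", endpoint=name))
```

If one base URI gives a 404 in your environment, switching the endpoint argument is the only change needed; the rest of the request stays the same.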
# https://pypi.org/project/bing-image-downloader/
import sys
!{sys.executable} -m pip install numpy
from bing_image_downloader import downloader
import os, shutil
from pathlib import Path

dataset_name = 'bears'
path = Path(dataset_name)

# creates the parent dataset folder
if not path.exists():
    path.mkdir()

labels = ['grizzly', 'black', 'teddy']
for l in labels:
    downloader.download(query=f'{l} bear', limit=100, output_dir=path, adult_filter_off=True)
    # changes the folder name from default to fastAI label specific
    if not Path(f"{path}/{l}").exists():
        os.rename(f"{path}/{l} bear", f"{path}/{l}")
    # add code to handle folder management if code is run multiple times
Instructions
- Install the package (bing-image-downloader · PyPI)
- Import the library at the start of the notebook
- Replace the downloading images part of the code with the code provided here.
repo
Hope this helps.
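For the "handle folder management if code is run multiple times" comment in the code above, one possible approach is sketched below. This is not part of the original post: the helper name rename_label_folder is made up, and it assumes the downloader writes into a folder named after the query (for example "grizzly bear"), as the code above does.

```python
from pathlib import Path


def rename_label_folder(root, label, suffix=" bear"):
    """Move '<root>/<label><suffix>' to '<root>/<label>', safely on re-runs.

    If the label folder already exists from an earlier run, the new files are
    merged into it and clashing file names get a numeric suffix instead of
    overwriting the existing files.
    """
    src = Path(root) / f"{label}{suffix}"
    dst = Path(root) / label
    if not src.exists():
        return dst  # nothing new was downloaded
    if not dst.exists():
        src.rename(dst)  # first run: a plain rename is enough
        return dst
    for item in sorted(src.iterdir()):
        target = dst / item.name
        counter = 1
        while target.exists():
            # Pick a new name rather than overwrite an earlier download
            target = dst / f"{item.stem}_{counter}{item.suffix}"
            counter += 1
        item.rename(target)
    src.rmdir()  # the source folder is empty after the merge
    return dst
```

Calling rename_label_folder(path, l) inside the loop, in place of the os.rename block, should keep the per-label folders clean however many times the cell is run.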
Thanks for this solution! Was getting splits of only 2 folders before: grizzly and a combo of black bear and teddy. This solved it.
DuckDuckGo does not require a key
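!pip install -Uqq duckduckgo_search
from duckduckgo_search import ddg_images
# L comes from fastcore; it is already in scope after `from fastbook import *`
from fastcore.foundation import L

def search_images(term, max_images=200):
    # Return only the image URLs from the search results
    return L(ddg_images(term, max_results=max_images)).itemgot('image')

results = search_images('teddy bear')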
Use the following syntax:
download_images(dest, urls=results)
Ref:
After more than an hour of struggling, this worked for me: 2022-12-14.
I don’t think that the keys can be found using this method anymore.