Are you sure? The documentation says otherwise.
I am really confused because I do agree with you about the documentation mentioned. But when I tried it on Paperspace with the URL specified (https://www.googleapis.com/customsearch/v1?), it gave me a 404 Not Found error.
That's when I found this Google page that helps you use the API without code, and you can see in the automatically generated code that it uses https://customsearch.googleapis.com
Maybe I am doing something wrong here without realising it.
Does your code still work without encountering errors?
Yes, I tried both URIs on Kaggle and both return the same information in the same format.
After a bit of slow reading I see here, here and here that they have differentiated the two URIs and their best practices. In brief, https://customsearch.googleapis.com/customsearch/v1 works better with "Google-provided client libraries" since it uses gRPC transcoding syntax, while https://www.googleapis.com/customsearch/v1 is for conventional RESTful requests. The query parameters remain the same. My understanding may be wrong (I'm not an expert on this jargon), but this is what I was able to gather.
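To illustrate, here's a minimal sketch (assuming the requests library, and placeholder API_KEY / CX values you'd replace with your own) showing that both base URLs take the same query parameters:

import requests

API_KEY = 'YOUR_API_KEY'  # placeholder: your Custom Search API key
CX = 'YOUR_ENGINE_ID'     # placeholder: your Programmable Search Engine ID
params = {'key': API_KEY, 'cx': CX, 'q': 'grizzly bear'}

# both base URLs accept the same query parameters
for base in ('https://www.googleapis.com/customsearch/v1',
             'https://customsearch.googleapis.com/customsearch/v1'):
    r = requests.get(base, params=params)
    print(base, r.status_code)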
@maxk it also worked for me
#https://pypi.org/project/bing-image-downloader/
import sys
!{sys.executable} -m pip install bing-image-downloader
from bing_image_downloader import downloader
import os, shutil
from pathlib import Path

dataset_name = 'bears'
path = Path(dataset_name)
# create the parent dataset folder
if not path.exists():
    path.mkdir()

labels = ['grizzly', 'black', 'teddy']
for l in labels:
    downloader.download(query=f'{l} bear', limit=100, output_dir=path, adult_filter_off=True)
    # rename the folder from the downloader's default ('<label> bear') to the fastai label name
    if not Path(f"{path}/{l}").exists():
        os.rename(f"{path}/{l} bear", f"{path}/{l}")
# TODO: handle folder management if this code is run multiple times
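One way to handle that folder management on re-runs (a sketch of my own, not part of the original snippet) is to wipe each label folder before downloading again, which is presumably why shutil is imported:

# hypothetical re-run cleanup: delete stale label folders before re-downloading
for l in labels:
    label_dir = Path(f"{path}/{l}")
    if label_dir.exists():
        shutil.rmtree(label_dir)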
Instructions
- Install the package (bing-image-downloader · PyPI)
- Import the library at the start of the notebook
- Replace the downloading images part of the code with the code provided here.
repo
Hope this helps.
Thanks for this solution! I was getting splits of only 2 folders before: grizzly, and a combination of black bear and teddy. This solved it.
DuckDuckGo does not require a key
!pip install -Uqq duckduckgo_search
from duckduckgo_search import ddg_images
from fastcore.foundation import L  # fastcore's L list class provides itemgot

def search_images(term, max_images=200):
    # ddg_images returns a list of dicts; keep just the image URLs
    return L(ddg_images(term, max_results=max_images)).itemgot('image')

results = search_images('teddy bear')
Use the following syntax:
download_images(dest, urls=results)
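For example, a minimal end-to-end sketch (assuming fastai's download_images and verify_images are in scope, e.g. via from fastai.vision.all import *):

from pathlib import Path
from fastai.vision.all import download_images, verify_images

dest = Path('bears/teddy')
dest.mkdir(parents=True, exist_ok=True)
download_images(dest, urls=results)

# remove any files that fail to open as images
failed = verify_images(dest.ls())
failed.map(Path.unlink)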
Ref:
After more than an hour of struggling, this worked for me: 2022-12-14.
I don't think that the keys can be found using this method anymore.
You can also use this Google Images API for scraping images.
Hi guys, for those advocating the Azure API: as of July 2023, you must use the following line to access the API key:
key = os.environ.get('BING_SEARCH_V7_SUBSCRIPTION_KEY', 'XXXX')
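For context, here's a minimal sketch of how that key is typically passed to the Bing Image Search v7 endpoint (assuming the requests library; the endpoint and header names follow Microsoft's documented v7.0 pattern):

import os
import requests

key = os.environ.get('BING_SEARCH_V7_SUBSCRIPTION_KEY', 'XXXX')
resp = requests.get(
    'https://api.bing.microsoft.com/v7.0/images/search',
    headers={'Ocp-Apim-Subscription-Key': key},  # the subscription key goes in this header
    params={'q': 'grizzly bear', 'count': 10},
)
urls = [img['contentUrl'] for img in resp.json()['value']]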
It's no longer possible to get a Bing Search API key for free now:
"
Thanks for reaching out to us. We are evaluating our sign-up process for Bing APIs. New customers will be unable to add a Bing resource to their subscriptions. Exceptions or ETA are not available. If you have additional questions, please contact BingAPIMS@Microsoft.com.
"
ChatGPT helped me bypass this with the following:
from pathlib import Path
from duckduckgo_search import DDGS
from fastai.vision.all import download_images, verify_images

def search_images(term, max_images=150):
    print(f"Searching for '{term}'")
    with DDGS() as ddgs:
        return [result['image'] for result in ddgs.images(term, max_results=max_images)]

bear_types = ['grizzly', 'black', 'teddy']
path = Path('bears')
path.mkdir(exist_ok=True)

for bear in bear_types:
    dest = path / bear
    dest.mkdir(exist_ok=True)
    print(f"Downloading images for '{bear}' bear...")
    urls = search_images(f'{bear} bear', max_images=150)
    download_images(dest, urls=urls)
    # drop any downloads that fail to open as images
    failed = verify_images(dest.ls())
    if failed:
        print(f"Removing {len(failed)} failed images for '{bear}' bear.")
        failed.map(Path.unlink)

print("Download complete!")
After signing up for free on Azure, Microsoft is not allowing me to create a Bing Search resource for free, throwing the following error. It is extremely time-consuming to work with Bing image search; they just don't approve access to the free tier. We should think about updating the lesson:
Your account is not approved for this resource. Contact BingAPIMS@microsoft.com with any questions.
(Code: ApiSetDisabledForCreation)
Using search_images_ddg as outlined above. I ended up wasting quite a bit of time trying to get the Bing image search to work.
Thanks for sharing! This definitely saved me a lot of time.
Leaving a note for others. You'll need to pip install duckduckgo_search
And here is an LLM-modified version that slows down the request rate; I ran into a RateLimit error from DuckDuckGo:
import time
from pathlib import Path
from duckduckgo_search import DDGS
from fastai.vision.all import download_images, verify_images  # assuming you're using fastai

def search_images(term, max_images=150, delay=1.0):
    print(f"Searching for '{term}'")
    results = []
    with DDGS() as ddgs:
        for result in ddgs.images(term, max_results=max_images):
            results.append(result['image'])
            time.sleep(delay / 10)  # small delay between individual results to be polite
    return results

bear_types = ['grizzly', 'black', 'teddy']
path = Path('bears')
path.mkdir(exist_ok=True)

for bear in bear_types:
    dest = path / bear
    dest.mkdir(exist_ok=True)
    print(f"Downloading images for '{bear}' bear...")
    urls = search_images(f'{bear} bear', max_images=150, delay=1.0)
    time.sleep(2)  # delay before downloading
    download_images(dest, urls=urls)
    time.sleep(2)  # delay before verification
    failed = verify_images(dest.ls())
    if failed:
        print(f"Removing {len(failed)} failed images for '{bear}' bear.")
        failed.map(Path.unlink)
    time.sleep(5)  # delay between bear types to reduce risk of hitting rate limits

print("Download complete!")