Build a dataset with Microsoft Azure

I don't have access to Microsoft Azure, and I tried following multiple threads from 2019 to build a dataset.
I tried most of the methods there but I'm still stuck.
Is there any other way to build your own dataset?

Have you tried getting the Bing Image Search key? These are the steps that worked for me.

If not, then it really depends on what data you want. You can take existing open source datasets like ImageNet, or explore websites/apps that have good APIs like Spotify, which has a lot of tabular data. If you want something more similar to Bing search, you can look up “image search free api” or something along those lines.

If you want something less organized, look up web crawling, although you’ll have to be careful to keep it legal and not put too much stress on a website.
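
One way to picture "not putting too much stress on a website" is a downloader that checks the site's robots.txt rules and pauses between requests. This is only a minimal sketch using the Python standard library: the user agent name, the delay, and both function names are made up for illustration, and following robots.txt is a courtesy convention, not a guarantee that the download is legal or that the images are free to use.

```python
import time
import urllib.request
from urllib.robotparser import RobotFileParser

USER_AGENT = "my-dataset-bot"  # hypothetical name, pick your own


def may_fetch(robots_lines, url, user_agent=USER_AGENT):
    """Return True if the given robots.txt lines allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


def polite_download(urls, robots_lines, delay_seconds=2.0):
    """Yield (url, bytes) for each allowed url, sleeping between requests."""
    for url in urls:
        if not may_fetch(robots_lines, url):
            continue  # skip anything the site asks crawlers not to fetch
        request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
        with urllib.request.urlopen(request, timeout=10) as response:
            yield url, response.read()
        time.sleep(delay_seconds)  # assumed delay; slower is kinder to the site
```

Here `robots_lines` is the text of the site's robots.txt split into lines, which you would fetch once before starting.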

And, if you just want to get through the tutorial, feel free to download ~10 images of each bear by hand, but recognize that you’ll likely get poor accuracy.

I can’t make an account using a debit card because they only accept credit cards, so that’s the problem.

Hi NoBlueWithoutYellow, I hope you are having a wonderful day.

The simplest way to build a small dataset is to grab some images from anywhere, respecting copyright!
For example, originally we used to grab images from a Google search or copy some images from any public open source dataset.

Place them in directories named after each category and there you have your dataset; you don’t need any tools.

The following snippet of code can be used on Colab to replace the Azure section.

from fastai.vision.all import *  # provides Path and the Path.ls() helper
from google.colab import drive

# Mount Google Drive in the Colab session
drive.mount('/content/drive')

artists_types = ['warhole','otto_dix','picasso', 'dali']
path = Path('artists')

# Create the top-level folder and one sub-folder per category
path.mkdir(exist_ok=True)
for o in artists_types:
    dest = (path/o)
    print(dest)
    dest.mkdir(exist_ok=True)
    # The Azure/Bing download step from the lesson is skipped on purpose:
    #results = search_images_bing(key, f'{o} bear')
    #download_images(dest, urls=results.attrgot('content_url'))

path.ls()

# upload images from local machine to paths above

This will get you started while you investigate a way to download millions of images if required.
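
Once the images are uploaded into the category folders, a quick count per folder can confirm that every category actually received some files. The following is a rough sketch using only the standard library; the function name and the list of image extensions are assumptions for illustration, not part of fastai.

```python
from pathlib import Path

# Assumed set of extensions to treat as images; extend as needed
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".bmp"}


def count_images_per_category(root):
    """Return {category_folder_name: number_of_image_files} for each sub-folder of root."""
    root = Path(root)
    counts = {}
    for category_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[category_dir.name] = sum(
            1
            for f in category_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts
```

For the folders created above, `count_images_per_category('artists')` would return one entry per artist, and any category showing 0 still needs images.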

ps. I have a credit card but only use it to get air miles for traveling!

hope this helps

cheers mrfabulous1 :smiley: :smiley:


I don’t quite understand how this works. :slightly_smiling_face:

Do I need to download images from a Google search, make my own dataset, and use the above code to upload it?

Definitely an issue! The website wasn’t working for me, but there are 7-day options that don’t require a credit card, or if you’re a student you can sign up as one. I ended up having to input my credit card after 30 minutes of yelling at the Azure website.

@NoBlueWithoutYellow
You are not required to use the code snippet to upload your images. You can do it manually in your Google Drive; just make sure you put each category in a separate folder if you don’t want to modify the code that creates the datablock. I saved a few photos of my dogs and put them in a Google Drive folder, then built a classifier to distinguish between the dogs. It’s for practice, so whatever dataset you think is interesting is fine.
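
For reference, a one-folder-per-category layout can be turned into a fastai datablock roughly as follows. This is a sketch that assumes fastai is installed; `label_from_folder` and `build_dataloaders` are hypothetical helper names, and the split percentage, seed, and image size are placeholder values. fastai also ships its own `parent_label` function that does the same job as the labelling helper here.

```python
from pathlib import Path


def label_from_folder(image_path):
    """Use the name of the folder an image sits in as its label."""
    return Path(image_path).parent.name


def build_dataloaders(dataset_root, valid_pct=0.2, seed=42, size=128):
    """Build fastai DataLoaders from a root folder with one sub-folder per category."""
    # Imported here so the labelling helper above works without fastai installed
    from fastai.vision.all import (
        CategoryBlock, DataBlock, ImageBlock, RandomSplitter, Resize, get_image_files,
    )
    block = DataBlock(
        blocks=(ImageBlock, CategoryBlock),   # image in, category out
        get_items=get_image_files,            # find image files under the root
        splitter=RandomSplitter(valid_pct=valid_pct, seed=seed),
        get_y=label_from_folder,              # label = parent folder name
        item_tfms=Resize(size),
    )
    return block.dataloaders(Path(dataset_root))
```

With Drive mounted on Colab, a call might look like `dls = build_dataloaders('/content/drive/MyDrive/dogs')`, where the folder path is only an example.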


I’ve been stuck on this issue for more than three days and feel I have wasted quite a bit of time. I can’t progress further in the lectures because of this.
So if you don’t mind, can you please share your code?
I really don’t want to spend any more time on this issue now.

How to create a dataset.

Thank you soooooo much.
I never thought of using the 7-day free trial of Azure services that does not require a credit card.

Even though this is a temporary fix, at least it will get me started again.
It really saved a lot of time I was gonna spend finding a solution.