I am starting the course in April 2024. I am running into lots of errors caused by deprecated or broken code in the Kaggle notebooks. There are also many differences between the code shown at the beginning of the video (the presentation part, before the walkthrough of the notebooks) and the code in the notebooks on Kaggle!
Somehow, I managed to get the code to work by mixing the updated and non-updated versions. But I think this could be confusing, very difficult, and disappointing for others!
So I would like to contribute on that front, and I wonder whether this is the right place to report these issues in detail.
/opt/conda/lib/python3.10/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
self.pid = os.fork()
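If the warning is triggered by fastai's DataLoader worker processes forking the interpreter (an assumption; the full traceback isn't shown above), one common workaround is to run the loaders single-process. A sketch only, where dblock and path stand in for the DataBlock and dataset path defined earlier in the notebook:

# num_workers=0 keeps data loading in the main process,
# which avoids the os.fork() call the warning complains about
dls = dblock.dataloaders(path, num_workers=0)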
from duckduckgo_search import DDGS
from fastcore.all import L  # fastai's list-like class

def search_img_urls(keywords, max_images=30):
    print(f"Searching for '{keywords}'")
    with DDGS() as ddgs:
        # note: this must be keywords=keywords; the snippet as posted had
        # keywords=term, which raises a NameError
        search_results = ddgs.images(keywords=keywords, max_results=max_images)
        image_urls = [result.get("image") for result in search_results]
        return L(image_urls)
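A usage sketch, assuming fastai is installed (download_images, verify_images, and get_image_files are fastai utilities; the folder name and query are arbitrary):

from pathlib import Path
from fastai.vision.all import download_images, verify_images, get_image_files

dest = Path('bird')
dest.mkdir(exist_ok=True, parents=True)
download_images(dest, urls=search_img_urls('bird photos'))

# drop any files that downloaded but can't be opened as images
failed = verify_images(get_image_files(dest))
failed.map(Path.unlink)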
You wrote it slightly differently: keywords=keywords rather than keywords=term in the arguments to ddgs.images().
The algorithm uses the names of the folders in which the photos are stored as the labels for prediction. If you download car photos into a folder named bird, the algorithm will output bird as the prediction.
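In fastai this folder-to-label mapping is typically set up with get_y=parent_label in a DataBlock, which takes each image's parent folder name as its category. A minimal sketch, with the dataset path as a placeholder:

from fastai.vision.all import *

dls = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,  # the parent folder name becomes the label
    item_tfms=[Resize(192, method='squish')]
).dataloaders(Path('bird_or_not'))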
[Question] The text justifies the use of the test data by stating that, "Just as the automatic training process is in danger of overfitting the training data, we are in danger of overfitting the validation data through human trial and error and exploration".
While I understand how a human could overfit hyperparameters on the validation set, I don't see how a test set solves this problem. If you train a model using the training data, validate it using the validation set, and then get poor test set results, wouldn't you just go back and change the model so that you get better test set results? And thereby re-introduce the problem of human overfitting?
Hey all, I'm getting a RateLimitException when downloading the images from DuckDuckGo (see screenshot). Do we need to increase the sleep time in the program so it doesn't throttle the server?
Thanks in advance.
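One approach that seems to help is to retry with an increasing pause. A sketch only: the retry count and pause length are guesses, not documented limits, and the exception is caught broadly because the exact class duckduckgo_search raises has changed between versions:

import time
from duckduckgo_search import DDGS

def search_with_backoff(keywords, max_images=30, retries=3, pause=10.0):
    for attempt in range(retries):
        try:
            with DDGS() as ddgs:
                results = ddgs.images(keywords=keywords, max_results=max_images)
                return [r.get("image") for r in results]
        except Exception:
            # back off a little longer after each failed attempt
            time.sleep(pause * (attempt + 1))
    raise RuntimeError(f"Still rate-limited after {retries} attempts")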
I've just started the course (absolute beginner) and ran into exactly the same problem. I see you posted this in March with no answer yet. Did you ever figure it out, or are we just stopped dead in our tracks in cell 2 of lesson 1?
To answer my own question: I think it is a good course, and worth doing. I love the "working code first, then explanation" model. The dependency errors are due to Python's pathetic version management, which I guess isn't the course creators' fault. Until Python library developers adopt, and stick to, some sort of semantic versioning, they're inevitable. Lately I've just been working through the book, handling the dependency problems as I strike them. They do seem to have reduced as I go through.
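One way to keep a notebook reproducible despite this is to pin library versions in the first cell. A sketch only; the version specifiers below are placeholders, not recommendations (take the real ones from pip freeze in a session where the notebook works):

# pin versions so a rerun months later resolves the same code
!pip install -q "fastai==2.7.*" "duckduckgo_search==5.*"  # placeholder pins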
I have the same experience: I ran the notebook as-is to train on bird and forest. The model seems to classify bird and forest photos really well. However, when asked to classify photos of something else (shark, ocean, sun, …), it seems to always label them as "bird".
I can see the model not being able to classify things it was not trained on, but it seems odd that it is biased toward "bird" and not "forest".
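One way to see how decisive that "bird" answer actually is: fastai's predict also returns the per-class probabilities. A minimal sketch, assuming learn is the trained learner from the notebook and shark.jpg is a hypothetical out-of-domain photo:

from fastai.vision.all import PILImage

img = PILImage.create('shark.jpg')
label, idx, probs = learn.predict(img)
# vocab holds the class names in the same order as probs
print(dict(zip(learn.dls.vocab, map(float, probs))))

If the "bird" probability hovers near 0.5 on such photos, the model is guessing rather than strongly biased; either way, a two-class model has no way to say "neither".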
Were you able to run the original fastai notebook from the book? Assuming that you copy-pasted the code from that link, it looks like your code errored out at this line (searchObj is None).
My guess is that the request to duckduckgo no longer returns the same content as it used to, so searching for vqd ... does not work anymore.
I've switched to using:

from duckduckgo_search import DDGS

keyword = 'bird photos'  # the search string
foo = DDGS().images(keyword)