Multi-label prediction with Planet Amazon dataset

UKamath7 · May 5, 2020, 3:01am

Since this was a Kaggle competition problem, I wanted to work in a Kaggle notebook itself. But that caused me problems, since I can't understand the format of the dataset that is available there.

Using df = pd.read_csv(path/'train_v2.csv'), I got this error: ParserError: Error tokenizing data. C error: Calling read(nbytes) on source failed. Try engine='python'.

I think the problem is that train_v2.csv sits inside a directory that is also called train_v2.csv, and that is causing a path problem. I have seen many shared Kaggle kernels that seem to work, judging by their output, but I am getting this error and can't move forward.
So kindly help, @init_27 @jeremy @Sylvain.

JonathanSum · May 5, 2020, 8:54am

Hi, if you have a problem following the original lesson3 [Planet Amazon dataset] notebook, I may be able to help, since I did not have an issue with it. If so, please share your notebook here. Moreover, if it is the original lesson3 notebook, I suggest uploading the dataset somewhere, such as Dropbox, to allow direct download; that would help me help you or write a notebook that works for you. My guess is that the file is just at the wrong location.
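One way to test that guess, as a rough sketch: check whether the path you pass to pd.read_csv is a file or a directory, and look inside if it is a directory. Note that resolve_csv is a hypothetical helper written for this post, not part of pandas or fastai, and the layout it assumes (a folder named train_v2.csv holding the real CSV) is only the guess described above.

```python
from pathlib import Path


def resolve_csv(p):
    """Return the path of an actual CSV file at or under `p`.

    Hypothetical helper: if `p` is a directory (for example a folder that is
    itself named train_v2.csv), prefer a file with the same name inside it,
    otherwise fall back to the first *.csv file found below it.
    """
    p = Path(p)
    if p.is_file():
        return p
    if p.is_dir():
        inner = p / p.name
        if inner.is_file():
            return inner
        # Search recursively, skipping directories that merely end in .csv
        candidates = sorted(f for f in p.rglob("*.csv") if f.is_file())
        if candidates:
            return candidates[0]
    raise FileNotFoundError(f"No CSV file found at or under {p}")
```

If this returns a path, you could then try df = pd.read_csv(resolve_csv(path/'train_v2.csv')). If it raises FileNotFoundError, the dataset is probably mounted somewhere else, and listing the input folder would be the next thing to look at.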

UKamath7 · May 5, 2020, 11:38am

Did you try the Kaggle notebook shared by fastai?
Since it's a Kaggle problem, I want to work in a Kaggle notebook itself, but it's not working on Kaggle.
I tried this:
https://www.kaggle.com/hortonhearsafoo/fast-ai-v3-lesson-3-planet

It didn't work either. Can you help fix that notebook?

init_27 · May 5, 2020, 1:20pm

Hey @UKamath7. Please be careful to follow the guidelines when posting errors or asking for help. Read here: How to debug your code and ask for help with fastai v2

I think it might be due to a version conflict on Kaggle, as the Docker image isn’t always at the latest version. Can you try updating the library in the kernel?
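To see which versions the kernel actually has, before and after updating, something like the sketch below may help. The helper name installed_version is made up for this post, and the package names listed are just the ones that seem relevant here.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Print what the current environment reports for each package
for name in ("fastai", "pandas"):
    print(name, installed_version(name))
```

If the reported fastai version is older than the one the notebook was written for, upgrading it with pip in a notebook cell and restarting the kernel would be the usual next step.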

UKamath7 · May 5, 2020, 2:36pm

Sorry, this was my first time asking a question, so from next time I’ll be careful. And yes, I’ll update the library and check. Thanks for the help.
What should I do with this thread? Should I remove it?

JonathanSum · May 5, 2020, 6:59pm

https://colab.research.google.com/drive/1oqMk9ACdiynRYO3UY-30scojtlTNGemz?usp=sharing

Here is my working copy of the lesson3 "Multi-label prediction with Planet Amazon dataset" notebook on Google Colab. Just a reminder: I did not use the Kaggle API. I simply downloaded the dataset and uploaded it to Google Drive.

It is based on the notebook from the course GitHub repository: https://github.com/fastai/course-v3/blob/master/nbs/dl1/lesson3-planet.ipynb

UKamath7 · May 6, 2020, 12:31am

I tried this in Colab after downloading the data, and it works fine. Thanks!

JonathanSum · May 6, 2020, 9:04am

I am glad I could help. If you have any questions, please feel free to ask. I also think init_27 is right; maybe his suggestion can solve the Kaggle environment issue.