I would like to ask for a volunteer to share course-related files and datasets, such as those hosted at files.fast.ai, over BitTorrent, especially the ones larger than a couple of hundred megabytes. This would be a great help to students who have slow, unstable, or censored internet.
This course is taken by many students from all over the world. In some countries and places, the internet connection can be very slow, unstable, or even censored. I have personally tried several times to download datasets from Kaggle, only for the connection to be interrupted so that I had to start over. Some dataset links are blocked by the government (I live in China). I don’t think China is an exception; there are many other places where internet quality is very poor.
I think this would help not only students with bad internet but also those with good internet: by downloading a single torrent you get all the data and don’t need to fetch the files by hand one by one.
You should try out the Kaggle kernels for fast.ai. With them it is possible to run the notebooks without having to download the data locally. That said, I understand that it is better to have the data locally.
This is the same answer that I got when I brought this up for some Kaggle competitions. I understand that it is possible to use remote servers to run code and get access to the data, but that does not solve the problem of getting the data locally.
I’m also not sure that fastai v1 will work on Kaggle kernels - it would at least require pytorch v1 being installed there, which would require additional steps.
Hosting bittorrent files sounds like a great idea. Hopefully some students are able to assist there once the course starts. Perhaps the easiest way would be to upload here:
Agreed, torrents are a good solution even if we just want to speed up download times. Downloading a huge dataset or a large set of weights can take quite a lot of time.
I can start seeding while I’m back in Moldova with fast internet, before going back to China, where I won’t be able to create a torrent due to bad internet. The only problem is that I’m not sure which pretrained models and datasets will be used in the course.
@jeremy could you let us know which data and models we are planning to use? Is it a good idea to seed everything from files.fast.ai?
If it is possible to upload them before October 22, then I can either start seeding too or create the torrents myself. The internet in Moldova is fast, at 100 Mbps. After that date I go back to China, and the internet there is unpredictable.
Just to be sure: will files.fast.ai keep being updated throughout the course, or will it stay as it is now?
If it won’t change throughout the course, maybe I will try to create a torrent. I have never created one, though, but I will try.
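For anyone who has also never created a torrent, here is a rough sketch of what a single-file .torrent contains, written with only the Python standard library. It is meant to illustrate the format, not to replace the "create torrent" option in a regular client such as Transmission or qBittorrent, which is probably the easier route. The function names `bencode` and `make_torrent`, the 256 KiB piece size, and the tracker URL in the usage note are all assumptions made up for this example.

```python
import hashlib
import os


def bencode(value):
    """Encode ints, strings/bytes, lists and dicts in bencoding."""
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Keys are byte strings and must appear in sorted order.
        items = [
            (key.encode("utf-8") if isinstance(key, str) else key, val)
            for key, val in value.items()
        ]
        items.sort(key=lambda pair: pair[0])
        body = b"".join(bencode(key) + bencode(val) for key, val in items)
        return b"d" + body + b"e"
    raise TypeError("cannot bencode value of type " + type(value).__name__)


def make_torrent(path, tracker_url, piece_length=256 * 1024):
    """Return the bytes of a single-file .torrent for the file at `path`.

    `tracker_url` and the default `piece_length` are illustrative choices.
    """
    piece_hashes = []
    with open(path, "rb") as source:
        while True:
            chunk = source.read(piece_length)
            if not chunk:
                break
            # Each piece is identified by its 20-byte SHA-1 digest.
            piece_hashes.append(hashlib.sha1(chunk).digest())

    info = {
        "name": os.path.basename(path),
        "length": os.path.getsize(path),
        "piece length": piece_length,
        "pieces": b"".join(piece_hashes),
    }
    return bencode({"announce": tracker_url, "info": info})
```

To try it, something like `open("data.torrent", "wb").write(make_torrent("data.tgz", "http://tracker.example/announce"))` should produce a file that a client can open, where both the file name and the tracker URL are placeholders. The per-piece hashes are also what let a client resume after a dropped connection and re-download only the pieces that are missing or corrupt.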