In general it’s very similar to pandas. With Dask support, GPU memory is no longer a limitation, so you can iterate over datasets of arbitrary size, which is amazing. It works like magic.
Finally, I’ve got a team working on a project, https://github.com/NVIDIA/NVTabular, that covers dataloaders for PyTorch and TensorFlow. We’ll have a new release shortly. This sprint we’ll be working on a fastai2 integration.
I’m doing research on NLP with transformer models and fastai v2. Since training time is a key concern when the training dataset spans many GB of data, I’m very interested in using RAPIDS.ai (my GPU is an NVIDIA V100 32 GB).
We’ve made a lot of progress on the dataloader, and have a new version that’s available on our repo. Our 0.3 release will make it official but everything is working now if you pull from main.
It’s split up a bit strangely, but we were trying to show all three in the same notebook. The v2 code is much simpler to integrate now: we just instantiate our own dataloaders for the train and validation sets and then use the TabularDataLoaders wrapper to turn them into a databunch.
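A minimal sketch of the wrapping pattern described above, using stand-in classes so it runs anywhere. The real code would use NVTabular's PyTorch dataloader and fastai's `TabularDataLoaders`; the class names `BatchLoader` and `LoaderBundle` here are illustrative only, not the actual API:

```python
# Stand-in for a custom dataloader (e.g. an NVTabular PyTorch loader);
# in practice this would stream GPU-resident batches, not Python lists.
class BatchLoader:
    def __init__(self, data, batch_size=2):
        self.data, self.batch_size = data, batch_size

    def __iter__(self):
        # Yield the data in fixed-size batches.
        for i in range(0, len(self.data), self.batch_size):
            yield self.data[i:i + self.batch_size]

# Stand-in for the TabularDataLoaders wrapper: it simply bundles a
# train loader and a validation loader into one object a learner can use.
class LoaderBundle:
    def __init__(self, train, valid):
        self.train, self.valid = train, valid

# Instantiate our own dataloaders for the train and validation sets,
# then wrap them together -- analogous to TabularDataLoaders(train_dl, valid_dl).
train_dl = BatchLoader(list(range(10)))
valid_dl = BatchLoader(list(range(4)))
dls = LoaderBundle(train_dl, valid_dl)
```

The point of the pattern is that the wrapper doesn't care where the batches come from, so a GPU-native loader can be dropped in without changing the training loop.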
We’ll have a blog post on this coming out in a few weeks when we release, but let us know if you have any questions or issues.