Local Server GPU Benchmarks

I think what I was trying to say was that I was surprised that a 1070ti was only 4-5x slower. :sweat_smile:

I guess I had some (rather naïve) notion in my head that a 3090 with “so many cores” would be ‘exponentially’ better … so what surprised me a bit was that the relationship is almost linear (~5x perf for ~5x cores … and not ~5x the power draw, actually).

But you are absolutely right, 4-5x is quite dramatic, especially for the bigger jobs, and I think that’s where cards like the 3090 (and soon the 4090) shine, with their faster, larger VRAM and the ability to move more data off main RAM and disk storage in less time.

4 Likes

I’m getting the following error running the text classifier code from above. I have updated to the latest version of fastai locally (2.5.6), so I’m not sure what the issue is. Any thoughts on what this could be?

1 Like

It works for me when I do “from fastai.text.all import *” before creating the dataloader. A plain “import fastai” doesn’t seem to be enough.
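For anyone curious why the explicit import matters: Python doesn’t load a package’s submodules when you import only the top-level package. A minimal stdlib sketch of the same behaviour, using xml / xml.dom as stand-ins for fastai / fastai.text.all:

```python
import sys
import subprocess

# In a fresh interpreter, importing the top-level package alone does NOT
# load its submodules -- xml.dom stays unavailable:
out = subprocess.run(
    [sys.executable, "-c", "import xml; print(hasattr(xml, 'dom'))"],
    capture_output=True, text=True,
)
print(out.stdout.strip())  # → False

# An explicit submodule import is what actually makes the names available:
from xml.dom import minidom
print(hasattr(sys.modules["xml"], "dom"))  # → True
```

Same idea with fastai: “from fastai.text.all import *” loads the text submodules and puts their names in your namespace, which a bare “import fastai” never does.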

2 Likes

Thanks @mike.moloch. That worked :slight_smile:

2 Likes

Hey folks,
would you please run Jeremy’s NLP starter notebook (Kaggle) on your local servers?
The training cell takes 57 seconds on the A6000 (250 W, Linux) and 7 minutes 44 seconds on the 2060 Super (WSL2).
I’m curious about your results.

I’m trying to run it locally: I set my creds and installed kaggle, but I get an error. I noticed the notebook doesn’t import anything from the kaggle package. Do I need to import api or something?

      4 if not iskaggle and not path.exists():
----> 5     api.competition_download_cli(str(path))
      6     ZipFile(f'{path}.zip').extractall(path)

NameError: name 'api' is not defined

EDIT:

OK, so I got it to work. The code in the notebook wasn’t creating the kaggle.json file properly (it was empty), and then I had to import kaggle in the cell where the data is downloaded (where I got the error previously).
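In case it helps anyone else, here’s roughly the fix as a sketch. The username/key are placeholders you’d replace with your own Kaggle credentials, and `path` is the variable already defined in the notebook:

```python
import json
from pathlib import Path

def write_kaggle_creds(username, key, cred_path="~/.kaggle/kaggle.json"):
    """Write a Kaggle credentials file, but never overwrite an existing one."""
    cred_path = Path(cred_path).expanduser()
    if not cred_path.exists():
        cred_path.parent.mkdir(parents=True, exist_ok=True)
        cred_path.write_text(json.dumps({"username": username, "key": key}))
        cred_path.chmod(0o600)  # the kaggle client rejects world-readable creds
    return cred_path

# write_kaggle_creds("YOUR_USERNAME", "YOUR_KEY")
# import kaggle                    # importing the package sets up kaggle.api
# from kaggle import api           # ...which is the `api` the notebook cell calls
# api.competition_download_cli(str(path))   # `path` as defined in the notebook
```

The key point is that `api` only exists after the kaggle package has been imported, which is why the cell failed with `NameError` before.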

BUT

Now I’m getting
tokz.tokenize("A platypus is an ornithorhynchus anatinus.")

AttributeError: 'SentencePieceProcessor' object has no attribute 'encode'
1 Like

Check the installed version of sentencepiece; I have 0.1.96 and it works.
Other package versions I have that may be related:

datasets                              1.18.4
huggingface-hub                       0.4.0
transformers                          4.16.2
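If it helps, here’s a quick stdlib-only way to print those versions in your own environment (importlib.metadata needs Python 3.8+):

```python
from importlib.metadata import version, PackageNotFoundError

def installed_version(pkg):
    """Return the installed version of pkg, or None if it's not installed."""
    try:
        return version(pkg)
    except PackageNotFoundError:
        return None

# Print the packages mentioned above; None means not installed:
for pkg in ("sentencepiece", "datasets", "huggingface-hub", "transformers"):
    print(pkg, installed_version(pkg))
```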
2 Likes

Install the transformers package with pip and, if required, sentencepiece too :wink:

1 Like

Thanks, gents! sentencepiece was at 0.1.86; upgrading it to the latest (0.1.96?) got it to work!

2 Likes

OK, so running the training cell in Jeremy’s Kaggle NLP starter notebook I got the following results:

Time: 6 min 16 sec.
GPU card: 1070 Ti, 180 W, 8 GB
System: Dell T3600, Xeon 8c/16t, 64 GB RAM, NVMe SSD for disk I/O

1 Like

Interesting. Note how the 1070 Ti on bare metal is substantially faster than the 2060 Super in WSL2.

1 Like

Well, it is going through the container virtualization ‘layer’, as it were, but still, I would’ve expected it to be faster, since its onboard RAM is faster even though it has about 400 fewer CUDA cores than the 1070 Ti. Plus, the 1070 Ti system has DDR3 system RAM.

1 Like

I got sentencepiece 0.1.96
and also tried with
datasets 1.18.4
huggingface-hub 0.4.0
transformers 4.16.2

But I am still getting:
AttributeError: 'SentencePieceProcessor' object has no attribute 'encode'
after tokz.tokenize("…")

Any alternative suggestions?

sentencepiece 0.1.96 worked for me too. I’d do a lookup and see whether your system is picking up a legacy installed version of the lib; other than that, I’m out of ideas.
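One way to do that lookup, as a stdlib sketch: find which file the interpreter would actually load for the module.

```python
import importlib.util

def module_origin(name):
    """Return the file Python would load for `name`, or None if not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec else None

# A path outside your active environment (e.g. an old ~/.local/lib/... dir)
# means a legacy copy is shadowing the version you just upgraded:
print(module_origin("sentencepiece"))
```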

Based on the error message, you didn’t install that into the env you’re using for Jupyter. It’s still picking up an old version.
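A quick way to confirm which environment the notebook kernel is actually using (a sketch; the commented line uses IPython’s `!` shell syntax):

```python
import sys

# The interpreter the kernel is running; packages must be installed into
# THIS interpreter's environment to be importable from the notebook:
print(sys.executable)

# From a notebook cell, this installs into the kernel's env, not some other one:
# !{sys.executable} -m pip install -U sentencepiece
```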

1 Like

Right! I was using pip instead of mamba (which I used for Jupyter).
Thanks!

pip should also work if it’s installed into the right environment.