Slow predictions on Windows 10

I trained an ULMFiT model on a Google Colab GPU. When I test predictions on Colab with the GPU, I get about 1000 short text documents/second. On Colab with the CPU, I can predict about 140 documents/second. However, on Windows 10 on CPU, with the same model and the same code, I can predict only 18 documents/second, so the Colab CPU is roughly 8x faster.
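For reference, here is roughly how I measure throughput (a minimal sketch; learn is my loaded Learner and docs is a list of short strings, both placeholders for my actual objects):

    import time

    # Hypothetical timing loop: learn is a fastai Learner loaded from the
    # exported ULMFiT model, docs is a list of short text documents.
    start = time.perf_counter()
    for doc in docs:
        learn.predict(doc)
    elapsed = time.perf_counter() - start
    print(f'{len(docs) / elapsed:.0f} documents/second')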

On Windows, I get this ominous warning:

 C:\Temp\hug\lib\site-packages\fastai\torch_core.py:83: UserWarning: Tensor is int32: upgrading to int64; for better performance use int64 input
   warn('Tensor is int32: upgrading to int64; for better performance use int64 input')
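For what it's worth, the int32 tensors probably come from numpy: on Windows, numpy's default integer type is int32 (a C long is 32-bit there), which would explain why the warning only shows up on Windows. I'm not sure the cast can be applied from user code, since the conversion happens inside fastai, but the general pattern would be to upcast before the ids reach torch (a sketch with made-up token ids):

    import numpy as np
    import torch

    # An array of Python ints defaults to int32 on Windows, int64 on Linux.
    ids = np.array([2, 45, 678, 3])
    # Casting to int64 up front avoids the warn-and-copy path.
    x = torch.from_numpy(ids.astype(np.int64))
    print(x.dtype)  # torch.int64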

On both Colab and Windows, I use torch version 1.4.0 and fastai 1.0.60.

Any advice on how to get Windows closer to Colab's CPU speed? Running on Windows would make deploying my model easier, because my IT department prefers Microsoft.

A lot of that will depend on the hardware. What CPU are you using? For example, Colab has:

Intel® Xeon® CPU @ 2.20GHz

You can check this via:
!cat /proc/cpuinfo
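On Windows, where /proc/cpuinfo does not exist, a quick equivalent from Python is (standard library only, nothing fastai-specific):

    import os
    import platform

    print(platform.processor())  # CPU identification string
    print(os.cpu_count())        # number of logical cores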

@muellerzr

Google Colab CPU: Intel® Xeon® CPU @ 2.30GHz
Windows 10 CPU: Intel® Core™ i7-8650U CPU @ 1.90GHz, 2112 MHz

According to PassMark, the benchmark scores are:
Google Colab: 8909, or 1403 per thread
Windows: 8679, or 1239 per thread

On Windows, when running predict() on single documents, the CPUs are 100% busy. When calling get_preds() or pred_batch(), the CPUs are only about 25% busy, even though Python is launching multiple processes.
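One knob that might explain the 25% utilization is PyTorch's intra-op thread count; torch.set_num_threads is standard PyTorch API, though I have not confirmed it fixes this case (the 8 below is just the i7-8650U's logical core count):

    import torch

    print(torch.get_num_threads())  # threads PyTorch currently uses for intra-op work
    torch.set_num_threads(8)        # pin to the logical core count (8 on my i7-8650U)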

I also get the 'Tensor is int32: upgrading to int64; for better performance use int64 input' warning when calling predict() on an ULMFiT model. Are there any fixes or workarounds for this?