Online training of language models

I want to build a language model that predicts the next word while a user is typing text on their phone. I can pretrain such a language model on large English corpora like Wikipedia, but once I deploy it for inference on the user’s mobile phone, I want it to keep improving its predictions (personalizing them) from the text the user actually types.
I know how to do that quite naturally with n-gram approaches, where you can just keep updating the n-gram statistics, but how can you do that with RNN-based language models?
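
For reference, this is roughly what I mean by constantly updating n-gram statistics. It is only a minimal sketch with bigrams and whitespace tokenization; the class name `OnlineBigramModel` and its methods are made up for illustration.

```python
from collections import Counter, defaultdict


class OnlineBigramModel:
    """Bigram counts that are updated incrementally as new text arrives."""

    def __init__(self):
        # counts[prev][nxt] = number of times `nxt` followed `prev`
        self.counts = defaultdict(Counter)

    def update(self, text):
        """Add the bigrams of one typed message to the running statistics."""
        tokens = text.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            self.counts[prev][nxt] += 1

    def predict(self, prev, k=3):
        """Return up to k of the most frequent continuations of `prev`."""
        followers = self.counts.get(prev.lower())
        if not followers:
            return []
        return [word for word, _ in followers.most_common(k)]


model = OnlineBigramModel()
model.update("see you at the gym")
model.update("meet me at the gym tomorrow")
model.update("at the office now")
print(model.predict("the"))  # ['gym', 'office']
```

Every new message is a cheap count update, so the model personalizes immediately. That is the behaviour I would like to get from an RNN.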

You should collect data for some amount of time and then fine-tune the model on it. Just store everything that the user enters.
I don’t think updating the model every time the user types is a good choice, as it would demand a lot of computation over long periods of time.
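
A minimal sketch of that collect-then-fine-tune idea, assuming a token-count threshold decides when there is enough data. `TypingBuffer`, `maybe_fine_tune`, `min_tokens` and the `fine_tune_fn` callback are all hypothetical names; the callback stands in for whatever training routine you actually use.

```python
class TypingBuffer:
    """Stores typed messages and signals when enough data has accumulated."""

    def __init__(self, min_tokens=1000):
        self.min_tokens = min_tokens  # hypothetical threshold, tune per device
        self.messages = []
        self.token_count = 0

    def add(self, text):
        self.messages.append(text)
        self.token_count += len(text.split())

    def ready(self):
        return self.token_count >= self.min_tokens

    def drain(self):
        """Return the collected messages and reset the buffer."""
        batch, self.messages, self.token_count = self.messages, [], 0
        return batch


def maybe_fine_tune(buffer, fine_tune_fn):
    """Call fine_tune_fn on the buffered text only once the threshold is reached."""
    if not buffer.ready():
        return False
    fine_tune_fn(buffer.drain())
    return True


# Usage with a stub in place of a real fine-tuning routine.
seen_batches = []
buffer = TypingBuffer(min_tokens=5)
buffer.add("see you soon")
ran_early = maybe_fine_tune(buffer, seen_batches.append)  # False, only 3 tokens
buffer.add("at the gym")
ran_later = maybe_fine_tune(buffer, seen_batches.append)  # True, 6 tokens
```

This way the expensive step runs occasionally, for example when the phone is idle and charging, instead of on every keystroke.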

But let’s say you have already installed it on the user’s mobile device. How would you retrain the network?

Wouldn’t fine-tuning a language model on a mobile device create problems because of the compute power it requires? Jeremy also discussed this in part 1, where he encourages people to avoid placing models on the mobile device and to use the cloud instead. This is how Gmail works: when you type something, the new suggestions are computed not on your phone but in the cloud.

But making predictions on the server is not always the best solution.
For keyboard typing prediction on Android, you do not want to contact your server every second to get the next prediction, and you want your system to work in offline mode too.
Possibly some hybrid solution can work, like having the prediction NN installed on the device for inference, then collecting user data and periodically sending it to the server for retraining.
But I think the best solution in this case should still be able to learn on the fly.
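
To make "learning on the fly" concrete, here is a minimal sketch of one online SGD step per observed word pair. Note that this is not an RNN: as a stand-in it uses a plain table of logits with a softmax on top, so it runs with only the standard library. The same pattern (forward pass, cross-entropy loss, one small gradient step) is what an RNN would need. The class name `OnlineSoftmaxBigram`, the toy vocabulary and the learning rate are assumptions for illustration.

```python
import math


class OnlineSoftmaxBigram:
    """Softmax next-word model trained with one SGD step per example."""

    def __init__(self, vocab, lr=0.5):
        self.vocab = list(vocab)
        self.index = {w: i for i, w in enumerate(self.vocab)}
        self.lr = lr
        n = len(self.vocab)
        # logits[i][j]: score of word j following word i (all zeros = uniform)
        self.logits = [[0.0] * n for _ in range(n)]

    def probs(self, prev):
        row = self.logits[self.index[prev]]
        m = max(row)  # subtract the max for numerical stability
        exps = [math.exp(x - m) for x in row]
        total = sum(exps)
        return [e / total for e in exps]

    def step(self, prev, nxt):
        """One SGD step on the cross-entropy loss for a single (prev, nxt) pair."""
        p = self.probs(prev)
        target = self.index[nxt]
        loss = -math.log(p[target])
        row = self.logits[self.index[prev]]
        for j in range(len(row)):
            # Gradient of cross-entropy w.r.t. logit j is p_j - 1[j == target].
            grad = p[j] - (1.0 if j == target else 0.0)
            row[j] -= self.lr * grad
        return loss

    def predict(self, prev):
        p = self.probs(prev)
        return self.vocab[max(range(len(p)), key=p.__getitem__)]


model = OnlineSoftmaxBigram(["at", "the", "gym", "office"])
first_loss = model.step("the", "gym")
second_loss = model.step("the", "gym")  # lower than first_loss
print(model.predict("the"))  # gym
```

With a real RNN each such step is much more expensive and a high learning rate can make the model forget what it learned in pretraining, which is part of why the trade-off against periodic retraining is not obvious.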

There are different online training platforms for language learning. Which one to pick depends on what is available near you and affordable. You should check each service’s features and get a trial before paying for any subscription.

Today, I am part of Nvidia, so there is no problem finding a place to train the models :)