Online fine-tuning of a generative recurrent network for sequences


I’m working on a project for generating sequences (time series), and I’d like to proceed in two steps. First, I want to learn generic “primitives” of these time series from a large corpus of them. Then I would like to fine-tune or personalize the model in a reinforcement-learning style: I would sample new sequences from the corpus-based model and use a custom error function to tell the network how well each sample did.

I’m not quite sure how to combine these two stages. Is it possible to just train an LSTM-type network on the corpus, then start sampling from it and keep updating the weights based on some custom error?
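
To make the question concrete, here is a minimal sketch of the two-stage loop I have in mind. It rests on several assumptions. A small table of bigram logits stands in for the LSTM so the example runs with only the standard library. The symbols, the corpus, and the `reward` function are all hypothetical. The second stage uses a REINFORCE-style policy-gradient update, which is my guess at how a custom score on sampled sequences could drive the weights, since sampling itself is not differentiable.

```python
import math
import random

VOCAB = 3     # toy alphabet {0, 1, 2}; hypothetical stand-in for real sequence symbols
SEQ_LEN = 8   # number of symbols generated after the fixed start symbol


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(model, rng, start=0):
    """Sample autoregressively; model[prev] holds the logits for the next symbol."""
    seq = [start]
    for _ in range(SEQ_LEN):
        probs = softmax(model[seq[-1]])
        seq.append(rng.choices(range(VOCAB), weights=probs)[0])
    return seq


def make_corpus(rng, n=200):
    """Hypothetical corpus in which symbol 2 is rare (about 10% of symbols)."""
    weights = [0.45, 0.45, 0.10]
    return [[0] + rng.choices(range(VOCAB), weights=weights, k=SEQ_LEN)
            for _ in range(n)]


def pretrain(corpus, epochs=5, lr=0.05):
    """Stage 1: maximum-likelihood training by SGD on next-symbol log-likelihood."""
    model = [[0.0] * VOCAB for _ in range(VOCAB)]
    for _ in range(epochs):
        for seq in corpus:
            for prev, nxt in zip(seq, seq[1:]):
                probs = softmax(model[prev])
                for k in range(VOCAB):
                    # Gradient of log softmax: one-hot(target) - probs
                    model[prev][k] += lr * ((1.0 if k == nxt else 0.0) - probs[k])
    return model


def reward(seq):
    """Hypothetical custom score: fraction of generated symbols equal to 2."""
    return sum(1 for s in seq[1:] if s == 2) / SEQ_LEN


def finetune(model, rng, steps=3000, lr=0.1):
    """Stage 2: REINFORCE. Sample a sequence, score it, and raise the log-probability
    of its symbols in proportion to (reward - baseline). Updates model in place."""
    baseline = 0.0
    for _ in range(steps):
        seq = sample(model, rng)
        r = reward(seq)
        advantage = r - baseline
        baseline += 0.05 * (r - baseline)  # running average, reduces variance
        # Accumulate the gradient of the sequence log-probability before applying it.
        grads = [[0.0] * VOCAB for _ in range(VOCAB)]
        for prev, nxt in zip(seq, seq[1:]):
            probs = softmax(model[prev])
            for k in range(VOCAB):
                grads[prev][k] += (1.0 if k == nxt else 0.0) - probs[k]
        for i in range(VOCAB):
            for k in range(VOCAB):
                model[i][k] += lr * advantage * grads[i][k]
    return model


def average_reward(model, rng, n=500):
    return sum(reward(sample(model, rng)) for _ in range(n)) / n


rng = random.Random(0)
model = pretrain(make_corpus(rng))
print("mean reward after pretraining:", average_reward(model, rng))
finetune(model, rng)
print("mean reward after fine-tuning:", average_reward(model, rng))
```

With an actual LSTM, I imagine the table lookup would become the network's forward pass and the hand-written gradient would come from backpropagating through the log-probabilities of the sampled symbols, but I don't know whether this is the right way to go about it.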