Simple RNN activations: one for the input and another for the hidden state?

I’ve never seen a non-linear activation applied to the input projection before combining it with the hidden-state term.

I would have expected it to be something like

`tanh(x_input * x_weights + hidden_input * hidden_weights)`

however, it looks like

`tanh(relu(x_input * x_weights) + hidden_input * hidden_weights)`

Is this because the input is an embedding, or is it just a slight tweak that may or may not work better?
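For reference, here is a minimal NumPy sketch of the two update rules being compared. The dimensions and random weights are hypothetical, purely for illustration; the first line is the standard Elman-style update, and the second is the variant with a ReLU on the input projection:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration
input_dim, hidden_dim = 4, 3

x_weights = rng.normal(size=(input_dim, hidden_dim))
hidden_weights = rng.normal(size=(hidden_dim, hidden_dim))

x_input = rng.normal(size=(input_dim,))
hidden_input = np.zeros(hidden_dim)  # initial hidden state

def relu(z):
    return np.maximum(z, 0.0)

# Standard update: one tanh over the combined pre-activation
h_standard = np.tanh(x_input @ x_weights + hidden_input @ hidden_weights)

# Variant in question: ReLU on the input projection first
h_variant = np.tanh(relu(x_input @ x_weights) + hidden_input @ hidden_weights)

print(h_standard)
print(h_variant)
```

With a zero initial hidden state, the two only differ where the input projection is negative (there the ReLU zeroes it out), which makes the difference between the formulations easy to inspect.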