Middle layers not learning anything. What should I do?

I am training a QA model, but the results are unsatisfactory: the evaluation metrics are subpar. To investigate, I plotted the per-layer gradients and found this. I used the function from this thread.

From the figure above, it seems as if the middle layers are not learning anything.
What should I try in order to fix this?
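
For reference, the check described above (looking at per-layer gradients to see which layers receive almost no signal) can be sketched roughly as follows. This is only a minimal sketch, not the function from the thread: the helper name `flag_low_gradient_layers`, the `rel_threshold` cutoff, the layer names, and the numbers are all hypothetical and chosen purely for illustration.

```python
def flag_low_gradient_layers(mean_abs_grads, rel_threshold=0.01):
    """Return the names of layers whose mean |gradient| is tiny relative to the largest one.

    mean_abs_grads: dict mapping layer name -> mean absolute gradient (float),
                    ideally in model order.
    rel_threshold:  arbitrary illustrative cutoff, as a fraction of the largest value.
    """
    if not mean_abs_grads:
        return []
    largest = max(mean_abs_grads.values())
    if largest == 0.0:
        # Every layer has a zero gradient, so flag all of them.
        return list(mean_abs_grads)
    cutoff = rel_threshold * largest
    return [name for name, grad in mean_abs_grads.items() if grad < cutoff]


# Hypothetical values, made up to show the call; they are NOT from the model in this post.
example = {
    "embedding": 2.0e-3,
    "encoder.layer.0": 1.5e-3,
    "encoder.layer.5": 4.0e-7,
    "encoder.layer.6": 3.0e-7,
    "encoder.layer.11": 1.0e-3,
    "qa_head": 5.0e-3,
}
print(flag_low_gradient_layers(example))
```

Assuming the model is a PyTorch module (the post does not say which framework is used), such a dictionary could be built after `loss.backward()` with something like `{n: p.grad.abs().mean().item() for n, p in model.named_parameters() if p.grad is not None}`. Checking it at several training steps, rather than just one, would show whether the middle layers stay flat throughout training.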