Another architecture that is very powerful, especially for “sequence-to-sequence” problems (that is, problems where the dependent variable is itself a variable-length sequence, such as language translation), is the Transformer architecture. You can find it in a bonus chapter on the book’s website.
There is a short passage in chapter 12 of the book that sounds promising. Maybe we can hope for Transformers to be included in later parts of the course.
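In the meantime, the sequence-to-sequence setup the quoted passage describes can be tried directly with PyTorch's built-in `nn.Transformer` module. This is just a minimal sketch with made-up hyperparameters and random tensors (not code from the book): the point is only that the source and target sequences can have different lengths, and the output follows the target length.

```python
import torch
import torch.nn as nn

# Tiny Transformer; hyperparameters here are arbitrary, chosen for speed.
model = nn.Transformer(d_model=32, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2,
                       dim_feedforward=64)

# Default layout is (sequence_length, batch, d_model).
src = torch.rand(10, 1, 32)  # source sequence of length 10
tgt = torch.rand(7, 1, 32)   # target sequence of length 7 (can differ)

out = model(src, tgt)
print(out.shape)  # output has the target's sequence length
```

For real translation you would of course add token embeddings, positional encodings, masking, and a vocabulary projection on top; the bonus chapter (and the Hugging Face `transformers` library) cover that in full.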