I recently found on Cornell's #arXiv a new preprint (2023) on #RNN and #LSTM by Alex Sherstinsky of MIT. Through the years, I've read numerous papers on RNNs, starting with Rumelhart's 1986 paper. But this one is, by far, the most detailed tutorial not only on RNNs but also on LSTMs.
The complete derivations of both the forward (inference) and backward (training) passes of the learning algorithm use only basic calculus and matrix algebra, drawing intuitive analogies to digital signal processing #DSP. The equations are complete and detailed enough for a student to implement directly in software. In my opinion, every EE and CS undergrad studying #DeepLearning #NeuralNetworks should read this superb introduction.
https://arxiv.org/pdf/1808.03314.pdf
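
For a taste of what the paper derives, here's a minimal sketch of the canonical RNN forward pass, the recurrence h_t = tanh(Wxh x_t + Whh h_{t-1} + b). Variable names and dimensions here are my own illustrative choices, not the paper's notation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: input size, hidden size, sequence length
d_x, d_h, T = 4, 8, 5

# Randomly initialized parameters (small scale, just for demonstration)
Wxh = rng.normal(scale=0.1, size=(d_h, d_x))  # input-to-hidden weights
Whh = rng.normal(scale=0.1, size=(d_h, d_h))  # hidden-to-hidden weights
b = np.zeros(d_h)                             # hidden bias

def rnn_forward(xs, h0):
    """Apply the vanilla RNN recurrence h_t = tanh(Wxh x_t + Whh h_{t-1} + b)
    over a sequence, returning the hidden state at each step."""
    h = h0
    hs = []
    for x in xs:
        h = np.tanh(Wxh @ x + Whh @ h + b)
        hs.append(h)
    return hs

xs = rng.normal(size=(T, d_x))        # a toy input sequence
hs = rnn_forward(xs, np.zeros(d_h))   # forward (inference) pass
print(hs[-1].shape)                   # (8,) -- final hidden state
```

The backward (training) pass the paper derives is backpropagation through this same unrolled loop; the full derivation, including the LSTM gating equations, is in the paper itself.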