- Word Embeddings
  - Why are word embeddings useful?
  - What are the inputs?
  - What are the outputs? (see the sketch after this list)
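
To make the input/output questions concrete, here is a minimal sketch, assuming PyTorch and a made-up toy vocabulary (the words, indices, and sizes are illustrative only): an embedding layer takes integer token indices as input and returns one dense vector per token as output.

```python
# Minimal word-embedding sketch (PyTorch assumed; vocabulary and sizes are illustrative).
import torch
import torch.nn as nn

# Toy vocabulary: each word maps to an integer index.
vocab = {"<pad>": 0, "the": 1, "cat": 2, "sat": 3, "mat": 4}

# Embedding layer: a learned lookup table of shape (vocab_size, embedding_dim).
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=8)

# Input: a sentence encoded as token indices, shape (seq_len,).
token_ids = torch.tensor([vocab["the"], vocab["cat"], vocab["sat"]])

# Output: one 8-dimensional dense vector per token, shape (seq_len, embedding_dim).
vectors = embedding(token_ids)
print(vectors.shape)  # torch.Size([3, 8])
```

The property these questions usually probe is that, once trained (or initialized from pretrained vectors such as word2vec or GloVe), words used in similar contexts end up with nearby vectors, which raw indices or one-hot encodings alone do not give you.
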
- Recurrent Neural Network (RNN), Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM)
  - What kinds of patterns are these models best at learning?
  - What is the most important part of the architecture for these models? (see the sketch after this list)
  - What are the limitations of word embeddings?
  - When would you not use an RNN?
  - Bonus: Why are Transformers more popular right now than RNNs, GRUs, and LSTMs?
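
As a companion to the recurrent-model questions above, here is a minimal sketch, again assuming PyTorch with purely illustrative sizes, of an LSTM reading a sequence of embedded tokens. The hidden and cell states carried from step to step (and the gates that update them) are the core of the architecture, and that step-by-step dependence is also the bottleneck Transformers sidestep by attending over the whole sequence in parallel.

```python
# Minimal LSTM-over-embeddings sketch (PyTorch assumed; all sizes are illustrative).
import torch
import torch.nn as nn

embedding = nn.Embedding(num_embeddings=1000, embedding_dim=16)
lstm = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)

# A batch of one sequence of 5 made-up token indices.
token_ids = torch.randint(0, 1000, (1, 5))

embedded = embedding(token_ids)        # (batch, seq_len, embedding_dim)
outputs, (h_n, c_n) = lstm(embedded)   # outputs: (batch, seq_len, hidden_size)

# Each step's output depends on the hidden/cell state from the previous step,
# which is why the computation is inherently sequential over the time dimension.
print(outputs.shape)  # torch.Size([1, 5, 32])
print(h_n.shape)      # torch.Size([1, 1, 32]) -> (num_layers, batch, hidden_size)
```
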