Natural Language Processing: From one-hot vectors to billion parameter models by Pascal Janetzky