In the last few years, transformer-based language models like GPT-3, BERT and others have revolutionised natural language processing. Here is how to make them more efficient.
The predictive text function on your mobile phone is a type of language model. Recent advances, combined with massive amounts of compute and data, have made modified versions of these models capable across a wide range of tasks: they can perform question answering, translation and language generation at near-human levels. A major drawback, however, is that the cutting-edge approaches are too big and slow for most practical deployments. GPT-3 has 175 billion parameters and is too large to run on all but the most expensive hardware.
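To make the predictive-text analogy concrete, here is a minimal sketch of next-token prediction using the Hugging Face transformers library. GPT-2 stands in for GPT-3, since the latter's weights are not publicly downloadable; the prompt and the sample outputs are illustrative only. The back-of-the-envelope arithmetic at the end shows why a 175-billion-parameter model is hard to deploy.

```python
# A minimal sketch of next-token prediction, using GPT-2 as a freely
# downloadable stand-in for much larger models like GPT-3.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Ask the model for the most likely continuations of a prompt --
# essentially what a phone keyboard's predictive text does.
prompt = "The weather today is"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Take the logits for the final position and keep the top 3 candidates.
next_token_logits = logits[0, -1]
top = torch.topk(next_token_logits, k=3)
print([tokenizer.decode(t) for t in top.indices])  # e.g. [' a', ' not', ' very']

# Why scale is a problem: GPT-3 has 175 billion parameters. In 16-bit
# precision the weights alone need 175e9 * 2 bytes = 350 GB, far more
# than the memory of any single commodity accelerator.
print(f"GPT-3 fp16 weights: {175e9 * 2 / 1e9:.0f} GB")
```

Even this small GPT-2 model has roughly 124 million parameters; scaling the same architecture up a thousandfold is what puts state-of-the-art models out of reach for most users, and is the motivation for the efficiency techniques surveyed here.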
This paper surveys the large body of research devoted to making these models more efficient. It provides a taxonomy for navigating the literature and an overview of the most promising approaches.