27.09.20

Making language models more efficient

In the last few years, transformer-based language models like GPT-3, BERT and others have revolutionised natural language processing. Here is how to make them more efficient.

The predictive text function on your mobile phone is a type of language model. Recent advances, together with massive amounts of compute and data, have made modified versions of these models capable of a wide range of tasks: they can perform question answering, translation and language generation at near-human levels. The big drawback is that the cutting-edge models are too large and slow to use in practice. GPT-3 has 175 billion parameters and is too large to run on all but the most expensive hardware.
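To put that size in perspective, a quick back-of-the-envelope calculation helps: just storing 175 billion weights runs to hundreds of gigabytes. A minimal sketch (the bytes-per-parameter figures are the standard storage costs of each numeric format, not something stated in the post):

```python
# Rough memory footprint of a model's weights at different precisions.
PARAMS = 175e9  # GPT-3 parameter count

BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}

for dtype, nbytes in BYTES_PER_PARAM.items():
    gigabytes = PARAMS * nbytes / 1e9
    print(f"{dtype}: {gigabytes:,.0f} GB just for the weights")

# float32: 700 GB, float16: 350 GB, int8: 175 GB -- compare that with
# the tens of gigabytes of memory on a single high-end accelerator.
```

And that is before counting activations, optimiser state or the memory needed to actually serve requests.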

This paper surveys the huge amount of research that has gone into making these models more efficient. It provides a taxonomy for navigating the literature and an overview of the most promising approaches.
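By way of illustration, one family of techniques such surveys commonly cover is quantisation: storing weights at lower precision to shrink the model. The sketch below is a naive symmetric post-training quantisation in NumPy, shown purely as an example of the idea, not as the specific method of the paper:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Naive symmetric post-training quantisation of a weight tensor to int8."""
    scale = np.abs(weights).max() / 127.0  # map the largest weight to +/-127
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

w = np.random.randn(512, 512).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# 4x smaller storage, at the cost of a small reconstruction error.
print(q.nbytes / w.nbytes)                        # 0.25
print(np.abs(w - w_hat).max() / np.abs(w).max())  # small relative error
```

Real quantisation schemes are more careful (per-channel scales, calibration data, quantisation-aware training), but the trade-off is the same: less memory and faster arithmetic in exchange for a small loss in precision.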

