2021 saw some significant advances in ML and NLP research.
In a recent post, Sebastian Ruder summarised some of the main themes. These include:
Universal models. Big pre-trained self-supervised models in vision, speech, and language have had huge success. These models are trained to predict hidden parts of their inputs. Imagine covering over part of an image and asking somebody to guess what was covered, or removing the last word of a sentence and trying to guess what that word was. Models trained on this simple task with huge amounts of data turn out to be surprisingly good at a wide range of other tasks.
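The "hide part of the input and predict it" idea can be sketched in a few lines. This is a toy illustration of how self-supervised training examples are constructed (the function name and mask rate are illustrative, not from any particular library):

```python
import random

def mask_tokens(tokens, mask_rate=0.15, mask_token="[MASK]", seed=0):
    """Hide a random subset of tokens, returning the corrupted input
    and the (position -> original token) targets the model must predict."""
    rng = random.Random(seed)
    masked = list(tokens)
    targets = {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            masked[i] = mask_token
            targets[i] = tok
    return masked, targets

tokens = "the cat sat on the mat".split()
masked, targets = mask_tokens(tokens, mask_rate=0.3)
# The model sees `masked` and is trained to recover `targets` --
# no human labels needed, just raw text.
```

The key point is that the "labels" come for free from the data itself, which is why these models can be trained on such large corpora.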
Architectures. Most of the universal models use the transformer architecture, but there have been many other innovations in this area, for example the Perceiver and architectures based on multilayer perceptrons.
Prompts as an alternative to task-specific training. Instead of training a classifier to decide whether a film review is positive, fine-tune a universal model to complete the sentence "I thought this film was ___". This can dramatically reduce the amount of labelled data needed. See this paper for a good summary.
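A minimal sketch of the prompting idea, assuming a hypothetical `complete` callable standing in for a pre-trained language model (the prompt wording and label words are illustrative):

```python
def build_prompt(review):
    """Turn a raw review into a cloze-style prompt for a language model."""
    return f"{review} I thought this film was"

# Map the model's fill-in word to a task label.
LABEL_WORDS = {"great": "positive", "good": "positive",
               "terrible": "negative", "bad": "negative"}

def classify(review, complete):
    """`complete` is any callable that returns the model's next-word guess;
    it stands in for a pre-trained language model (an assumption here,
    not a specific library API)."""
    word = complete(build_prompt(review)).strip().lower()
    return LABEL_WORDS.get(word, "unknown")

# Usage with a stub "model" that always answers "great":
print(classify("Loved every minute.", lambda prompt: "great"))  # positive
```

Because the task is phrased in the same form as the model's pre-training objective (predicting a word), the model can often do well with few or even zero labelled examples.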
Keeping up with all the progress made in 2021 could take the rest of 2022!