Very little progress has been made in modelling electronic healthcare data in recent years.
Over the last decade, huge improvements have been made in two key areas of machine learning: vision and language. But the holy grail for many is for machine learning to have an equally large impact on a field like medicine. Recent advances in medical machine learning have tended to be on vision-based tasks such as radiology and other diagnostics.
However, medical applications of ML on electronic patient records data have not seen a similar leap forward.
A new paper by David Bellmay and colleagues argues that some of this is due to a lack of good benchmarks:
"Through our meta-analysis, we find that the performance of deep recurrent models is only superior to logistic regression on certain tasks."
Building useful models from healthcare data is important but hard. There are issues with data access, bias, quality and volume. Validating and applying models in the real world is also hard, because you need to be confident that what you have built works (unlike in e-commerce, where a wrong prediction costs far less).
This paper is a useful contribution, but I fear we need more than just benchmarks to drive a revolution.