Joint ELLIIT Distinguished Lecture: The Prevalence of Errors in Machine Learning Experiments
Title: The Prevalence of Errors in Machine Learning Experiments
Speaker: Martin Shepperd, Gothenburg University/Brunel University London
When: Monday, May 16, 13:15–14:00
Location: Matteannexet room MA:4, Sölvegatan 20, Lund
Computational experiments are the dominant paradigm for understanding and comparing machine learning algorithms. Typically, multiple learning algorithms (the treatments) are compared over multiple datasets, each providing training and validation subsets, using various predictive performance metrics (the response variables). Such experimental designs are referred to as repeated-measures designs. In this way we build knowledge by making sense of many results. But how can we be sure that our experimental results are reliable? I address this question by examining the domain of software defect prediction. A re-analysis of experiments found that ~40% contained inconsistent results and/or basic statistical errors. Elsewhere I show that inappropriate response metrics can change not only the magnitude of results but also the direction of effects in ~25% of cases.
We all make errors, and there can be considerable complexity in our computational experiments, so I recommend (i) use open science to expose studies to scrutiny, (ii) try to avoid dichotomous inferencing methods and (iii) use meta-analysis with caution!
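As a small illustration of how the choice of response metric can reverse the direction of an effect, the sketch below compares two classifiers using F1 and the Matthews correlation coefficient (MCC). It is not taken from the talk: the confusion-matrix counts, the classifier names A and B, and the choice of these two metrics are all assumptions made purely for illustration.

```python
from math import sqrt


def f1(tp, fp, fn, tn):
    """F1 score: harmonic mean of precision and recall (ignores true negatives)."""
    return 2 * tp / (2 * tp + fp + fn)


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient: uses all four confusion-matrix cells."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is taken as 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical confusion matrices on the same 100-instance test set
# (80 positives, 20 negatives). These numbers are invented for illustration.
A = dict(tp=78, fp=18, fn=2, tn=2)    # labels almost everything positive
B = dict(tp=64, fp=4, fn=16, tn=16)   # more balanced errors

for name, cm in (("A", A), ("B", B)):
    print(f"{name}: F1={f1(**cm):.3f}  MCC={mcc(**cm):.3f}")
```

With these made-up counts, F1 ranks A above B while MCC ranks B above A, so the conclusion about which treatment is better depends on the response variable chosen.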
Martin Shepperd is the 2022 Swedish Tage Erlander research professor funded by the Swedish Research Council – the first Computer Science holder of this professorship since its inception in 1982. This year the professorship is placed at Gothenburg University, and he also holds the chair of Software Modelling & Technology at Brunel University London. He has a BSc in Economics, and an MSc and PhD in Computer Science. He worked as a software developer for HSBC before returning to academia. He has published 3 books and more than 180 refereed research articles in the areas of software engineering and machine learning.
He is a fellow of the British Computer Society.