Machine Learning: From damned lies to statistics, where does machine learning lie within the field of data science?
The monthly seminar of the SSA in SA was presented by Dr Oscar Perez-Concha, Lecturer at the Centre for Big Data Research in Health (CBDRH), University of New South Wales, Sydney. Oscar is also the founder of the CBDRH Machine Learning Club, a special interest group which meets weekly to discuss ideas about machine learning and its application to health data science. The CBDRH Machine Learning Club is an open forum, and anyone interested in joining online or in person can contact Oscar directly at o.perezconcha@unsw.edu.au.
I have been part of the Machine Learning Club special interest group for about two years now, and I was very much looking forward to Oscar’s presentation. Over 40 people connected from all over Australia using Zoom. People in many fields are at least curious about machine learning (ML) and would like to learn more, and the talk, titled ‘Machine Learning: From damned lies to statistics, where does machine learning lie within the field of data science?’, was the perfect opportunity to get started.
I thoroughly enjoyed how, before delving into the talk, Oscar gave a quick overview of his pathway from engineer to ML expert, with examples of ML applications drawn from his past and current projects.
His presentation started by recounting his own journey of marrying ML and statistics, and moved on to discuss the similarities and differences between the two in terminology and aims. Critical events for ML, like Gauss’s derivation of the normal distribution, Turing cracking the wartime Enigma code and the release of R, were highlighted along the timeline of statistics. What followed was an introduction to the founders of Artificial Intelligence (AI) and the timeline of AI. Frankly, as a female researcher I always get some satisfaction from hearing about Ada Lovelace. So much so that I almost named my daughter after her.
Oscar gave a remarkable overview of ML theory, from random forests to support vector machines, concluding with deep neural networks. In doing so he pointed out the dichotomies between statistics and ML, never in a way that kept them separate but always building bridges between the disciplines by explaining the different nuances in their language.
The discussion was lively. I particularly enjoyed hearing more about using machine learning in a causal inference scenario and in the area of longitudinal data. Check out the video to hear what he says.
Oscar spoke very frankly about the limitations of ML, which is not a magic box that fixes all data problems. He clearly stated the stage ML has currently reached (Association) and where it hasn’t gone yet (the Intervention or Counterfactual world).
Just after the meeting I was on the phone with Oscar to congratulate him on the brilliant presentation and cheer him on in his efforts. Pretty soon we were talking about new ideas for presentations and workshops, and even collaborating in person once the COVID-19 restrictions ease a little more. Something we both look forward to.
By Barbara Toson