This meeting featured two very interesting presentations by Emily Whitney and Andrew Grose and was the second of our monthly meetings to be held by Zoom.
When one is unfamiliar with the venue there is always the possibility of getting lost on the way to a meeting, and your correspondent managed to achieve this, finding himself in another Zoom meeting room – along with three or four others who had also followed the wrong link. A virtual search party retrieved us in time, and your correspondent was able to guide the lost back to the meeting, which brought the numbers to more than 25.
Emily graduated in mid-2019 with First Class Honours in Mathematical Sciences from Curtin University. Her dissertation, supervised by Aloke Phatak, focussed on regularisation penalties for categorical predictors, with application to predicting stillbirth. With the support of the WA Branch of the Statistical Society through its Honours Scholarship, she was able to present her work at the International Workshop on Statistical Modelling in July 2019. Emily recently began work as a consultant data scientist at EY and now works in the health analytics space.
Emily spoke about the problem from her dissertation, comparing the use of the LASSO, a group-wise LASSO, and a structured fusion penalty in logistic regression. She related this to the prediction of stillbirth, using a large data set capturing information on all singleton births in WA in the years 1980 through 2015. All three penalties are L1 in style, and hence encourage sparsity. The group LASSO also encourages similarity between coefficient estimates that ought to go together. The structured fusion penalty in addition encourages satisfaction of desired constraints (such as monotonicity for the coefficients of an ordinal factor, or equality of the coefficients of a factor), so that differences which are not really of interest are discouraged unless the data force you to see them. The take-home message was that structured fusion penalties offer a tool to make categorical predictors more interpretable.
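For readers who would like to see the shapes of these penalties, the textbook forms run roughly as follows; Emily's structured fusion penalty will differ in its details, so this is only a rough sketch. Writing \(\beta_g\) for the vector of dummy coefficients of factor \(g\) with \(d_g\) levels (ordered where that makes sense):

\[
\text{lasso: } \lambda \sum_{j} \lvert \beta_j \rvert, \qquad
\text{group lasso: } \lambda \sum_{g} \sqrt{d_g}\, \lVert \beta_g \rVert_2, \qquad
\text{fusion: } \lambda \sum_{g} \sum_{k \ge 2} \lvert \beta_{g,k} - \beta_{g,k-1} \rvert .
\]

The fusion term shrinks adjacent levels of a factor toward a common coefficient, fusing levels outright unless the data insist on a difference, which is what makes the fitted factor easier to read.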
Andrew Grose graduated at the end of 2019 from Murdoch University with First Class Honours and now works for SAGI-West. He spoke about a comparison of robust methods for identifying outliers, ongoing research with Brenton Clarke, who supervised his honours thesis, that he has continued since completing his degree. In his talk, Andrew examined in detail the differences in performance among a variety of strategies for outlier identification, including the multivariate ATLA (Adaptive Trimmed Likelihood Algorithm), which was a strong performer, the FSM (Forward Search Method), BACON, and others. He also discussed concepts such as the swamping and masking of outliers.
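For readers new to the jargon: masking occurs when a cluster of outliers inflates the classical mean and standard deviation enough to hide its own members, while swamping is the converse error of flagging genuine observations. A toy Python sketch of masking, purely illustrative and not Andrew's code or any of the methods above:

    import numpy as np

    rng = np.random.default_rng(0)
    clean = rng.normal(0.0, 1.0, size=20)          # genuine observations
    x = np.concatenate([clean, np.full(5, 12.0)])  # a cluster of five outliers

    # Classical rule: flag points more than 3 sample SDs from the mean.
    # The outlier cluster drags the mean up and inflates the SD, masking itself.
    z = np.abs(x - x.mean()) / x.std(ddof=1)
    print("classical 3-sigma rule flags:", int(np.sum(z > 3)))   # typically 0

    # Robust rule: the median and MAD are barely moved by the outliers,
    # so the same cut-off now exposes all five.
    med = np.median(x)
    mad = 1.4826 * np.median(np.abs(x - med))  # scaled to match the SD for normal data
    rz = np.abs(x - med) / mad
    print("median/MAD rule flags:", int(np.sum(rz > 3)))         # typically 5

The methods Andrew compared are, of course, considerably more sophisticated than a median/MAD cut-off, but the failure mode they guard against is the one on display here.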
Naturally, Tukey and Huber got a mention and, more surprisingly, so did Bradman, although in retrospect he would have to be a clear illustration of the fact that an outlier does not necessarily indicate a problem with the data item (unless there was a conspiracy of scorers in Test cricket matches).
Both talks were followed by questions and applause (sometimes the sound of a muted hand clapping) but not, in view of the circumstances, by dinner out with the speakers. This omission will be rectified at a later, face-to-face meeting.
Alun Pope