Presentation of Di Cook Award projects
The Di Cook Award, established in the name of Professor Dianne Cook of Monash University, one of the world's leading authorities on data visualization, is an open-source statistical software award for students (or recent graduates) of Victorian and Tasmanian institutions. The inaugural Di Cook Award, for 2021, was announced in March at an event hosted by the Victorian Branch of the Statistical Society of Australia (SSA Vic). The winner of the Award and two students who received honorable mentions will present their award-winning projects at the upcoming May event:
Date: May 17, 2022 (Tuesday)
Time: 5:30 - 6:30 pm
Venue: Evan Williams Theatre in the Peter Hall Building of the School of Mathematics and Statistics, University of Melbourne
Format: In person, with Zoom live-streaming
In-person attendees have a chance to win a Great Australian Statisticians T-Shirt!
AWARD WINNER: Weihao (Patrick) Li
Bio: Patrick is a second-year PhD student in the Department of Econometrics and Business Statistics, Monash University. He is currently working on automated visual inference using computer vision models with Prof. Di Cook and Dr. Emi Tanaka. His research interests are machine learning, computer vision, data visualisation and statistical software development.
Abstract: Spotoroo is an open-source R package that offers a spatiotemporal clustering algorithm to organise satellite hotspot data for the purpose of tracking bushfires remotely. This work is motivated by the catastrophic bushfires in Australia throughout the summer of 2019-2020 and was made possible by the availability of satellite hotspot data. The algorithm is inspired by two existing spatiotemporal clustering algorithms but is enhanced to cluster points spatially in conjunction with their movement across consecutive time periods. It also allows key parameters to be adjusted, if required, for different locations and satellite data sources.
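To give a flavour of what spatiotemporal clustering of hotspots involves, here is a deliberately simplified Python sketch. It is not the spotoroo algorithm: it merely links hotspots that are close in both space and time and treats each connected group as one fire. The function name `cluster_hotspots`, the parameters `max_dist` and `max_gap`, and the use of planar coordinates with integer time indices are all assumptions made for illustration.

```python
from math import hypot


def cluster_hotspots(hotspots, max_dist, max_gap):
    """Group hotspots into clusters (illustrative only, not spotoroo's method).

    hotspots: list of (x, y, t) tuples; x and y are planar coordinates in the
              same unit as max_dist, and t is an integer time index.
    Two hotspots are linked when they are at most max_dist apart and their
    time indices differ by at most max_gap. Clusters are the connected
    components of these links, labelled 0, 1, 2, ... in order of appearance.
    """
    parent = list(range(len(hotspots)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (xi, yi, ti) in enumerate(hotspots):
        for j in range(i + 1, len(hotspots)):
            xj, yj, tj = hotspots[j]
            close_in_time = abs(ti - tj) <= max_gap
            close_in_space = hypot(xi - xj, yi - yj) <= max_dist
            if close_in_time and close_in_space:
                parent[find(i)] = find(j)

    roots = {}
    labels = []
    for i in range(len(hotspots)):
        labels.append(roots.setdefault(find(i), len(roots)))
    return labels


# Hypothetical data: two nearby hotspots in consecutive periods, one far away,
# and one nearby in space but much later in time.
example = [(0.0, 0.0, 0), (1.0, 0.0, 1), (10.0, 10.0, 1), (1.5, 0.0, 5)]
labels = cluster_hotspots(example, max_dist=2.0, max_gap=2)
```

The two thresholds play the role of the adjustable key parameters mentioned in the abstract; real satellite data would also need geographic rather than planar distances.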
HONORABLE MENTION 1: Jeffrey Pullin
Bio: Jeffrey recently graduated with a Master of Science (Mathematics and Statistics) from the University of Melbourne, receiving the Dwight Prize as the best-performing student in statistics. He is currently working as a research assistant in Davis McCarthy's statistical bioinformatics group at the St Vincent's Institute of Medical Research. Before his MSc, he completed a BSc, also at the University of Melbourne, and worked as a graduate at the Australian Institute of Health and Welfare, in the Australian Public Service.
Abstract: A common task in health and medicine is the classification of patient information into one of several categories by a trained expert. This could include assessing the presence and type of a tumor from a medical image or providing a disease diagnosis from a series of medical tests. Often such judgements are hard to make and error-prone: two experts may rate the same scenario differently, or the same expert may give different ratings when rating the same scenario multiple times.
In this talk, Jeffrey will describe the R package rater, which implements statistical models designed to analyse so-called repeated categorical rating data. Specifically, rater implements Bayesian versions of several variants of the Dawid-Skene model (Dawid and Skene 1979). Inference is performed using the probabilistic programming language Stan, marginalising out discrete parameters from the models. He will highlight the various data input, data summarisation, diagnostic, plotting and model comparison features implemented in rater.
In addition, his talk will briefly highlight more recent work, inspired by his experience implementing rater, which seeks to understand the computational impact of marginalising out discrete parameters in Bayesian computations.
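For readers unfamiliar with marginalising out a discrete parameter, the following Python sketch shows the idea for a basic Dawid-Skene model: the unknown true class of an item is summed out of the likelihood rather than sampled. This is an illustration only, not code from rater or Stan. The names `pi`, `theta`, `item_log_marginal` and `class_posterior` are assumptions, and all probabilities are assumed to be strictly positive.

```python
from math import exp, log


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + log(sum(exp(v - m) for v in values))


def item_log_marginal(ratings, pi, theta):
    """Log-probability of one item's ratings with the true class summed out.

    ratings: list of (rater, rating) pairs observed for the item
    pi:      pi[k] is the prior probability that the true class is k
    theta:   theta[j][k][c] is the probability that rater j gives rating c
             when the true class is k
    Returns the marginal log-probability and the per-class joint log terms.
    """
    per_class = []
    for k, prior in enumerate(pi):
        lp = log(prior)
        for rater, rating in ratings:
            lp += log(theta[rater][k][rating])
        per_class.append(lp)
    return log_sum_exp(per_class), per_class


def class_posterior(ratings, pi, theta):
    """Posterior over the true class, recovered from the marginalised terms."""
    total, per_class = item_log_marginal(ratings, pi, theta)
    return [exp(lp - total) for lp in per_class]


# Hypothetical setup: two classes and two raters with identical error rates.
pi = [0.5, 0.5]
rater_matrix = [[0.9, 0.1], [0.2, 0.8]]
theta = [rater_matrix, rater_matrix]
ratings = [(0, 0), (1, 0)]  # both raters chose category 0

log_marginal, _ = item_log_marginal(ratings, pi, theta)
posterior = class_posterior(ratings, pi, theta)
```

In a full model the per-item log marginals are added together to form the log-likelihood over `pi` and `theta`, which is what lets gradient-based samplers work without discrete unknowns.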
HONORABLE MENTION 2: Sayani Gupta
Bio: Sayani is a statistician who recently completed her PhD in the Department of Econometrics and Business Statistics, Monash University. She is currently working as a Research Fellow with her supervisors Prof. Rob Hyndman and Prof. Dianne Cook. Previously, she completed her Masters and Honors in Statistics in India, where she also worked as a consultant and senior analyst at firms such as KPMG and American Express. Her research interests include visualization, computational statistics, time series, forecasting and data analysis. She enjoys using statistics to solve real-world problems and almost always uses R to do so.
Abstract: Several classes of time deconstructions exist, resulting in alternative data segmentations and, in turn, different visualizations that can aid in identifying underlying patterns. Cyclic time granularities such as hour of the day, day of the week, or special holidays can be used to visualise the data and explore periodicities, associations, and anomalies. Analysts are expected to comprehensively explore the many ways to view and analyze such graphics; however, without a systematic approach, the plethora of choices becomes overwhelming. The package provides tools to compute possible cyclic granularities from an ordered (usually temporal) index, as well as a framework to explore the distribution of a univariate variable conditional on one or two cyclic time granularities by defining a “harmony”. A “harmony” denotes a pair of granularities that can be effectively analyzed together, which reduces the search from all possible options. The search for informative granularities can be narrowed further by selecting those graphics for which the differences between the displayed distributions are greatest, and by ranking them in order of importance for capturing maximum variation.
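As a small illustration of cyclic granularities, the Python sketch below derives hour of the day and day of the week from timestamps and summarises a variable conditional on one or two of them. It does not use the package described in the abstract and does not implement the “harmony” search. The function names, the choice of the median as the summary, and the sample records are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def cyclic_granularities(ts):
    """Map a timestamp to two cyclic granularities (Monday is day 0)."""
    return {"hour_of_day": ts.hour, "day_of_week": ts.weekday()}


def summarise_by(records, gran_x, gran_facet=None):
    """Median of the value conditional on one or two cyclic granularities.

    records:    iterable of (timestamp, value) pairs
    gran_x:     name of the granularity to condition on
    gran_facet: optional second granularity; keys become (facet, x) tuples
    """
    groups = defaultdict(list)
    for ts, value in records:
        g = cyclic_granularities(ts)
        key = g[gran_x] if gran_facet is None else (g[gran_facet], g[gran_x])
        groups[key].append(value)
    return {key: median(values) for key, values in sorted(groups.items())}


# Hypothetical readings on a Monday and a Tuesday.
records = [
    (datetime(2022, 5, 16, 8), 1.0),
    (datetime(2022, 5, 16, 8), 3.0),
    (datetime(2022, 5, 17, 8), 5.0),
    (datetime(2022, 5, 17, 20), 2.0),
]
by_hour = summarise_by(records, "hour_of_day")
by_day_and_hour = summarise_by(records, "hour_of_day", "day_of_week")
```

Each choice of one or two granularities yields a different view of the same series, which is why the number of candidate displays grows quickly and a systematic way of choosing among them is useful.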