Statistical Computing and Visualisation Tutorials
This is for data scientists who want to improve their modeling skills and gain a better understanding of machine learning models.
About the presenter: Max Kuhn is a software engineer at Posit. He has been instrumental in improving the modeling capabilities in R. Max has a Ph.D. in Biostatistics and is the author of numerous R packages for techniques in machine learning. He has a background in the pharmaceutical industry. His book Applied Predictive Modeling, written with Kjell Johnson, won the 2015 Ziegel award from the American Statistical Association for the best book reviewed in Technometrics. He is also a co-author of the book Tidy Modeling with R, available at https://www.tmwr.org/.
Background: Participants should have a good working knowledge of R, and be familiar with basic data wrangling and visualisation as described in R for Data Science by Wickham and Grolemund (2016).
The tutorial will follow selected materials from https://www.tidymodels.org/ and https://www.tmwr.org/. Materials will be made available a few days prior to the tutorial.
This is for research scientists wishing to visualise uncertainty on maps and keen to gain a greater understanding of visualisations available in the Vizumap R package and how to generate them. The objectives for the tutorial are:
The structure follows:
9:00-9:05 Welcome and Introductions
9:05-10:00 Session 1: Introduction to the Vizumap R package and methods for visualising uncertainties on maps (including Q&A)
10:00-10:15 BREAK
10:15-11:00 Session 2: Hands-on tutorial that uses Vizumap to visualise uncertainties on maps using the GBR pollutant load modelling outputs
11:00-11:15 BREAK
11:15-12:00 Session 3: Breakout session for groups to work on one of three problems
12:00-12:25 Presentations from each group and discussion around the problem, choice of map and features and experience with Vizumap.
12:25-12:30 Close and final comments.
About the presenter: Petra Kuhnert is the Associate Science Director – University Engagement at CSIRO’s Data61. She has a PhD in Statistics from QUT. Her recent work focuses on developing methods at the intersection of statistics and machine learning that produce more informative predictions for risk-based decision-making. In 2023, Petra received two Women in AI (WAI) awards for her modelling work related to the Great Barrier Reef: 1st runner-up for the APAC Women in AI Innovator of the Year Award, and the APAC Women in AI Environment and Biodiversity Award 2023. With Lydia Lucchesi, Petra co-designed the award-winning software Vizumap, an innovative visualisation package written in the R programming language. More recently, her focus has turned to developing emulation methods that use machine learning to speed up slow-running physical systems, and to using space technologies (e.g. remote sensing) to enhance predictions of terrestrial environmental and agricultural processes.
Background: Participants should have a good working knowledge of R, and some background in modelling, particularly models of spatial and/or spatio-temporal processes and the data underpinning those.
More details can be found at https://statsocaus.github.io/tutorial_vizumap/. Materials will be provided a few days prior to the tutorial.
This is for scientists and data science practitioners who regularly work with high-dimensional data and models and are interested in learning how to better visualise them. You will learn about recognising structure in high-dimensional data, including clusters, outliers, non-linear relationships, and how this can be used with methods such as supervised classification, cluster analysis and non-linear dimension reduction. The course will be structured as follows:
1:00-1:20 Introduction: What is high-dimensional data, why visualise and overview of methods
1:20-1:45 Basics of linear projections, and recognising high-d structure
1:45-2:30 Effectively reducing your data dimension, in association with non-linear dimension reduction
2:30-3:00 BREAK and PRACTICAL EXERCISES
3:00-3:45 Understanding clusters in data using visualisation
3:45-4:30 Building better classification models with visual input
About the presenter: Dianne Cook is Professor of Business Analytics at Monash University in Melbourne, Australia. She is a world leader in data visualisation, especially the visualisation of high-dimensional data using tours with low-dimensional projections, and projection pursuit. She is currently focusing on bridging the gap between exploratory graphics and statistical inference. Di is a Fellow of the American Statistical Association, past editor of the Journal of Computational and Graphical Statistics, current editor of the R Journal, elected Ordinary Member of the R Foundation, and elected member of the International Statistical Institute.
Background: Participants should have a good working knowledge of R, and some background in multivariate statistical methods and/or data mining techniques.
More details can be found at https://statsocaus.github.io/tutorial_highd_vis/. Materials will be provided a few days prior to the tutorial.
This is for data science practitioners who write Python or R code to build predictive models and are interested in industry best practice for converting experimental code into reliable ML services for their users.
Presenter: Dean Marchiori is Director and Principal Data Scientist at Wave Data Labs where he consults on statistical modelling, applied mathematics and advanced analytics. Dean holds a BSc. in Mathematics with University Medal from Charles Sturt University, a Master of Applied Finance degree, and a Master of Applied Statistics from Macquarie University where he was awarded the Julian Leslie Prize in Statistics. He has been named one of the top 10 analytics leaders in Australia by the Institute of Analytics Professionals of Australia (IAPA). He is also recognised as an Accredited Statistician with the Statistical Society of Australia where he is co-chair of the committee for Statistical Computing and Visualisation.
Details:
Background: Active participation will require users to sign up for an account with Microsoft Azure. This is a paid service, but Microsoft offers new users a US$200 credit to use within 30 days: https://azure.microsoft.com/en-au/free.
A link to the materials will be made available to registered participants before the tutorial.
This is for data science practitioners who are interested in writing and deploying web apps with Shiny.
Presenter: Mitchell O'Hara-Wild is an award-winning teaching associate and research assistant at Monash University with particular expertise in R package development, data analysis and statistical computing. He holds a Bachelor of Commerce (Honours) with a major in econometrics, and a Bachelor of Science majoring in computational science and mathematical statistics. Mitchell is the developer of several widely used R packages and operates a data consultancy that runs workshops and provides data analysis and tools.
Details:
Background: Participants are expected to be familiar with writing R code, but no prior experience in website development or Shiny is required to get the most out of this workshop. A shinyapps.io account (http://shinyapps.io/) is needed to deploy your app in session 3, but is not required if you just want to watch this part.
A link to the materials will be made available to registered participants before the tutorial.
This is for statisticians and data science practitioners who are interested in improving their data visualisation skills.
Presenter: Dianne Cook is Professor of Business Analytics at Monash University in Melbourne, Australia. She is a world leader in data visualisation, especially the visualisation of high-dimensional data using tours with low-dimensional projections, and projection pursuit. She is currently focusing on bridging the gap between exploratory graphics and statistical inference. Di is a Fellow of the American Statistical Association, past editor of the Journal of Computational and Graphical Statistics, current editor of the R Journal, elected Ordinary Member of the R Foundation, and elected member of the International Statistical Institute.
Details:
Background: Participants should have a good working knowledge of R and the tidyverse, and some experience with ggplot2. Familiarity with the material in R for Data Science (R4DS) is helpful.
A link to the materials will be made available to registered participants before the tutorial.