CPD 143 - Machine learning with Python

Start
13 Nov 2021
End
14 Nov 2021
Schedule
2 sessions
#1.
13 Nov 2021, 9:00 AM to 12:00 PM (AEDT)
#2.
14 Nov 2021, 9:00 AM to 12:00 PM (AEDT)
Location
Zoom link available upon registration
Spaces left
0

Registration

Both (Half) Days Member – $150.00
Discounted registration for both (half) days for an SSA member.
Both (Half) Days Non-Member – $300.00
Discounted registration for both (half) days for a non-member.
Both (Half) Days Student Member – $75.00
Discounted registration for both (half) days for an SSA student member.
One (Half) Day Member – $100.00
Discounted registration for one (half) day for an SSA member.
One (Half) Day Non-Member – $200.00
Discounted registration for one (half) day for a non-member.
One (Half) Day Student Member – $50.00
Discounted registration for one (half) day for an SSA student member.

Registration is closed

The Statistical Society of Australia warmly invites you to a workshop on machine learning with Python, presented by Patrick Robotham from Linktree. The workshop consists of two sessions, on Saturday 13 and Sunday 14 November.

Patrick is a Staff Machine Learning Engineer at Linktree. He builds production-ready machine learning and statistical models and has 7 years of industry experience.

WORKSHOP ABSTRACT

This two-day workshop aims to enable data scientists to incrementally incorporate Python into their workflow. After an introduction to Python basics, the workshop focuses on developing Python models in the kind of workflow framework most commonly seen in production environments. Participants will benefit from a gentle introduction to Python on the first day before learning some powerful modelling concepts and tools on the second day.

WORKSHOP CONTENT

Day 1 Getting Started with Python and Pandas

This is a hands-on course for learning the basics of Python and data manipulation with the Pandas library.

We will begin this course with a gentle introduction to the basics of Python, such as variable assignment and data type conversion. We will then dive into pandas, which is the most popular package for manipulating tabular data in Python. We will end this session by making some basic plots of our data. Throughout the workshop you will work through a sequence of Jupyter notebooks and gain experience in working with data in Python.
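As a small illustration of the Python basics mentioned above, the following sketch shows variable assignment and conversion between built-in types. The variable names and values are invented for illustration and are not taken from the workshop materials.

```python
# Variable assignment: a name is bound to a value with "=".
participants_text = "25"   # a string, as it might arrive from a form or a file
fee_per_person = 150.0     # a float

# Type conversion: turn the string into an integer before doing arithmetic.
participants = int(participants_text)
total_fees = participants * fee_per_person

# Convert the numbers back to strings to build a message for display.
summary = str(participants) + " participants, total $" + str(total_fees)

print(type(participants_text).__name__, "->", type(participants).__name__)
print(summary)
```

Multiplying the original string by a number would not give a numeric total, which is why the explicit `int(...)` conversion comes first.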

At the end of this module you will be able to:

Understand the basic data types in Python and how to convert between them.
Use the Python library pandas to import and manipulate data.
Use matplotlib to make basic visualisations on data.
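The outcomes listed above can be sketched roughly as follows: read tabular data with pandas, filter and summarise it, then draw a basic plot with matplotlib. The dataset, column names and numbers are made up for illustration (they are not workshop data), and the CSV is read from an in-memory string only so that the example is self-contained.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical CSV content; in practice pd.read_csv is usually given a file path.
csv_text = """city,year,rainfall_mm
Melbourne,2019,400.0
Melbourne,2020,600.0
Sydney,2019,900.0
Sydney,2020,1100.0
"""
df = pd.read_csv(io.StringIO(csv_text))

# Basic manipulation: filter rows, then summarise by group.
recent = df[df["year"] == 2020]
mean_by_city = df.groupby("city")["rainfall_mm"].mean()

# Basic visualisation: a bar chart of the group means, saved as a PNG in memory.
fig, ax = plt.subplots()
mean_by_city.plot(kind="bar", ax=ax)
ax.set_ylabel("Mean rainfall (mm)")
png_buffer = io.BytesIO()
fig.savefig(png_buffer, format="png")
plt.close(fig)

print(mean_by_city)
```

In a Jupyter notebook the figure would normally be displayed inline; saving to a buffer here simply stands in for saving to a file such as a PNG for publication.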

Day 2 Introduction to Machine Learning

This workshop will teach you how to use the scikit-learn library to construct regression and classification models, tune model parameters and evaluate model performance.

The scikit-learn library supports most of the standard classification, regression and clustering models that we use every day as statisticians and data scientists. In addition, scikit-learn offers a “workflow” framework that can wrap most data manipulation, scaling, imputation, tuning and evaluation steps together, which provides a consistent standard for machine learning model deployment.
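One way to picture the “workflow” framework described above is scikit-learn's `Pipeline`, which chains preprocessing steps and a model into a single estimator. The sketch below is a minimal example on synthetic data; the step names, the choice of median imputation and the logistic regression model are assumptions made for illustration, not the workshop's own material.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data: two features and a binary label derived from them.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X[::10, 0] = np.nan  # every tenth row loses its first feature

# Imputation, scaling and the classifier are wrapped into one object.
model = Pipeline(steps=[
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    ("classify", LogisticRegression()),
])

# Fitting the pipeline fits every step in order; predicting reuses the fitted steps.
model.fit(X, y)
predictions = model.predict(X)
print("Training accuracy:", model.score(X, y))
```

Because the preprocessing lives inside the fitted object, the same transformations are applied automatically whenever the model is asked to predict on new data.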

The workshop will cover:

Use the Python libraries pandas and numpy to import and manipulate data.
Use scikit-learn to construct linear and tree-based models.
Know the difference between classification and regression.
Evaluate a predictive model with appropriate metrics and plots.
Improve a machine learning model using hyperparameter tuning.
Perform necessary scaling and imputation on the data.
Standardise model deployment using pipelines.
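As a rough sketch of the evaluation and tuning topics listed above, the example below scores a linear model with cross-validation and tunes the depth of a tree-based model with a grid search. The synthetic dataset and the candidate `max_depth` values are arbitrary choices made for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression problem with a fixed seed for reproducibility.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Evaluate a linear model with 5-fold cross-validation.
# For regressors the default score is the coefficient of determination (R^2).
linear_scores = cross_val_score(LinearRegression(), X, y, cv=5)

# Hyperparameter tuning: try several tree depths and keep the best one,
# where "best" means the highest mean cross-validated score.
search = GridSearchCV(
    DecisionTreeRegressor(random_state=0),
    param_grid={"max_depth": [2, 4, 8]},
    cv=5,
)
search.fit(X, y)

print("Linear model mean R^2:", linear_scores.mean())
print("Best tree depth:", search.best_params_["max_depth"])
print("Best tree mean R^2:", search.best_score_)
```

A pipeline can be passed to `GridSearchCV` in place of the bare estimator, so that preprocessing is refitted within each cross-validation fold.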

Timetable

Day 1

Time	Task	Outcome
09:00	1. Running and Quitting	How can I run Python programs?
09:15	2. Variables and Assignment	How can I store data in programs?
09:35	3. Data Types and Type Conversion	What kinds of data do programs store? How can I convert one type to another?
09:55	4. Built-in Functions and Help	How can I use built-in functions? How can I find out what they do? What kind of errors can occur in programs?
10:20	5. Morning Coffee	Break
10:35	6. Libraries	How can I use software that other people have written? How can I find out what that software does?
10:55	7. Reading Tabular Data into DataFrames	How can I read tabular data?
11:15	8. Pandas DataFrames	How can I do statistical analysis of tabular data?
11:45	9. Plotting	How can I plot my data? How can I save my plot for publishing?

Day 2

Time	Task	Outcome
09:00	1. Quick revision and set up	A quick recap of Day 1
09:10	2. Regression Models	What is a regression model and how can we fit one using scikit-learn?
09:35	3. Classification Models	What is a classification model and how can we fit one using scikit-learn?
09:55	4. Dummy encoding, scaling and imputation	What kind of manipulations should we apply to our data before we can fit a model?
10:20	5. Morning Coffee	Break
10:35	6. Cross Validation	How is cross validation used to evaluate model performance?
10:55	7. Hyperparameter Tuning	How can we make our model more accurate and flexible?
11:15	8. Pipelines	How can we wrap all preprocessing steps and model tuning and evaluations under a consistent framework?
11:45	9. Revision	Q&A and reserved time for participants

Expenses:

Occasionally workshops have to be cancelled due to a lack of subscription. Early registration ensures that this will not happen. Please note that the Society will not be held responsible for any financial loss incurred due to a workshop cancellation.

Financial Support:

Financial support for SSA Vic members can be sought. For further information, please see https://statsoc.org.au/News-and-media-releases/10424132.

Contact:

Please contact the organisers: Patrick Robotham (patrick.robotham2@gmail.com) and Kevin Wang (kevinwangstats@gmail.com) for further details.

Statistical Society of Australia (SSA)

PO Box 213

Belconnen ACT 2616 Australia

02 6251 3647

www.statsoc.org.au

ABN 82 853 491 081

Please direct enquiries to the SSA Team via email at contact@statsoc.org.au

@StatSocAus
