1 - Introduction

Machine learning with tidymodels

Welcome!

Wi-Fi network name

tktk

Wi-Fi password

tktk

Who are you?

  • You can use the magrittr %>% or base R |> pipe

  • You are familiar with functions from dplyr, tidyr, ggplot2

  • You have exposure to basic statistical concepts

  • You do not need intermediate or expert familiarity with modeling or ML
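As a quick self-check on the first point, here is a minimal sketch showing that the two pipes give the same result for a simple chain. It assumes R 4.1.0 or later (for `|>`) and that magrittr is installed; the built-in `mtcars` data is used purely for illustration and is not part of the workshop materials.

```r
# Base R pipe (R >= 4.1.0): the left-hand side becomes the first argument
n_base <- mtcars |> subset(cyl == 4) |> nrow()

# magrittr pipe: also available after library(dplyr) or library(tidymodels)
library(magrittr)
n_magrittr <- mtcars %>% subset(cyl == 4) %>% nrow()

# Both chains count the same rows
identical(n_base, n_magrittr)
```

Either pipe should be fine for following along with the exercises.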

Who am I?

Many thanks to the RStudio tidymodels team, Alison Hill, and Allison Horst for their role in creating these materials!

Asking for help

🟪 “I’m stuck and need help!”

🟩 “I finished the exercise”

Plan for this workshop

  • Your data budget
  • What makes a model
  • Evaluating models
  • Feature engineering
  • Tuning hyperparameters
  • Wrapping up!

Introduce yourself to your neighbors 👋

What is machine learning?

Your turn

How are statistics and machine learning related?

How are they similar? Different?

What is tidymodels?

library(tidymodels)
#> ── Attaching packages ──────────────────────────── tidymodels 1.0.0 ──
#> ✔ broom        1.0.0     ✔ rsample      1.0.0
#> ✔ dials        1.0.0     ✔ tibble       3.1.8
#> ✔ dplyr        1.0.9     ✔ tidyr        1.2.0
#> ✔ infer        1.0.2     ✔ tune         1.0.0
#> ✔ modeldata    1.0.0     ✔ workflows    1.0.0
#> ✔ parsnip      1.0.0     ✔ workflowsets 1.0.0
#> ✔ purrr        0.3.4     ✔ yardstick    1.0.0
#> ✔ recipes      1.0.1
#> ── Conflicts ─────────────────────────────── tidymodels_conflicts() ──
#> ✖ purrr::discard() masks scales::discard()
#> ✖ dplyr::filter()  masks stats::filter()
#> ✖ dplyr::lag()     masks stats::lag()
#> ✖ recipes::step()  masks stats::step()
#> • Use tidymodels_prefer() to resolve common conflicts.
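The last line of the startup message points at `tidymodels_prefer()`. A minimal sketch of what acting on it might look like, assuming tidymodels is already installed; the `mtcars` filter is only an illustrative call and not part of the workshop code.

```r
library(tidymodels)

# Tell R to prefer the tidymodels/tidyverse versions when function names clash
tidymodels_prefer(quiet = TRUE)

# Namespacing a call explicitly is another way to sidestep a conflict
mtcars_4cyl <- dplyr::filter(mtcars, cyl == 4)
nrow(mtcars_4cyl)
```

Calling `tidymodels_prefer()` once near the top of a script is usually enough; the explicit `pkg::fun()` form is handy when only one or two calls are ambiguous.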

Let’s install some packages

install.packages(c("doParallel", "ranger", "rpart", 
                   "rpart.plot", "tidymodels", "tidyverse",
                   "vetiver", "xgboost"))
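
To confirm the installation worked, something like the following sketch could be run afterwards. It only uses base R; the names `pkgs`, `installed`, and `missing_pkgs` are made up for this example.

```r
pkgs <- c("doParallel", "ranger", "rpart", "rpart.plot",
          "tidymodels", "tidyverse", "vetiver", "xgboost")

# TRUE for each package that can be loaded from the current library paths
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Names of anything still missing; character(0) means everything is in place
missing_pkgs <- pkgs[!installed]
missing_pkgs
```

If `missing_pkgs` is not empty, rerunning `install.packages(missing_pkgs)` is a reasonable next step.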