Introduction

Classification is a form of supervised machine learning in which you train a model to predict which category an item belongs to. For example, a health clinic might use diagnostic data such as a patient's height, weight, blood pressure, and blood-glucose level to predict whether the patient is diabetic.

[Illustration: using data to identify patient conditions.]

Categorical data has distinct classes rather than numeric values. Some kinds of data can be either numeric or categorical. For example, race completion times could be measured in seconds or minutes, or the times could be grouped into classes called fast, medium, and slow. Other kinds of data can only be categorical. For example, a shape can only be assigned to a class such as circle, triangle, or square.
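As a rough illustration of the race-time example, the following base R sketch turns numeric times into the classes fast, medium, and slow. The times and the 60-second and 90-second cut-offs are made up for this example; they are assumptions, not values from the module.

```r
# Hypothetical race completion times in seconds (illustrative values only)
times_sec <- c(52.4, 75.0, 98.6, 61.3, 120.9)

# Assumed cut-offs: under 60 s is "fast", 60 s to under 90 s is "medium",
# and 90 s or more is "slow". right = FALSE makes each interval [lower, upper).
speed_class <- cut(
  times_sec,
  breaks = c(-Inf, 60, 90, Inf),
  labels = c("fast", "medium", "slow"),
  right = FALSE
)

# speed_class is a factor, R's type for categorical data
race <- data.frame(times_sec, speed_class)
print(race)

# Count how many times fall into each class
print(table(speed_class))
```

The numeric column keeps the exact measurements, while the factor column keeps only the class each time belongs to. A classification model would predict a value like the second column.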

Prerequisites

  • Knowledge of basic mathematics
  • Some experience programming in R

Learning objectives

In this module, you'll learn:

  • When to use classification.
  • How to train and evaluate a classification model by using the tidymodels framework.

Produced in partnership with Eric Wanjau, Microsoft Learn Student Ambassador and Researcher/Data Scientist at the Leeds Institute for Data Analytics, University of Leeds.