Text classification with Naive Bayes

Beginner
Developer
Student
Data Scientist
Azure

Learn how to use conditional probability and Bayes classification to analyze a real dataset of email messages. You'll apply these machine learning principles to sort the messages into spam (unwanted mail) and ham (legitimate mail).

Learning objectives

In this module, you will:

  • Learn the strengths and limitations of conditional probability and Naive Bayes machine learning.
  • Use pandas to evaluate a dataset of email messages for spam or ham.
  • Use Matplotlib to generate a word cloud, and then use Natural Language Toolkit and scikit-learn for deeper analysis of the data.
  • Use the Naive Bayes classifier to classify the data in a spam dataset and improve the accuracy of a machine learning model.

This is complementary content for Microsoft Reactor Workshops.
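To make the last objective concrete, here is a minimal sketch of the idea behind a Naive Bayes spam filter, written in plain Python rather than with the scikit-learn classifier the module uses. The four training messages, the function names `train` and `classify`, and the choice of add-one (Laplace) smoothing are all assumptions made for illustration; none of them come from the module's dataset or code.

```python
import math
from collections import Counter

# Hypothetical toy messages for illustration; not the module's email dataset.
TRAIN = [
    ("spam", "win a free prize now"),
    ("spam", "free cash offer click now"),
    ("ham", "are we still meeting for lunch"),
    ("ham", "please review the project notes"),
]


def train(examples):
    """Count messages per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = {}
    vocab = set()
    for label, text in examples:
        label_counts[label] += 1
        words = text.lower().split()
        word_counts.setdefault(label, Counter()).update(words)
        vocab.update(words)
    return label_counts, word_counts, vocab


def classify(text, label_counts, word_counts, vocab):
    """Return the label with the highest log posterior score."""
    total_messages = sum(label_counts.values())
    scores = {}
    for label, count in label_counts.items():
        # Start from the log prior, log P(label).
        score = math.log(count / total_messages)
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # skip words never seen during training
            # Add log P(word | label) with add-one (Laplace) smoothing.
            smoothed = (word_counts[label][word] + 1) / (n_words + len(vocab))
            score += math.log(smoothed)
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAIN)
print(classify("free prize offer", *model))
print(classify("lunch meeting notes", *model))
```

The "naive" part is the assumption that words are independent given the label, which is why the per-word log probabilities can simply be summed. Working in log space avoids numeric underflow when many small probabilities are multiplied together.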

Prerequisites

  • Introduction to Python for data science