Python is one of the world's most popular programming languages. It's used extensively in the data science community for machine learning and statistical analysis. One of the reasons it's so popular is the availability of thousands of open-source libraries such as NumPy, Pandas, Matplotlib, and scikit-learn, which enable programmers and researchers alike to explore, transform, analyze, and visualize data.

Azure Notebooks is a cloud-based platform for building and running Jupyter notebooks. Jupyter is an environment based on IPython that facilitates interactive programming and data analysis using Python and other programming languages. Azure Notebooks provides Jupyter as a free service, so you can write Python code without having to install and manage a Jupyter server. Because it's web-based, it's also well suited to collaborating online.

In this module, you'll create a notebook in Azure Notebooks, import a dataset containing on-time arrival information for a major U.S. airline, and load the dataset into the notebook. Then you'll clean the dataset with Pandas, build a machine-learning model with scikit-learn, and use Matplotlib to visualize the model's output.

Learning Objectives

In this module, you will:

  • Create a Jupyter notebook in Azure Notebooks, import data, and view data loaded into the notebook.
  • Use Pandas to clean and prepare the data for the machine-learning model.
  • Use scikit-learn to create the machine-learning model.
  • Use Matplotlib to visualize the model's performance.
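
The objectives above can be sketched end to end in a few lines of Python. This is a minimal illustration under stated assumptions, not the module's actual code: the data is synthetic, the column names (DEP_HOUR, DISTANCE, LATE) are hypothetical stand-ins for the airline dataset, and the choice of RandomForestClassifier and a ROC curve is only one plausible way to build and evaluate such a model.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the airline dataset (hypothetical columns).
rng = np.random.default_rng(0)
n = 500
dep_hour = rng.integers(0, 24, n)
distance = rng.integers(100, 2500, n)
# Assumption for illustration: later departures are more likely to arrive late.
late = (rng.random(n) < 0.1 + 0.03 * dep_hour).astype(float)
df = pd.DataFrame({"DEP_HOUR": dep_hour, "DISTANCE": distance, "LATE": late})
df.loc[df.index % 50 == 0, "LATE"] = np.nan  # simulate missing labels

# Clean with Pandas: drop rows with a missing label, then store it as an integer.
clean = df.dropna(subset=["LATE"]).astype({"LATE": int})

# Build the model with scikit-learn on a train/test split.
X = clean[["DEP_HOUR", "DISTANCE"]]
y = clean["LATE"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Score the held-out rows: predicted probability of a late arrival.
probs = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, probs)
fpr, tpr, _ = roc_curve(y_test, probs)

# Visualize performance with Matplotlib as a ROC curve.
fig, ax = plt.subplots()
ax.plot(fpr, tpr, label=f"model (AUC = {auc:.2f})")
ax.plot([0, 1], [0, 1], linestyle="--", label="chance")
ax.set_xlabel("False positive rate")
ax.set_ylabel("True positive rate")
ax.legend()
```

In a Jupyter notebook the figure is typically rendered inline below the cell. With a real dataset, the DataFrame would come from a file loaded with pandas.read_csv rather than from generated values.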