Foundations of data science for machine learning

AI Engineer
Data Scientist

Microsoft Learn provides several interactive ways to get an introduction to classic machine learning. These learning paths will get you productive on their own, and are also an excellent base for moving on to deep learning topics.

From the most basic classical machine learning models to exploratory data analysis and customizing architectures, you’ll be guided by easy-to-digest conceptual content and interactive Jupyter notebooks, all without leaving your browser.

Choose your own path depending on your educational background and interests.

✔ Option 1: The complete course: Foundations of data science for machine learning

This is the recommended option for most people. It has all the same modules as the other two learning paths, with a custom flow that maximizes reinforcement of concepts. If you want to learn both the underlying concepts and how to build models with the most common machine learning tools, this is the path for you. It's also the best path if you plan to move beyond classic machine learning into deep learning and neural networks, which we only introduce here.

✔ You are currently on this path. Scroll down to begin.

Option 2: The Understand data science for machine learning learning path

If you are looking to understand how machine learning works and don't have much mathematical background, then this path is for you. It makes no assumptions about previous education (other than a light familiarity with coding concepts) and teaches with code, metaphors, and visuals that give you the "aha" moment. It's hands-on, but focuses more on understanding fundamentals and less on the power of the tools and libraries available.

Option 3: The Create machine learning models learning path

If you already have some idea what machine learning is about, or you have a strong mathematical background, you may best enjoy jumping right into the Create machine learning models learning path. These modules teach some machine learning concepts, but move fast so they can get to the power of using tools like scikit-learn, TensorFlow, and PyTorch. This learning path is also the best one for you if you're looking for just enough familiarity to understand machine learning examples for products like Azure ML or Azure Databricks.



Modules in this learning path

A high-level overview of machine learning for people with little or no knowledge of computer science and statistics. You’re introduced to some essential concepts, explore data, and interactively go through the machine learning lifecycle, using Python to train, save, and use a machine learning model, just like in the real world.
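The train, save, and use steps mentioned above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the notebooks used in the module. The toy data, the `train` and `predict` helpers, and the `toy_model.json` file name are all assumptions made for the example.

```python
import json
import os
import tempfile


def train(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def predict(model, x):
    """Use a trained model to estimate y for a new x."""
    return model["slope"] * x + model["intercept"]


# Hypothetical toy data in which y is exactly 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# 1. Train: learn the parameters from examples.
model = train(xs, ys)

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "toy_model.json")

    # 2. Save: write the learned parameters to disk.
    with open(path, "w") as f:
        json.dump(model, f)

    # 3. Use: load the saved parameters and predict for unseen input.
    with open(path) as f:
        loaded = json.load(f)

prediction = predict(loaded, 4.0)
print(prediction)
```

In practice a library such as scikit-learn would replace the hand-written `train` function, but the three stages stay the same.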

Supervised learning is a form of machine learning where an algorithm learns from examples of data. We progressively paint a picture of how supervised learning automatically generates a model that can make predictions about the real world. We also touch on how these models are tested, and difficulties that can arise in training them.

The power of machine learning models comes from the data that is used to train them. Through content and exercises, we explore how to understand your data, how to encode it so that the computer can interpret it properly, and how to clean any errors, and we share tips that will help you create high-performance models.
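As a rough illustration of what encoding and cleaning can look like, the sketch below one-hot encodes a categorical column and fills a missing numeric value with the column mean. Both techniques are common choices picked here as assumptions; the module may cover others. The helper names and sample values are hypothetical.

```python
def one_hot(values):
    """Turn category labels into 0/1 columns, one column per category."""
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


def fill_missing(values):
    """Replace None entries with the mean of the observed values."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]


# Hypothetical raw columns: a text category and a number with a gap.
colours = ["red", "green", "red", "blue"]
weights = [2.0, None, 4.0]

categories, encoded = one_hot(colours)
cleaned = fill_missing(weights)

print(categories)  # column order used for the encoding
print(encoded)
print(cleaned)
```

One-hot encoding avoids implying an order between categories, which a simple integer code (red=0, green=1, blue=2) would do.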

Data exploration and analysis is at the core of data science. Data scientists require skills in programming languages like Python to explore, visualize, and manipulate data.

Regression is arguably the most widely used machine learning technique, commonly underlying scientific discoveries, business planning, and stock market analytics. This learning material takes a dive into some common regression analyses, both simple and more complex, and provides some insight on how to assess model performance.

When we think of machine learning, we often focus on the training process. A small amount of preparation before this process can not only speed up and improve learning, but also give us some confidence about how well our models will work when faced with data we have never seen before.
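One common form of this preparation is holding back part of the data so the model can later be checked on examples it has never seen. The sketch below assumes that this is the kind of step meant here. The `train_test_split` helper, the 25% test fraction, and the fixed seed are illustrative choices, not taken from the module.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold back a fraction for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical dataset of 20 rows, represented by their indices.
rows = list(range(20))
train_rows, test_rows = train_test_split(rows)

print(len(train_rows), len(test_rows))
```

The model is fitted only on `train_rows`; its score on `test_rows` then gives an estimate of how it will behave on new data.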

Regression is a commonly used kind of machine learning for predicting numeric values.

Classification means assigning items to categories, and can also be thought of as automated decision making. Here we introduce classification models through logistic regression, providing you with a stepping-stone toward more complex and exciting classification methods.

More complex models can often be manually customized to improve how effective they are. Through exercises and explanatory content, we explore how altering the architecture of these models can bring about more effective results.

How do we know if a model is good or bad at classifying our data? The way that computers assess model performance can sometimes be difficult for us to comprehend, or can oversimplify how the model will behave in the real world. To build models that work in a satisfactory way, we need to find intuitive ways to assess them, and understand how these metrics can bias our view.

Receiver operating characteristic (ROC) curves are a powerful way to assess and fine-tune trained classification models. We introduce and explain the utility of these curves through learning content and practical exercises.
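To make the idea concrete, the sketch below builds such a curve by hand: for each decision threshold it records the false positive rate and the true positive rate, then estimates the area under the curve with the trapezoidal rule. The labels, scores, and helper names are made up for illustration, and the code assumes both classes are present.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Approximate the area under the curve with the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical true labels (1 = positive) and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
area = area_under_curve(points)

print(points)
print(area)
```

An area close to 1 means the model ranks most positives above most negatives; 0.5 is what random scores would give. Picking a point on the curve corresponds to picking a decision threshold, which is how the curve helps with fine-tuning.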

Classification is a kind of machine learning used to categorize items into classes.

Clustering is a type of machine learning that is used to group similar items into clusters.

Deep learning is an advanced form of machine learning that emulates the way the human brain learns through networks of connected neurons.