Python tutorial: Predict ski rental with linear regression with SQL machine learning
Applies to: SQL Server 2017 (14.x) and later Azure SQL Managed Instance
In this four-part tutorial series, you will use Python and linear regression in SQL Server Machine Learning Services or on SQL Server 2019 Big Data Clusters to predict the number of ski rentals. The tutorial uses a Python notebook in Azure Data Studio.
In this four-part tutorial series, you will use Python and linear regression in SQL Server Machine Learning Services to predict the number of ski rentals. The tutorial uses a Python notebook in Azure Data Studio.
In this four-part tutorial series, you will use Python and linear regression in Azure SQL Managed Instance Machine Learning Services to predict the number of ski rentals. The tutorial uses a Python notebook in Azure Data Studio.
Imagine that you own a ski rental business and you want to predict the number of rentals that you'll have on a future date. This information helps you get your stock, staff, and facilities ready.
In the first part of this series, you'll get set up with the prerequisites. In parts two and three, you'll develop some Python scripts in a notebook to prepare your data and train a machine learning model. Then, in part three, you'll run those Python scripts inside the database using T-SQL stored procedures.
In this article, you'll learn how to:
- Import a sample database
In part two, you'll learn how to load the data from a database into a Python data frame, and prepare the data in Python.
In part three, you'll learn how to train a linear regression model in Python.
In part four, you'll learn how to store the model in a database, and then create stored procedures from the Python scripts you developed in parts two and three. The stored procedures will run on the server to make predictions based on new data.
Prerequisites
- SQL Server Machine Learning Services - To install Machine Learning Services, see the Windows installation guide or the Linux installation guide. You can also enable Machine Learning Services on SQL Server 2019 Big Data Clusters.
- SQL Server Machine Learning Services - To install Machine Learning Services, see the Windows installation guide.
Azure SQL Managed Instance Machine Learning Services - For information, see the Azure SQL Managed Instance Machine Learning Services overview.
SQL Server Management Studio (SSMS) - Use SSMS to restore the sample database to Azure SQL Managed Instance. To download, see SQL Server Management Studio.
Python IDE - This tutorial uses a Python notebook in Azure Data Studio. For more information, see How to use notebooks in Azure Data Studio.
SQL query tool - This tutorial assumes you're using Azure Data Studio.
Additional Python packages - The examples in this tutorial series use the following Python packages that might not be installed by default:
- pandas
- pyodbc
- scikit-learn
To install these packages:
- In your Azure Data Studio notebook, select Manage Packages.
- In the Manage Packages pane, select the Add new tab.
- For each of the following packages, enter the package name, select Search, then select Install.
As an alternative, you can open a Command Prompt, change to the installation path for the version of Python you use in Azure Data Studio (for example,
cd %LocalAppData%\Programs\Python\Python37-32
), then runpip install
for each package.
Restore the sample database
The sample database used in this tutorial has been saved to a .bak
database backup file for you to download and use.
Note
If you are using Machine Learning Services on SQL Server 2019 Big Data Clusters, see how to Restore a database into the big data cluster master instance.
Download the file TutorialDB.bak.
Follow the directions in Restore a database from a backup file in Azure Data Studio, using these details:
- Import from the
TutorialDB.bak
file you downloaded. - Name the target database
TutorialDB
.
- Import from the
You can verify that the restored database exists by querying the
dbo.rental_data
table:USE TutorialDB; SELECT * FROM [dbo].[rental_data];
Download the file TutorialDB.bak.
Follow the directions in Restore a database to Azure SQL Managed Instance in SQL Server Management Studio, using these details:
- Import from the
TutorialDB.bak
file you downloaded. - Name the target database
TutorialDB
.
- Import from the
You can verify that the restored database exists by querying the
dbo.rental_data
table:USE TutorialDB; SELECT * FROM [dbo].[rental_data];
Clean up resources
If you're not going to continue with this tutorial, delete the TutorialDB
database.
Next step
In part one of this tutorial series, you completed these steps:
- Installed the prerequisites
- Import a sample database
To prepare the data from the TutorialDB database, follow part two of this tutorial series: