Run Python and R scripts with Machine Learning Services on SQL Server 2019 Big Data Clusters
Applies to:
SQL Server 2019 (15.x)
Wichtig
The Microsoft SQL Server 2019 Big Data Clusters add-on will be retired. Support for SQL Server 2019 Big Data Clusters will end on February 28, 2025. All existing users of SQL Server 2019 with Software Assurance will be fully supported on the platform and the software will continue to be maintained through SQL Server cumulative updates until that time. For more information, see the announcement blog post and Big data options on the Microsoft SQL Server platform.
You can run Python and R scripts on the master instance of SQL Server Big Data Clusters with Machine Learning Services.
Hinweis
You can also run Java code on the master instance of SQL Server Big Data Clusters with the Java Language Extension. Following the steps below will also enable SQL Server Language Extensions.
Machine Learning Services is installed by default on SQL Server 2019 Big Data Clusters and does not require separate installation.
To enable Machine Learning Services, run this statement on the master instance:
EXEC sp_configure 'external scripts enabled', 1
RECONFIGURE WITH OVERRIDE
GO
You are now ready to run Python and R scripts on the master instance of Big Data Clusters. See the quickstarts under Next steps to run your first script.
Hinweis
The configuration setting cannot be set on an availability group listener connection. If Big Data Clusters is deployed with high availability, the set external scripts enabled
on each replica. See Enable on cluster with high availability.
When you Deploy SQL Server Big Data Cluster with high availability, the deployment creates an availability group for the master instance. To enable Machine Learning Services, set external scripts enabled
on each instance of the availability group. For a Big Data Cluster, you need to run sp_configure
on each replica of the SQL Server master instance
The following section describes how to enable external scripts on each instance.
For each replica on the availability group, create a load balancer to allow you to connect to the instance.
kubectl expose pod <pod-name> --port=<connection port number> --name=<load-balancer-name> --type=LoadBalancer -n <kubernetes namespace>
The examples in this article use the following values:
<pod-name>
:master-#
<connection port number>
:1533
<load-balancer-name>
:mymaster-#
<kubernetes namespace>
:mssql-cluster
Update the following script for your environment, and run the commands:
kubectl expose pod master-0 --port=1533 --name=mymaster-0 --type=LoadBalancer -n mssql-cluster
kubectl expose pod master-1 --port=1533 --name=mymaster-1 --type=LoadBalancer -n mssql-cluster
kubectl expose pod master-2 --port=1533 --name=mymaster-2 --type=LoadBalancer -n mssql-cluster
kubectl
returns the following output.
service/mymaster-0 exposed
service/mymaster-1 exposed
service/mymaster-2 exposed
Each load balancer is a master replica endpoint.
Get the IP address for the master replica endpoint.
The following command returns the external IP address for the replica endpoint.
kubectl get services <load-balancer-name> -n <kubernetes namespace>
To get the external IP address for each replica in this scenario, run the following commands:
Bashkubectl get services mymaster-0 -n mssql-cluster kubectl get services mymaster-1 -n mssql-cluster kubectl get services mymaster-2 -n mssql-cluster
Hinweis
It may take a little time before the external IP address is available. Run the preceding script periodically until each endpoint returns an external IP address.
Connect to the master replica endpoint and enable script execution.
Run this statement:
SQLEXEC sp_configure 'external scripts enabled', 1 RECONFIGURE WITH OVERRIDE GO
For example, you can run the preceding command with
sqlcmd
. The following example connects to the master replica endpoint and enables script execution. Update the values in the script with for your environment.Bashsqlcmd -S <IP address>,1533 -U <user name> -P <password> -Q "EXEC sp_configure 'external scripts enabled', 1; RECONFIGURE WITH OVERRIDE;"
Repeat the step for each replica.
The following image demonstrates this process.
You are now ready to run Python and R scripts on the master instance of Big Data Clusters. See the quickstarts under Next steps to run your first script.
On the Kubernetes cluster, delete the endpoint for each replica. The endpoint is exposed in Kubernetes as a load-balancing service.
The following command deletes load-balancing service.
kubectl delete svc <load-balancer-name> -n mssql-cluster
For the examples in this article, run the following commands.
kubectl delete svc mymaster-0 -n mssql-cluster
kubectl delete svc mymaster-1 -n mssql-cluster
kubectl delete svc mymaster-2 -n mssql-cluster