Perform an offline deployment of a SQL Server big data cluster
Important
The Microsoft SQL Server 2019 Big Data Clusters add-on will be retired. Support for SQL Server 2019 Big Data Clusters will end on February 28, 2025. All existing users of SQL Server 2019 with Software Assurance will be fully supported on the platform and the software will continue to be maintained through SQL Server cumulative updates until that time. For more information, see the announcement blog post and Big data options on the Microsoft SQL Server platform.
This article describes how to perform an offline deployment of a SQL Server 2019 big data cluster. Big data clusters must have access to a Docker repository from which to pull container images. An offline installation is one where the required images are placed into a private Docker repository. That private repository is then used as the image source for a new deployment.
Prerequisites
- Docker Engine on any supported Linux distribution or Docker for Mac/Windows. Validate the engine version against the tested configurations in the SQL Server Big Data Clusters release notes. For more information, see Install Docker.
Warning
The imagePullPolicy parameter must be set to "Always" in the deployment profile control.json file.
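For illustration, the docker section of a customized control.json might look like the following sketch. The registry, repository, and tag values are placeholders for your own environment; the field names correspond to the $.spec.docker.* paths used with azdata bdc config replace later in this article.

```json
{
  "spec": {
    "docker": {
      "registry": "registry.contoso.local",
      "repository": "mssql-bdc",
      "imageTag": "2019-CU12-ubuntu-20.04",
      "imagePullPolicy": "Always"
    }
  }
}
```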
Load images into a private repository
The following steps describe how to pull the big data cluster container images from the Microsoft repository and then push them into your private repository.
Tip
The following steps explain the process. However, to simplify the task, you can use the automated script instead of manually running these commands.
Pull the big data cluster container images by repeating the following command. Replace <SOURCE_IMAGE_NAME> with each image name and <SOURCE_DOCKER_TAG> with the tag for the big data cluster release, such as 2019-CU12-ubuntu-20.04.

docker pull mcr.microsoft.com/mssql/bdc/<SOURCE_IMAGE_NAME>:<SOURCE_DOCKER_TAG>
Log in to the target private Docker registry.
docker login <TARGET_DOCKER_REGISTRY> -u <TARGET_DOCKER_USERNAME> -p <TARGET_DOCKER_PASSWORD>
Tag the local images with the following command for each image:
docker tag mcr.microsoft.com/mssql/bdc/<SOURCE_IMAGE_NAME>:<SOURCE_DOCKER_TAG> <TARGET_DOCKER_REGISTRY>/<TARGET_DOCKER_REPOSITORY>/<SOURCE_IMAGE_NAME>:<TARGET_DOCKER_TAG>
Push the local images to the private Docker repository:
docker push <TARGET_DOCKER_REGISTRY>/<TARGET_DOCKER_REPOSITORY>/<SOURCE_IMAGE_NAME>:<TARGET_DOCKER_TAG>
Warning
Do not modify the big data cluster images once they are pushed into your private repository. Performing a deployment with modified images will result in an unsupported big data cluster setup.
Big data cluster container images
The following big data cluster container images are required for an offline installation:
- mssql-app-service-proxy
- mssql-control-watchdog
- mssql-controller
- mssql-dns
- mssql-hadoop
- mssql-mleap-serving-runtime
- mssql-mlserver-py-runtime
- mssql-mlserver-r-runtime
- mssql-monitor-collectd
- mssql-monitor-elasticsearch
- mssql-monitor-fluentbit
- mssql-monitor-grafana
- mssql-monitor-influxdb
- mssql-monitor-kibana
- mssql-monitor-telegraf
- mssql-security-knox
- mssql-security-support
- mssql-server-controller
- mssql-server-data
- mssql-ha-operator
- mssql-ha-supervisor
- mssql-service-proxy
- mssql-ssis-app-runtime
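The pull, tag, and push steps above can be scripted as a loop over the image list. The following sketch is a dry run: it only echoes each docker command (remove the echo prefixes to execute them), and the registry, repository, and tag values are placeholders you must replace with your own.

```shell
#!/bin/sh
# Placeholder values -- substitute your own registry details.
SOURCE_DOCKER_TAG="2019-CU12-ubuntu-20.04"
TARGET_DOCKER_REGISTRY="registry.contoso.local"
TARGET_DOCKER_REPOSITORY="mssql-bdc"
TARGET_DOCKER_TAG="2019-CU12-ubuntu-20.04"

# The required big data cluster images listed above.
IMAGES="mssql-app-service-proxy mssql-control-watchdog mssql-controller \
mssql-dns mssql-hadoop mssql-mleap-serving-runtime mssql-mlserver-py-runtime \
mssql-mlserver-r-runtime mssql-monitor-collectd mssql-monitor-elasticsearch \
mssql-monitor-fluentbit mssql-monitor-grafana mssql-monitor-influxdb \
mssql-monitor-kibana mssql-monitor-telegraf mssql-security-knox \
mssql-security-support mssql-server-controller mssql-server-data \
mssql-ha-operator mssql-ha-supervisor mssql-service-proxy mssql-ssis-app-runtime"

for IMAGE in $IMAGES; do
    SOURCE="mcr.microsoft.com/mssql/bdc/${IMAGE}:${SOURCE_DOCKER_TAG}"
    TARGET="${TARGET_DOCKER_REGISTRY}/${TARGET_DOCKER_REPOSITORY}/${IMAGE}:${TARGET_DOCKER_TAG}"
    echo docker pull "$SOURCE"
    echo docker tag "$SOURCE" "$TARGET"
    echo docker push "$TARGET"
done
```

Run docker login against the target registry before executing the tag and push commands, as shown in the steps above.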
Automated script
You can use an automated Python script that pulls all of the required container images and pushes them to a private repository.
Note
Python is a prerequisite for using the script. For more information about how to install Python, see the Python documentation.
From bash or PowerShell, download the script with curl:
curl -o push-bdc-images-to-custom-private-repo.py "https://raw.githubusercontent.com/Microsoft/sql-server-samples/master/samples/features/sql-big-data-cluster/deployment/offline/push-bdc-images-to-custom-private-repo.py"
Then run the script with one of the following commands:
Windows:
python push-bdc-images-to-custom-private-repo.py
Linux:
sudo python push-bdc-images-to-custom-private-repo.py
Follow the prompts for entering the Microsoft repository and your private repository information. After the script completes, all required images should be located in your private repository.
Follow the instructions here to learn how to customize the control.json deployment configuration file to use your container registry and repository. Note that you must set the DOCKER_USERNAME and DOCKER_PASSWORD environment variables before deployment to enable access to your private repository.
Install tools offline
Big data cluster deployments require several tools, including Python, Azure Data CLI (azdata), and kubectl. Use the following steps to install these tools on an offline server.
Install Python offline
On a machine with internet access, download one of the following compressed files containing Python:
| Operating system | Download |
| --- | --- |
| Windows | https://go.microsoft.com/fwlink/?linkid=2074021 |
| Linux | https://go.microsoft.com/fwlink/?linkid=2065975 |
| OSX | https://go.microsoft.com/fwlink/?linkid=2065976 |

Copy the compressed file to the target machine and extract it to a folder of your choice.
For Windows only, run installLocalPythonPackages.bat from that folder and pass the full path to the same folder as a parameter.

installLocalPythonPackages.bat "C:\python-3.6.6-win-x64-0.0.1-offline\0.0.1"
Install azdata offline
On a machine with internet access and Python, run the following command to download all of the Azure Data CLI (azdata) packages to the current folder.

pip download -r https://aka.ms/azdata
Copy the downloaded packages and the requirements.txt file to the target machine.

Run the following command on the target machine, specifying the folder that you copied the previous files into:

pip install --no-index --find-links <path-to-packages> -r <path-to-requirements.txt>
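The two-machine workflow can be summarized as follows. This is a dry-run sketch (the commands are only echoed so nothing is downloaded or installed), and the staging folder name is a placeholder.

```shell
#!/bin/sh
# Hypothetical staging folder for the downloaded azdata packages.
PKG_DIR="./azdata-packages"

# On the machine with internet access: download the packages into PKG_DIR.
echo "mkdir -p ${PKG_DIR} && cd ${PKG_DIR}"
echo "pip download -r https://aka.ms/azdata"

# After copying PKG_DIR (including requirements.txt) to the offline machine:
echo "pip install --no-index --find-links ${PKG_DIR} -r ${PKG_DIR}/requirements.txt"
```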
Install kubectl offline
To install kubectl to an offline machine, use the following steps.
Use curl to download kubectl to a folder of your choice. For more information, see Install kubectl binary using curl.
Copy the folder to the target machine.
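As a concrete example, the curl download on the connected machine and the install step on the target machine might look like the following sketch (Linux amd64 shown). The commands are echoed as a dry run, and the pinned version is an assumption; the upstream Kubernetes instructions describe how to look up the latest stable version.

```shell
#!/bin/sh
# Assumed kubectl version to pin for the offline transfer.
KUBECTL_VERSION="v1.19.0"
DOWNLOAD_URL="https://dl.k8s.io/release/${KUBECTL_VERSION}/bin/linux/amd64/kubectl"

# On the machine with internet access:
echo curl -LO "$DOWNLOAD_URL"

# After copying the kubectl binary to the target machine, install it and
# verify the client version:
echo sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
echo kubectl version --client
```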
Deploy from private repository
To deploy from the private repository, use the steps described in the deployment guide, but use a custom deployment configuration file that specifies your private Docker repository information. The following Azure Data CLI (azdata) commands demonstrate how to change the Docker settings in a custom deployment configuration file named control.json:
azdata bdc config replace --config-file custom/control.json --json-values "$.spec.docker.repository=<your-docker-repository>"
azdata bdc config replace --config-file custom/control.json --json-values "$.spec.docker.registry=<your-docker-registry>"
azdata bdc config replace --config-file custom/control.json --json-values "$.spec.docker.imageTag=<your-docker-image-tag>"
The deployment prompts you for the Docker username and password, or you can specify them in the DOCKER_USERNAME and DOCKER_PASSWORD environment variables.
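For example, the credentials can be exported before running the deployment so that no prompt appears. The sketch below uses placeholder credential values, and the azdata bdc create invocation (echoed as a dry run) assumes the customized profile lives in a directory named custom, matching the config commands above.

```shell
#!/bin/sh
# Placeholder credentials for the private registry -- substitute your own.
export DOCKER_USERNAME="registry-user"
export DOCKER_PASSWORD="registry-password"

# Start the deployment with the customized profile directory.
echo azdata bdc create --config-profile custom --accept-eula yes
```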
Next steps
For more information about big data cluster deployments, see How to deploy SQL Server Big Data Clusters on Kubernetes.