Install PolyBase on Linux

Applies to: SQL Server 2019 (15.x) - Linux

The following steps install PolyBase (mssql-server-polybase and mssql-server-polybase-hadoop) on Linux. PolyBase enables you to run external queries against remote data sources.

Prerequisites

Before you install PolyBase, first install SQL Server. This step configures the keys and repositories that you use when installing the mssql-server-polybase and mssql-server-polybase-hadoop packages.

Limitations

The hostname of the machine where SQL Server is installed must be 15 characters or fewer.
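
The hostname limit above can be checked before installing. The following is a minimal sketch, not part of the official procedure: the function name hostname_length_ok is hypothetical, and it assumes that the name reported by uname -n is the one the limit applies to.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the given host name is 15 characters or fewer.
hostname_length_ok() {
    local name="$1"
    [ "${#name}" -le 15 ]
}

# Assumption: the node name from `uname -n` is the host name that matters here.
current="$(uname -n)"
if hostname_length_ok "$current"; then
    echo "OK: '$current' is ${#current} characters"
else
    echo "Too long: '$current' is ${#current} characters (limit is 15)"
fi
```

If the check reports a name that is too long, rename the host before you install SQL Server and PolyBase.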

PolyBase isn't supported on SQL Server 2017 (14.x) for Linux.

Scale-out for PolyBase on Linux is currently unavailable.

Hadoop is no longer supported on SQL Server 2022 (16.x).

Install PolyBase

Install PolyBase for your operating system:

  • Red Hat Enterprise Linux (RHEL)
  • Ubuntu
  • SUSE Linux Enterprise Server (SLES)

Install on RHEL

Applies to: SQL Server 2019 (15.x) and later versions

  1. Download the Microsoft Red Hat repository configuration file.

    For RHEL 7:

    sudo curl -o /etc/yum.repos.d/msprod.repo https://packages.microsoft.com/config/rhel/7/prod.repo
    

    For RHEL 8:

    sudo curl -o /etc/yum.repos.d/msprod.repo https://packages.microsoft.com/config/rhel/8/prod.repo
    

    For RHEL 9:

    sudo curl -o /etc/yum.repos.d/msprod.repo https://packages.microsoft.com/config/rhel/9/prod.repo
    
  2. Use the following command to install the mssql-server-polybase package on Red Hat Enterprise Linux.

    sudo yum install -y mssql-server-polybase
    
  3. You're prompted to restart the SQL Server instance. Use the following command to do so.

    sudo systemctl restart mssql-server
    

Note

After installation, you must enable the PolyBase feature.
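
The choice of repository file in step 1 depends only on the RHEL major version, so it can be scripted. The following is a rough sketch under stated assumptions: the function name repo_url_for_rhel is hypothetical, and the script assumes that /etc/os-release provides ID and VERSION_ID. It prints the curl command from step 1 instead of running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the repository configuration URL for a RHEL major version.
repo_url_for_rhel() {
    case "$1" in
        7|8|9) echo "https://packages.microsoft.com/config/rhel/$1/prod.repo" ;;
        *)     echo "Unsupported RHEL major version: $1" >&2; return 1 ;;
    esac
}

# Assumption: /etc/os-release defines ID and VERSION_ID (for example "8.9" gives 8).
if [ -r /etc/os-release ]; then
    os_id="$(. /etc/os-release && echo "$ID")"
    major="$(. /etc/os-release && echo "${VERSION_ID%%.*}")"
    if [ "$os_id" = "rhel" ] && url="$(repo_url_for_rhel "$major")"; then
        # Dry run: show the download command rather than executing it.
        echo "sudo curl -o /etc/yum.repos.d/msprod.repo $url"
    else
        echo "No matching RHEL repository for this system"
    fi
fi
```

Printing the command keeps the sketch free of side effects; review the output, then run the command yourself.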

Install Hadoop on RHEL

Applies to: SQL Server 2019 (15.x)

  1. Use the following command to install the mssql-server-polybase-hadoop package.

    sudo yum install -y mssql-server-polybase-hadoop
    

    The PolyBase Hadoop package has dependencies on the following packages:

    • mssql-server
    • mssql-server-polybase
    • mssql-server-extensibility
    • mssql-zulu-jre-11
  2. You're prompted to restart launchpadd. Use the following command to do so.

    sudo systemctl restart mssql-launchpadd
    

Note

After installation, you must set the Hadoop connectivity level.

If you need an offline installation, locate the PolyBase package download in the Release notes for SQL Server 2019 on Linux. Then use the same offline installation steps described in the article Install SQL Server.

Enable PolyBase

After installation, you must enable PolyBase before you can use its features. Connect to the installed SQL Server instance and run the following Transact-SQL command to enable it.

exec sp_configure @configname = 'polybase enabled', @configvalue = 1;
RECONFIGURE WITH OVERRIDE;
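
The statement above can also be run from the shell. The following is a minimal sketch that assumes the sqlcmd utility is installed and on the PATH. The function name enable_polybase and the server and login arguments are hypothetical. The second query is an added read-only check that is not part of the original procedure. Because -P is omitted, sqlcmd prompts for the password.

```shell
#!/usr/bin/env bash
# T-SQL from the step above, followed by a read-only check of the result.
ENABLE_SQL="EXEC sp_configure @configname = 'polybase enabled', @configvalue = 1; RECONFIGURE WITH OVERRIDE;"
VERIFY_SQL="SELECT SERVERPROPERTY('IsPolyBaseInstalled') AS IsPolyBaseInstalled; EXEC sp_configure @configname = 'polybase enabled';"

# Hypothetical wrapper: enable PolyBase, then display the current setting.
enable_polybase() {
    local server="$1" login="$2"
    sqlcmd -S "$server" -U "$login" -Q "$ENABLE_SQL" &&
    sqlcmd -S "$server" -U "$login" -Q "$VERIFY_SQL"
}

# Only contact a server when one is named on the command line.
if [ "$#" -ge 2 ]; then
    enable_polybase "$1" "$2"
else
    echo "usage: $0 <server> <login>"
fi
```

For example, saving the sketch as enable-polybase.sh (a hypothetical file name) and running it with localhost and a login as arguments would enable the feature on a local instance.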

Update PolyBase

If you already have mssql-server-polybase installed, you can update to the latest version with the following commands:

RHEL with Hadoop

Applies to: SQL Server 2019 (15.x)

sudo yum remove -y mssql-server-polybase-hadoop
sudo yum remove -y mssql-server-polybase
sudo yum check-update
sudo yum install -y mssql-server-polybase
sudo yum install -y mssql-server-polybase-hadoop

RHEL without Hadoop

sudo yum remove -y mssql-server-polybase
sudo yum check-update
sudo yum install -y mssql-server-polybase

You're prompted to restart the SQL Server instance. Use the following command to do so.

sudo systemctl restart mssql-server

Note

After installation, you must enable the PolyBase feature.
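
The two update sequences differ only in the Hadoop package, which is removed first and reinstalled last. The following dry-run sketch prints the commands in order for either case. The function name polybase_update_plan and its yes/no argument are hypothetical, and nothing is executed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the update commands in order, without running them.
# Argument: "yes" to include the Hadoop package (SQL Server 2019 only), otherwise "no".
polybase_update_plan() {
    local with_hadoop="${1:-no}"
    if [ "$with_hadoop" = "yes" ]; then
        echo "sudo yum remove -y mssql-server-polybase-hadoop"
    fi
    echo "sudo yum remove -y mssql-server-polybase"
    echo "sudo yum check-update"
    echo "sudo yum install -y mssql-server-polybase"
    if [ "$with_hadoop" = "yes" ]; then
        echo "sudo yum install -y mssql-server-polybase-hadoop"
    fi
    echo "sudo systemctl restart mssql-server"
}

polybase_update_plan "${1:-no}"
```

If you turn the plan into a script that stops on errors, note that yum check-update exits with status 100 when updates are available, so handle that exit code explicitly.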

PolyBase on Linux can access the following data sources. Follow the provided links for more information on how to create an external table from these sources once PolyBase is enabled.