Configure PolyBase to access external data in SQL Server
Applies to: SQL Server
This article explains how to use PolyBase on a SQL Server instance to query external data in another SQL Server instance.
Prerequisites
If you haven't installed PolyBase, see PolyBase installation. The installation article explains the prerequisites. Once installed, also be sure to enable PolyBase.
The SQL Server external data source uses SQL Authentication.
Before creating a database scoped credential a Master Key must be created.
Configure a SQL Server external data source
To query the data from a SQL Server data source, you must create external tables to reference the external data. This section provides sample code to create these external tables.
For optimal query performance, create statistics on external table columns, especially for the ones used for joins, filters, and aggregates.
The following Transact-SQL commands are used in this section:
- CREATE DATABASE SCOPED CREDENTIAL (Transact-SQL)
- CREATE EXTERNAL DATA SOURCE (Transact-SQL)
- CREATE EXTERNAL TABLE (Transact-SQL)
- CREATE STATISTICS (Transact-SQL)
Create a database scoped credential for accessing the SQL Server source. The following example creates a credential to the external data source with
IDENTITY = 'username'
andSECRET = 'password'
.CREATE DATABASE SCOPED CREDENTIAL SqlServerCredentials WITH IDENTITY = 'username', SECRET = 'password';
Important
The SQL ODBC Connector for PolyBase only supports basic authentication, not Kerberos authentication.
Create an external data source with CREATE EXTERNAL DATA SOURCE. The following example:
- Creates an external data source named
SQLServerInstance
. - Identifies the external data source (
LOCATION = '<vendor>://<server>[:<port>]'
). In the example it points to a default instance of SQL Server. - Identifies whether computation should be pushed to the source (
PUSHDOWN
).PUSHDOWN
isON
by default.
Finally, the example uses the credential created previously.
CREATE EXTERNAL DATA SOURCE SQLServerInstance WITH ( LOCATION = 'sqlserver://SqlServer', PUSHDOWN = ON, CREDENTIAL = SQLServerCredentials);
- Creates an external data source named
Create the external table with CREATE EXTERNAL TABLE The statement requires collation and the location requires three-part notation (
<database>.<schema>.<table>
).CREATE EXTERNAL TABLE DatabasesExternal ( name VARCHAR(128) COLLATE SQL_Latin1_General_CP1_CI_AS) WITH (LOCATION = 'master.sys.databases', DATA_SOURCE = SQLServerInstance);
Optionally, create statistics on an external table.
For optimal query performance, create statistics on external table columns, especially the ones used for joins filters and aggregates.
CREATE STATISTICS statistics_name ON customer (C_CUSTKEY)
WITH FULLSCAN;
Important
Once you have created an external data source, you can use the CREATE EXTERNAL TABLE command to create a queryable table over that source.
SQL Server connector compatible types
You can also utilize the SQL Server (sqlserver://) connector to access Azure SQL Database. To accomplish this task, follow the same steps listed previously. Make sure the database scoped credential, server address, port, and location string correlate to that of the Azure SQL Database data source you want to connect to.
Next steps
For more tutorials on creating external data sources and external tables to a variety of data sources, see PolyBase Transact-SQL reference.
To learn more about PolyBase, see Overview of SQL Server PolyBase.