CREATE REMOTE TABLE AS SELECT (Parallel Data Warehouse)
Applies to:
Analytics Platform System (PDW)
Selects data from a Analytics Platform System (PDW) database and copies that data to a new table in an SMP SQL Server database on a remote server. Analytics Platform System (PDW) uses the appliance, with all the benefits of MPP query processing, to select the data for the remote copy. Use CREATE REMOTE TABLE AS SELECT
for scenarios that require SQL Server functionality.
To configure the remote server, see Remote Table Copy.
Transact-SQL syntax conventions
Syntax
CREATE REMOTE TABLE { database_name.schema_name.table_name | schema_name.table_name | table_name } AT ('<connection_string>')
[ WITH ( BATCH_SIZE = batch_size ) ]
AS <select_statement>
[;]
<connection_string> ::=
Data Source = { IP_address | hostname } [, port ]; User ID = user_name ;Password = strong_password;
<select_statement> ::=
[ WITH <common_table_expression> [ ,...n ] ]
SELECT <select_criteria>
Arguments
database_name
The database to create the remote table in. database_name is a SQL Server database. Default is the default database for the user login on the destination SQL Server instance.
schema_name
The schema for the new table. Default is the default schema for the user login on the destination SQL Server instance.
table_name
The name of the new table. For details on permitted table names, see "Object Naming Rules" in the Analytics Platform System (PDW) product documentation.
The remote table is created as a heap. It does not have check constraints or triggers. The collation of the remote table columns is the same as the collation of the source table columns. This applies to columns of type char, nchar, varchar, and nvarchar.
connection_string
A character string that specifies the Data Source
, User ID
, and Password
parameters for connecting to the remote server and database.
The connection string is a semicolon-delimited list of key and value pairs. Keywords are not case-sensitive. Spaces between key and value pairs are ignored. However, values might be case-sensitive, depending on the data source.
Data Source
The parameter that specifies the name or IP address, and TCP port number for the remote SMP SQL Server.
hostname or IP_address
Name of the remote server computer or the IPv4 address of the remote server. IPv6 addresses are not supported. You can specify a SQL Server named instance in the format Computer_Name\Instance_Name or IP_address\Instance_Name. The server must be remote and therefore cannot be specified as (local).
TCP port number
The TCP port number for the connection. You can specify a TCP port number from 0 to 65535 for an instance of SQL Server that is not listening on the default port 1433. For example: ServerA,1450
or <ip address>,1435
.
Note
We recommend connecting to a remote server by using the IP address. Depending on your network configuration, connecting by using the computer name might require additional steps to use your non-appliance DNS server to resolve the name to the correct server. This step is not necessary when connecting with an IP address. For more information, see Use a DNS Forwarder to Resolve Non-Appliance DNS Names in Analytics Platform System.
user_name
A valid SQL Server authentication login name. Maximum number of characters is 128.
password
The login password. Maximum number of characters is 128.
batch_size
The maximum number of rows per batch. Analytics Platform System (PDW) sends rows in batches to the destination server. Batch_size is a positive integer >= 0. Default is 0.
WITH common_table_expression
Specifies a temporary named result set, known as a common table expression (CTE). For more information, see WITH common_table_expression (Transact-SQL).
SELECT <select_criteria>
The query predicate that specifies which data populate the new remote table. For information on the SELECT statement, see SELECT (Transact-SQL).
Permissions
Requires:
SELECT permission on each object in the SELECT clause.
Requires CREATE TABLE permission on the destination SMP database.
Requires ALTER, INSERT, and SELECT permissions on the destination SMP schema.
Error Handling
If copying data to the remote database fails, Analytics Platform System (PDW) aborts the operation, log an error, and attempt to delete the remote table. Analytics Platform System (PDW) does not guarantee a successful cleanup of the new table.
Limitations
Remote Destination Server:
TCP is the default and only supported protocol for connecting to a remote server.
The destination server must be a nonappliance server.
CREATE REMOTE TABLE
cannot be used to copy data from one appliance to another.The
CREATE REMOTE TABLE
statement only creates new tables. Therefore, the new table cannot already exist. The remote database and schema must already exist.The remote server must have space available to store the data that is transferred from the appliance to the SQL Server remote database.
SELECT statement:
The
ORDER BY
andTOP
clauses are not supported in the select criteria.CREATE REMOTE TABLE
cannot be run inside an active transaction or when the AUTOCOMMIT OFF setting is active for the session.
SET ROWCOUNT (Transact-SQL) has no effect on this statement. To achieve a similar behavior, use TOP (Transact-SQL).
Locking behavior
After you create the remote table, the destination table is not locked until the copy starts. Therefore, it is possible for another process to delete the remote table after it is created and before the copy starts. When this occurs, Analytics Platform System (PDW) generates an error and the copy fails.
Metadata
Use sys.dm_pdw_dms_workers (Transact-SQL) to view the progress of copying the selected data to the remote SMP server. Rows with type PARALLEL_COPY_READER contain this information.
Security
CREATE REMOTE TABLE
uses SQL Server Authentication to connect to the remote SQL Server instance; it does not use Windows Authentication.
The Analytics Platform System (PDW) external facing network must be firewalled, with exception of SQL Server ports, administrative ports, and management ports.
To help prevent accidental data loss or corruption, the user account that is used to copy from the appliance to the destination database should have only the minimum required permissions on the destination database.
Connection settings allow you to connect to the SMP SQL Server instance with SSL protecting user name and password data, but with actual data being sent unencrypted in clear text. When this occurs, a malicious user could intercept the CREATE REMOTE TABLE
statement text, which contains the SQL Server user name and password to sign in to the SMP SQL Server instance. To avoid this risk, use data encryption on the connection to the SMP SQL Server instance.
Examples
A. Create a remote table
This example creates a SQL Server SMP remote table called MyOrdersTable
on database OrderReporting
and schema Orders
. The OrderReporting
database is on a server named SQLA
that listens on the default port 1433. Provider your own <user name>
and <strong password>
.
CREATE REMOTE TABLE OrderReporting.Orders.MyOrdersTable
AT ( 'Data Source = SQLA, 1433; User ID = <user name>; Password = <strong password>;' )
AS SELECT <select_criteria>;
B. Query the sys.dm_pdw_dms_workers DMV for remote table copy status
This query shows how to view copy status for a remote table copy.
SELECT * FROM sys.dm_pdw_dms_workers
WHERE type = 'PARALLEL_COPY_READER';
C. Use a query join hint with CREATE REMOTE TABLE
This query shows the basic syntax for using a query join hint with CREATE REMOTE TABLE
. After the query is submitted to the Control node, SQL Server, running on the Compute nodes, will apply the hash join strategy when generating the SQL Server query plan. For more information on join hints and how to use the OPTION clause, see OPTION clause (Transact-SQL).
USE ssawPDW;
CREATE REMOTE TABLE OrderReporting.Orders.MyOrdersTable
AT ( 'Data Source = SQLA, 1433; User ID = <user name>; Password = <strong password>;' )
AS SELECT T1.* FROM OrderReporting.Orders.MyOrdersTable T1
JOIN OrderReporting.Orders.Customer T2
ON T1.CustomerID=T2.CustomerID OPTION (HASH JOIN);