Access external storage using serverless SQL pool in Azure Synapse Analytics
This article describes how users can read data from the files stored on Azure Storage in serverless SQL pool. Users have the following options to access storage:
- OPENROWSET function that enables ad-hoc queries over the files in Azure Storage.
- External table that is a predefined data structure built on top of set of external files.
User can use different authentication methods such as Microsoft Entra passthrough authentication (default for Microsoft Entra principals) and SAS authentication (default for SQL principals).
Query files using OPENROWSET
OPENROWSET enables users to query external files on Azure storage if they have access to the storage. A user who is connected to serverless SQL pool should use the following query to read the content of the files on Azure storage:
SELECT * FROM
OPENROWSET(BULK 'https://<storage_account>.dfs.core.windows.net/<container>/<path>/*.parquet', format= 'parquet') as rows
User can access storage using the following access rules:
- Microsoft Entra user -
OPENROWSET
will use Microsoft Entra identity of caller to access Azure Storage or access storage with anonymous access. - SQL user –
OPENROWSET
will access storage with anonymous access or can be impersonated using SAS token or Managed identity of workspace.
SQL principals can also use OPENROWSET to directly query files protected with SAS tokens or Managed Identity of the workspace. If a SQL user executes this function, a power user with ALTER ANY CREDENTIAL
permission must create a server-scoped credential that matches URL in the function (using storage name and container) and granted REFERENCES permission for this credential to the caller of OPENROWSET function:
EXECUTE AS somepoweruser
CREATE CREDENTIAL [https://<storage_account>.dfs.core.windows.net/<container>]
WITH IDENTITY = 'SHARED ACCESS SIGNATURE', SECRET = 'sas token';
GRANT REFERENCES ON CREDENTIAL::[https://<storage_account>.dfs.core.windows.net/<container>] TO sqluser
If there's no server-level CREDENTIAL that matches the URL, or the SQL user doesn't have references permission for this credential, the error will be returned. SQL principals can't impersonate using some Microsoft Entra identity.
Note
This version of OPENROWSET is designed for quick-and-easy data exploration using default authentication. To leverage impersonation or Managed Identity, use OPENROWSET with DATA_SOURCE described in the next section.
Query data sources using OPENROWSET
OPENROWSET enables user to query the files placed on some external data source:
SELECT * FROM
OPENROWSET(BULK 'file/path/*.parquet',
DATA_SOURCE = MyAzureInvoices,
FORMAT= 'parquet') as rows
The user that executes this query must be able to access the files. The users must be impersonated using SAS token or Managed Identity of workspace if they can't directly access the files using their Microsoft Entra identity or anonymous access.
DATABASE SCOPED CREDENTIAL
specifies how to access files on the referenced data source (currently SAS and Managed Identity). Power user with CONTROL DATABASE
permission would need to create DATABASE SCOPED CREDENTIAL
that will be used to access storage and EXTERNAL DATA SOURCE
that specifies URL of data source and credential that should be used:
EXECUTE AS somepoweruser;
-- Create MASTER KEY if it doesn't exists in database
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'some very strong password';
CREATE DATABASE SCOPED CREDENTIAL AccessAzureInvoices
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '******srt=sco&sp=rwac&se=2017-02-01T00:55:34Z&st=201********' ;
CREATE EXTERNAL DATA SOURCE MyAzureInvoices
WITH ( LOCATION = 'https://<storage_account>.dfs.core.windows.net/<container>/<path>/' ,
CREDENTIAL = AccessAzureInvoices) ;
Caller must have one of the following permissions to execute OPENROWSET function:
- One of the permissions to execute OPENROWSET:
ADMINISTER BULK OPERATIONS
enables login to execute OPENROWSET function.ADMINISTER DATABASE BULK OPERATIONS
enables database scoped user to execute OPENROWSET function.
REFERENCES DATABASE SCOPED CREDENTIAL
to the credential that is referenced inEXTERNAL DATA SOURCE
.
EXTERNAL TABLE
User with the permissions to read table can access external files using an EXTERNAL TABLE created on top of set of Azure Storage folders and files.
User that has permissions to create external table (for example CREATE TABLE and ALTER ANY CREDENTIAL or REFERENCES DATABASE SCOPED CREDENTIAL) can use the following script to create a table on top of Azure Storage data source:
CREATE EXTERNAL TABLE [dbo].[DimProductexternal]
( ProductKey int, ProductLabel nvarchar, ProductName nvarchar )
WITH
(
LOCATION='/DimProduct/year=*/month=*' ,
DATA_SOURCE = AzureDataLakeStore ,
FILE_FORMAT = TextFileFormat
) ;
User that reads data from this table must be able to access the files. The users must be impersonated using SAS token or Managed Identity of workspace if they cannot directly access the files using their Microsoft Entra identity or anonymous access.
DATABASE SCOPED CREDENTIAL specifies how to access files on the referenced data source. User with CONTROL DATABASE permission would need to create DATABASE SCOPED CREDENTIAL that will be used to access storage and EXTERNAL DATA SOURCE that specifies URL of data source and credential that should be used:
EXECUTE AS somepoweruser;
-- Create MASTER KEY if it doesn't exists in database
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'some very strong password';
CREATE DATABASE SCOPED CREDENTIAL cred
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '******srt=sco&sp=rwac&se=2017-02-01T00:55:34Z&st=201********' ;
CREATE EXTERNAL DATA SOURCE AzureDataLakeStore
WITH ( LOCATION = 'https://<storage_account>.dfs.core.windows.net/<container>/<path>' ,
CREDENTIAL = cred
) ;
Read external files with EXTERNAL TABLE
EXTERNAL TABLE enables you to read data from the files that are referenced via data source using standard SQL SELECT statement:
SELECT *
FROM dbo.DimProductsExternal
Caller must have the following permissions to read data:
SELECT
permission ON external tableREFERENCES DATABASE SCOPED CREDENTIAL
permission ifDATA SOURCE
hasCREDENTIAL
Permissions
The following table lists required permissions for the operations listed above.
Query | Required permissions |
---|---|
OPENROWSET(BULK) without datasource | ADMINISTER BULK OPERATIONS , ADMINISTER DATABASE BULK OPERATIONS , or SQL login must have REFERENCES CREDENTIAL::<URL> for SAS-protected storage |
OPENROWSET(BULK) with datasource without credential | ADMINISTER BULK OPERATIONS or ADMINISTER DATABASE BULK OPERATIONS , |
OPENROWSET(BULK) with datasource with credential | REFERENCES DATABASE SCOPED CREDENTIAL and one of ADMINISTER BULK OPERATIONS or ADMINISTER DATABASE BULK OPERATIONS |
CREATE EXTERNAL DATA SOURCE | ALTER ANY EXTERNAL DATA SOURCE and REFERENCES DATABASE SCOPED CREDENTIAL |
CREATE EXTERNAL TABLE | CREATE TABLE , ALTER ANY SCHEMA , ALTER ANY EXTERNAL FILE FORMAT , and ALTER ANY EXTERNAL DATA SOURCE |
SELECT FROM EXTERNAL TABLE | SELECT TABLE and REFERENCES DATABASE SCOPED CREDENTIAL |
CETAS | To create table - CREATE TABLE , ALTER ANY SCHEMA , ALTER ANY DATA SOURCE , and ALTER ANY EXTERNAL FILE FORMAT . To read data: ADMINISTER BULK OPERATIONS or REFERENCES CREDENTIAL or SELECT TABLE per each table/view/function in query + R/W permission on storage |
Next steps
You're now ready to continue on with the following How To articles: