A serverless SQL pool query reads files directly from Azure Storage. Permissions to access the files on Azure Storage are controlled at two levels:

- **Storage level** - The user needs permission to access the underlying storage files, granted either through an Azure RBAC role or through fine-grained ACLs.
- **SQL pool level** - The user needs permission to call the `OPENROWSET` function or to read external tables, and permission to use the credential that is used to access storage. The required permissions are described later in this article.

This article describes the types of credentials you can use and how credential lookup is performed for SQL and Microsoft Entra users.
A serverless SQL pool in an Azure Synapse Analytics workspace can read the content of files stored in Azure Data Lake Storage. You need to configure permissions on storage so that a user who executes a SQL query can read the files. There are three methods for enabling access to the files, described in the sections that follow.
As an alternative, you can make your files publicly available by allowing anonymous access. This approach should NOT be used if you have nonpublic data.
A user who has logged into a serverless SQL pool must be authorized to access and query the files in Azure Storage if the files aren't publicly available. You can use four authorization types to access nonpublic storage: user identity, shared access signature, service principal, and managed identity.
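As a hedged summary sketch, the four authorization types map to the following credential `IDENTITY` values. The credential names, AppID, authority URL, and SAS token below are placeholders, not values from this article:

```sql
-- User identity (Microsoft Entra pass-through): no credential object is needed;
-- the signed-in user's own identity is used against storage.

-- Shared access signature:
CREATE DATABASE SCOPED CREDENTIAL SasCred
WITH IDENTITY = 'SHARED ACCESS SIGNATURE', SECRET = '<SAS token>';

-- Service principal:
CREATE DATABASE SCOPED CREDENTIAL SpnCred
WITH IDENTITY = '<AppID>@<AuthorityUrl>', SECRET = '<Secret>';

-- Managed identity of the workspace:
CREATE DATABASE SCOPED CREDENTIAL MsiCred
WITH IDENTITY = 'Managed Identity';
```

Each of these forms is shown in context in the examples later in this article.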
Note
Microsoft Entra pass-through is the default behavior when you create a workspace.
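As a minimal sketch of pass-through (with placeholder storage account, container, and path), a Microsoft Entra user can query files without creating any credential object, because the caller's own identity is used to authorize storage access:

```sql
-- Microsoft Entra pass-through: no credential object is required.
-- The signed-in user's identity must itself have permission on the files
-- (for example, a Storage Blob Data Reader role) for this query to succeed.
SELECT TOP 10 *
FROM OPENROWSET(
        BULK 'https://<storage_account>.dfs.core.windows.net/<container>/<path>/*.parquet',
        FORMAT = 'PARQUET') AS rows;
```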
A service principal is the local representation of a global application object in a particular Microsoft Entra tenant. This authentication method is appropriate in cases where storage access is to be authorized for a user application, service, or automation tool. For more information on service principals in Microsoft Entra ID, see Application and service principal objects in Microsoft Entra ID.
The application needs to be registered in Microsoft Entra ID. For more information on the registration process, follow Quickstart: Register an application with the Microsoft identity platform. Once the application is registered, its service principal can be used for authorization.
The service principal should be assigned the Storage Blob Data Owner, Storage Blob Data Contributor, or Storage Blob Data Reader role so that the application can access the data. Even if the service principal is the Owner of a storage account, it still needs to be granted an appropriate Storage Blob Data role. Alternatively, you can define fine-grained ACL rules for the service principal to grant access to storage files and folders.
To learn more about access control in Azure Data Lake Store Gen2, review Access control in Azure Data Lake Storage Gen2.
In cases when Azure Storage is in a different tenant from the Synapse serverless SQL pool, authorization via service principal is the recommended method. SAS authorization is also possible; managed identity is not supported.
| Authorization type | Firewall-protected storage | Non-firewall-protected storage |
|---|---|---|
| SAS | Supported | Supported |
| Service principal | Not supported | Supported |
Note

If Azure Storage is protected by an Azure Storage firewall, service principal is not supported. Use a shared access signature (SAS) instead.
The following table provides available Azure Storage authorization types for different sign-in methods into an Azure Synapse Analytics serverless SQL endpoint:
| Authorization type | SQL user | Microsoft Entra user | Service principal |
|---|---|---|---|
| User identity | Not supported | Supported | Supported |
| SAS | Supported | Supported | Supported |
| Service principal | Supported | Supported | Supported |
| Managed identity | Supported | Supported | Supported |
You can use the following combinations of authorization types and Azure Storage types:
| Authorization type | Blob Storage | ADLS Gen1 | ADLS Gen2 |
|---|---|---|---|
| SAS | Supported | Not supported | Supported |
| Service principal | Supported | Supported | Supported |
| Managed identity | Supported | Supported | Supported |
| User identity | Supported | Supported | Supported |
You can configure storage accounts to allow access from a specific serverless SQL pool by creating a resource instance rule. When you access storage that is protected by the firewall, use user identity or managed identity.
Note
The firewall feature on Azure Storage is in public preview and is available in all public cloud regions.
The following table provides available firewall-protected Azure Storage authorization types for different sign-in methods into an Azure Synapse Analytics serverless SQL endpoint:
| Authorization type | SQL user | Microsoft Entra user | Service principal |
|---|---|---|---|
| User identity | Not supported | Supported | Supported |
| SAS | Not supported | Not supported | Not supported |
| Service principal | Not supported | Not supported | Not supported |
| Managed identity | Supported | Supported | Supported |
Service principal cannot be used to access firewall-protected storage. Use a managed service identity instead.
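A minimal sketch of that approach follows; the credential name `WorkspaceIdentity` is an illustrative choice, and the workspace's managed identity must already have permission on the storage account:

```sql
-- A database-scoped credential that uses the workspace managed identity.
-- Grant that managed identity an appropriate Storage Blob Data role on the
-- firewall-protected account so queries can read the files.
CREATE DATABASE SCOPED CREDENTIAL WorkspaceIdentity
WITH IDENTITY = 'Managed Identity';
```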
To query a file located in Azure Storage, your serverless SQL pool endpoint needs a credential that contains the authentication information. Two types of credentials are used:

- A **server-level credential** is used when a SQL login calls the `OPENROWSET` function without a `DATA_SOURCE`. The credential name must match the storage URL.
- A **database-scoped credential** is used when a principal calls `OPENROWSET` with a `DATA_SOURCE` or selects data from an external table. The external `DATA SOURCE` is defined with the credential that should be used to access storage.

To grant the ability to manage credentials:
- To allow a user to create or drop a server-level credential, an administrator must grant the `ALTER ANY CREDENTIAL` permission to the user's login in the `master` database. For example:

```sql
GRANT ALTER ANY CREDENTIAL TO [login_name];
```
- To allow a user to create or drop a database-scoped credential, an administrator must grant the `CONTROL` permission on the database to the database user in the user database. For example:

```sql
GRANT CONTROL ON DATABASE::[database_name] TO [user_name];
```
Database users who access external storage must have permission to use credentials. To use a credential, a user must have the `REFERENCES` permission on that specific credential.

To grant the `REFERENCES` permission on a server-level credential to a login, use the following T-SQL query in the `master` database:

```sql
GRANT REFERENCES ON CREDENTIAL::[server-level_credential] TO [login_name];
```

To grant the `REFERENCES` permission on a database-scoped credential to a database user, use the following T-SQL query in the user database:

```sql
GRANT REFERENCES ON DATABASE SCOPED CREDENTIAL::[database-scoped_credential] TO [user_name];
```
Server-level credentials are used when a SQL login calls the `OPENROWSET` function without a `DATA_SOURCE` to read files on a storage account. The name of the server-level credential must match the base URL of Azure Storage, optionally followed by a container name. A credential is added by running `CREATE CREDENTIAL`. You must provide the `CREDENTIAL NAME` argument.
Note

The `FOR CRYPTOGRAPHIC PROVIDER` argument is not supported.

A server-level credential name must match the following format: `<prefix>://<storage_account_path>[/<container_name>]`. Storage account paths are described in the following table:
| External data source | Prefix | Storage account path |
|---|---|---|
| Azure Blob Storage | https | `<storage_account>.blob.core.windows.net` |
| Azure Data Lake Storage Gen1 | https | `<storage_account>.azuredatalakestore.net/webhdfs/v1` |
| Azure Data Lake Storage Gen2 | https | `<storage_account>.dfs.core.windows.net` |
Server-level credentials can then be used to access Azure Storage with the supported authentication types.

The following script creates a server-level credential that can be used to access files in storage by using a service principal for authentication and authorization. The AppID can be found by opening App registrations in the Azure portal and selecting the app that requests storage access. The secret is obtained during app registration. The AuthorityUrl is the URL of the Microsoft Entra ID OAuth 2.0 authority.

```sql
CREATE CREDENTIAL [https://<storage_account>.dfs.core.windows.net/<container>]
WITH IDENTITY = '<AppID>@<AuthorityUrl>'
, SECRET = '<Secret>';
```
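A server-level credential can also hold a shared access signature. The following is a hedged sketch: the SAS token is a placeholder, and the credential name must still match the storage URL:

```sql
-- Server-level credential that authorizes access with a SAS token.
-- IDENTITY must be the literal string 'SHARED ACCESS SIGNATURE';
-- SECRET is the SAS token without the leading '?'.
CREATE CREDENTIAL [https://<storage_account>.blob.core.windows.net/<container>]
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '<SAS token>';
```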
Database-scoped credentials are used when any principal calls the `OPENROWSET` function with a `DATA_SOURCE` or selects data from an external table that doesn't access public files. The database-scoped credential doesn't need to match the name of the storage account; it's referenced in the `DATA SOURCE` that defines the location of storage.

Database-scoped credentials enable access to Azure Storage with the supported authentication types.
The following script creates a database-scoped credential that can be used to access files in storage by using a service principal for authentication and authorization. The AppID can be found by opening App registrations in the Azure portal and selecting the app that requests storage access. The secret is obtained during app registration. The AuthorityUrl is the URL of the Microsoft Entra ID OAuth 2.0 authority.

```sql
-- Optional: Create a MASTER KEY if one doesn't exist in the database:
-- CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<Very Strong Password>';
CREATE DATABASE SCOPED CREDENTIAL [<CredentialName>] WITH
    IDENTITY = '<AppID>@<AuthorityUrl>'
    , SECRET = '<Secret>';
GO
CREATE EXTERNAL DATA SOURCE MyDataSource
WITH ( LOCATION = 'https://<storage_account>.dfs.core.windows.net/<container>/<path>',
       CREDENTIAL = [<CredentialName>]
);
```
Database-scoped credentials are used in external data sources to specify the authentication method used to access storage:

```sql
CREATE EXTERNAL DATA SOURCE mysample
WITH ( LOCATION = 'https://<storage_account>.dfs.core.windows.net/<container>/<path>',
       CREDENTIAL = <name of database scoped credential>
);
```
Use the following script to create a table that accesses a publicly available data source.

```sql
CREATE EXTERNAL FILE FORMAT [SynapseParquetFormat]
WITH ( FORMAT_TYPE = PARQUET );
GO
CREATE EXTERNAL DATA SOURCE publicData
WITH ( LOCATION = 'https://<storage_account>.dfs.core.windows.net/<public_container>/<path>' );
GO
CREATE EXTERNAL TABLE dbo.userPublicData ( [id] int, [first_name] varchar(8000), [last_name] varchar(8000) )
WITH ( LOCATION = 'parquet/user-data/*.parquet',
       DATA_SOURCE = [publicData],
       FILE_FORMAT = [SynapseParquetFormat] );
```
A database user can read the content of the files from the data source by using an external table or the `OPENROWSET` function that references the data source:

```sql
SELECT TOP 10 * FROM dbo.userPublicData;
GO
SELECT TOP 10 * FROM OPENROWSET(BULK 'parquet/user-data/*.parquet',
                                DATA_SOURCE = 'publicData',
                                FORMAT = 'PARQUET') AS rows;
GO
```
Modify the following script to create an external table that accesses Azure Storage by using a SAS token, the Microsoft Entra identity of a user, or the managed identity of a workspace.

```sql
-- Create a master key in the database with some password (one-off per database)
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';
GO

-- Create database-scoped credentials that use managed identity, a SAS token, or a service principal.
-- You only need to create the credentials that will be used to access the data source:
CREATE DATABASE SCOPED CREDENTIAL WorkspaceIdentity
WITH IDENTITY = 'Managed Identity';
GO
CREATE DATABASE SCOPED CREDENTIAL SasCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE', SECRET = 'sv=2019-10-1********ZVsTOL0ltEGhf54N8KhDCRfLRI%3D';
GO
CREATE DATABASE SCOPED CREDENTIAL SPNCredential WITH
IDENTITY = '**44e*****8f6-ag44-1890-34u4-22r23r771098@https://login.microsoftonline.com/**do99dd-87f3-33da-33gf-3d3rh133ee33/oauth2/token'
, SECRET = '.7OaaU_454azar9WWzLL.Ea9ePPZWzQee~';
GO

-- Create a data source that uses one of the credentials above, an external file format,
-- and external tables that reference this data source and file format:
CREATE EXTERNAL FILE FORMAT [SynapseParquetFormat] WITH ( FORMAT_TYPE = PARQUET );
GO
CREATE EXTERNAL DATA SOURCE mysample
WITH ( LOCATION = 'https://<storage_account>.dfs.core.windows.net/<container>/<path>'
-- Uncomment one of these options depending on the authentication method that you want to use:
--,CREDENTIAL = WorkspaceIdentity
--,CREDENTIAL = SasCredential
--,CREDENTIAL = SPNCredential
);
GO
CREATE EXTERNAL TABLE dbo.userData ( [id] int, [first_name] varchar(8000), [last_name] varchar(8000) )
WITH ( LOCATION = 'parquet/user-data/*.parquet',
       DATA_SOURCE = [mysample],
       FILE_FORMAT = [SynapseParquetFormat] );
```
A database user can read the content of the files from the data source by using an external table or the `OPENROWSET` function that references the data source:

```sql
SELECT TOP 10 * FROM dbo.userData;
GO
SELECT TOP 10 * FROM OPENROWSET(BULK 'parquet/user-data/*.parquet', DATA_SOURCE = 'mysample', FORMAT = 'PARQUET') AS rows;
GO
```
These articles help you learn how to query different folder types and file types, and how to create and use views.