Best practices to manage a Synapse Serverless SQL pool database

pmscorca 1,052 Reputation points
2024-02-20T21:19:52.3+00:00

Hi,

in order to have a good and repeteable practice about the managing a Synapse Serverless SQL pool db, to apply for more projects, I think at these points:

  1. for creating a database --> CREATE DATABASE myserverlesssqlpool;
  2. for creating a master key for the db --> CREATE MASTER KEY (without any passwords? or specifying a password, but where does it save in a secure manner?);
  3. for creating a db credential --> CREATE DATABASE SCOPED CREDENTIAL my_cred_synapse WITH IDENTITY = 'MANAGED IDENTITY' (I prefer this setting for IDENTITY);

and so on.

Any suggests, please? Thanks

Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
5,378 questions
{count} votes

1 answer

Sort by: Most helpful
  1. Debarchan Sarkar - MSFT 1,131 Reputation points Microsoft Employee
    2024-02-21T01:44:45.0066667+00:00

    Creating a database in Synapse Serverless SQL pool is different from traditional SQL Server since the databases are actually stored in Azure storage (Data Lake or Blob Storage), and you don't use CREATE DATABASE syntax. In Synapse, you will directly query against those data stores. Here are some best practices for managing Synapse Serverless SQL pools: Access Data: Use OPENROWSET to access your data stored in Azure storage services. You can create an external table as well, but with serverless SQL pool, it's not necessary.

    
    SELECT * 
    
    FROM OPENROWSET(
    
        BULK 'https://myaccount.dfs.core.windows.net/myfilesystem/myfile.csv',
    
        FORMAT = 'CSV'
    
    ) AS result
    
    

    Authentication: For authenticating, you typically use Managed Identity. Managed identities provide an Azure Active Directory identity for your Synapse workspace, simplifying password management.

    
    CREATE DATABASE SCOPED CREDENTIAL [MyManagedIdentity]
    
    WITH IDENTITY = 'Managed Identity';
    
    

    Data Lake Firewall: If your Data Lake has firewall enabled, remember to add your Synapse workspace's managed private endpoint to the firewall allow list. Parquet Format: When dealing with large datasets, consider using Parquet format which provides both a performance boost and cost savings. Use Views: To simplify access to multiple files or to hide complexity of your data lake, use views. They are easy to manage and can be secured using ACLs. Optimize File Size: When writing data back to the data lake, try to optimize the size of output files. Aim for larger files (between 256 MB and 1 GB) to improve read performance. Securing Data: Sensitive data should be secured. Consider using static data masking or dynamic data masking to hide sensitive data. Always encrypt data at rest and in transit. Cost Management: To manage costs, monitor and analyse your usage regularly. Also, set up alerts for when you reach certain budget thresholds.

    Remember that these are general guidelines and the specifics may vary depending on the details of your project. Let me know if you need more information on this!


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.