Query : Data Ingestion methods

Sourav 130 Reputation points
2023-07-18T08:50:51.7133333+00:00

Hello-

We have a requirement to ingest the data from an external third-party which will contain sensitive data. The source database is SQL database on the third-party and the destination is our Azure Storage with private network/endpoint. We need to connect to third-party database and query the required data and copy/bring it to our organization azure storage.

  1. How can we do it ? Could you please provide the steps.
  2. We need to ensure data security - What encryption method we can use and how so that data is encrypted at rest and in-transit.
  3. Can we include data governance (Microsoft Purview)- How ?
  4. If the storage account is not public how can we connect securely with private connectivity.
  5. Is it possible to automate the export of the required data into an excel or csv by the third-party at the source database and sent to our storage using SFTP ?
  6. What are the other options - Like ADF pipeline could be one option but connectivity, encryption, security we need to consider.

Please provide the answer in a way that it gives clear picture of the data flow and connectivity methods.

Thanks in advance.

Azure SQL Database
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
11,625 questions
{count} votes

3 answers

Sort by: Most helpful
  1. Vahid Ghafarpour 23,385 Reputation points Volunteer Moderator
    2023-07-18T16:39:08.2433333+00:00

    To ingest data from an external third-party SQL database into your Azure Storage with a private network/endpoint, you can follow these steps suggested by ChatGPT:

    Establish Connectivity:

    • Set up a virtual network (VNet) in Azure that will act as a secure communication channel between your organization and the third-party database.
    • Configure a virtual network gateway (VPN Gateway or Azure ExpressRoute) to establish a secure connection between your VNet and the third-party network. This will enable private connectivity.

    Secure Data Transfer:

    • Use Azure Data Factory (ADF) to create a pipeline for data ingestion. ADF provides data integration capabilities and supports secure connectivity to various data sources, including SQL databases.
    • Configure the ADF pipeline to connect to the third-party SQL database using the appropriate connection details (e.g., server name, credentials, database name).
    • Ensure that the ADF pipeline uses secure protocols (e.g., SSL/TLS) to encrypt the data in transit while transferring it from the third-party database to your Azure Storage.

    Data Encryption at Rest:

    • Configure your Azure Storage account to enable server-side encryption. Azure Storage supports encryption at rest using Azure Storage Service Encryption (SSE).
    • With SSE enabled, Azure Storage automatically encrypts your data before persisting it to disk and decrypts it when accessed.

    Data Governance with Microsoft Purview:

    • Microsoft Purview is a data governance solution that helps you discover, understand, and manage your data assets across your organization.
    • You can integrate Microsoft Purview with your Azure Storage to gain visibility and control over the ingested data. Purview provides features like data cataloging, data classification, and lineage tracking.

    Secure Connectivity to Private Storage Account:

    • If your Azure Storage account is not publicly accessible, you can establish private connectivity using one of the following methods:
    • Access from within the VNet: Ensure that the ADF pipeline and other components accessing the storage are deployed within the same VNet.
    • Azure Private Link: Configure a Private Link service for your storage account to enable access over a private endpoint within the VNet.
    • VNet service endpoints: Use VNet service endpoints to secure access to Azure Storage from your VNet without going through the public internet.

    Automated Export via SFTP:

    • To automate the export of data from the third-party database to your Azure Storage, you can consider using SFTP (Secure File Transfer Protocol) as the transfer mechanism.
    • Work with the third-party to set up an SFTP server where they can export the required data in Excel or CSV format.
    • Configure your ADF pipeline to connect to the SFTP server and retrieve the files. The pipeline can then transfer the files to your Azure Storage.

    Alternative Options:

    • ADF Pipeline: As mentioned earlier, ADF provides robust data integration capabilities, making it a suitable option for data ingestion and transformation.
    • Azure Logic Apps: Logic Apps allows you to create workflows for integrating and orchestrating data movement across various systems, including SQL databases and Azure Storage.
    • Azure Synapse Analytics: If you require advanced analytics and data warehousing capabilities, Azure Synapse Analytics provides an integrated environment to ingest, prepare, manage, and serve data for reporting and analytics purposes.

    By following these steps, you can securely ingest data from a third-party SQL database into your Azure Storage while ensuring data encryption, governance, and connectivity with a private network/endpoint.


  2. AnnuKumari-MSFT 34,556 Reputation points Microsoft Employee Moderator
    2023-08-07T06:10:45.43+00:00

    Hi Sourav ,

    I understand that you are ingesting data from SQL database into Azure Storage account with private network/endpoint and you want to ensure data is secured and encrypted. Please let me know if that is not the ask here.

    Data ingestion: To ingest data from an external third-party SQL database into Azure Storage, you can use Azure Data Factory (ADF) or Azure Synapse Analytics. Both services provide data integration capabilities to move data from various sources to Azure Storage.

    Data Security : To ensure data security, you can use Azure Storage Service Encryption (SSE) to encrypt data at rest. For more details, kindly check the following documentation: Azure Storage encryption for data at rest

    To encrypt data in transit, you can use Azure Private Link for Azure Data Factory and Use private endpoints for Azure Storage

    Data Governance using Microsoft Purview: You can create Automated lineage in Microsoft purview from Azure Data Factory

    Azure Synapse and Azure Purview Integration

    Secure Connectivity with Private Endpoint: How to create azure storage account with private endpoint

    Automating Data Export using SFTP: You can use FTP connector in ADF to create the pipeline and schedule the pipeline using triggers in ADF. Kindly go through the following video: How to Download file from FTP in ADF

    Hope it helps. Let me know if you have any additional queries. In case this answer helped, kindly accept it by clicking on Accept answer button. Thankyou


  3. Sourav 130 Reputation points
    2023-08-07T10:21:10.23+00:00

    Please have a look at my last comment and I want answer to the specific queries.

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.