Migrate binary format data from on-prem to Azure Blob Storage (.pdf files) using ADF

jyo mdb 0 Reputation points
2023-11-29T03:56:48.35+00:00

I am migrating data from an on-prem SQL Server to Azure Blob Storage. One of the columns contains binary data, and once we move the data to Blob Storage it should be stored as files with .pdf, .xlsx, and .txt extensions. How would we achieve this scenario? I would like to know what possibilities we have.

Note: It's only a one-time move.

Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
SQL Server Other

2 answers

  1. Q&A Assist
    2023-11-29T03:56:48.4866667+00:00
    AI-generated content. This question contains an answer created with AI from Q&A Assist.

    To migrate binary format data from on-prem SQL Server to Azure Blob Storage using Azure Data Factory (ADF), you can use a Binary dataset in the Copy activity. When using a Binary dataset, the service does not parse file content but treats it as-is. This means that the binary data in your column will be preserved in the migration process. Additionally, you can specify the file extensions you want to store in Azure Blob Storage in the Sink dataset of the Copy activity.

    Here are the high-level steps to achieve this scenario:

    1. Create a Binary dataset in ADF that points to your on-prem SQL Server database.
    2. Create a Sink dataset in ADF that points to your Azure Blob Storage and specifies the file extensions you want to store.
    3. Create a Copy activity in ADF that uses the Binary dataset as the Source and the Sink dataset as the Sink.
    4. Run the Copy activity to migrate the data from on-prem SQL Server to Azure Blob Storage.
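    Since the question asks for the files to land with .pdf, .xlsx, and .txt extensions, the extension choice mentioned in step 2 can be illustrated with a small sketch. This is not part of ADF itself; it is a hypothetical helper that infers an extension from well-known magic bytes, and the function name is an assumption for illustration:

    ```python
    def guess_extension(data: bytes) -> str:
        """Infer a file extension from well-known magic bytes.

        Covers the formats mentioned in the question; anything
        unrecognized defaults to .txt. Illustrative only.
        """
        if data.startswith(b"%PDF"):               # PDF header
            return ".pdf"
        if data.startswith(b"PK\x03\x04"):         # ZIP container: .xlsx and other modern Office files
            return ".xlsx"
        if data.startswith(b"\xd0\xcf\x11\xe0"):   # legacy OLE2 container: .xls, .doc
            return ".xls"
        return ".txt"

    print(guess_extension(b"%PDF-1.7 ..."))        # .pdf
    ```

    Logic like this could run wherever blob names are decided, for example in a pre-processing script or a custom activity.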

    For more information on using Binary datasets in ADF, see the "Binary format in Azure Data Factory and Synapse Analytics" document. For more information on using ADF to migrate data to Azure Blob Storage, see the "Use Azure Data Factory to migrate data from Amazon S3 to Azure Storage" document.

  2. Anand Prakash Yadav 7,855 Reputation points Microsoft External Staff
    2023-11-29T13:25:15.32+00:00

    Hello jyo mdb,

    Thank you for posting your query here!

    I understand you are trying to migrate binary format data from an on-premises SQL Server to Azure Blob Storage in .pdf format using Azure Data Factory (ADF). You can follow these steps:

    • Configure linked services in ADF for your on-premises SQL Server and Azure Blob Storage.
    • Set up a self-hosted integration runtime on an on-premises machine to connect to your SQL Server.
    • Define datasets in ADF to represent your on-premises SQL Server data and your Azure Blob Storage.
    • Create a new pipeline in ADF that will orchestrate the data movement.
    • Inside the pipeline, add a "Copy Data" activity.
    • Configure the source settings to connect to your on-premises SQL Server dataset.
    • Configure the sink settings to connect to your Azure Blob Storage dataset.
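    Because this is a one-time move, the same steps could also be scripted outside ADF. The following is a minimal sketch assuming the `pyodbc` and `azure-storage-blob` packages; the table and column names (`dbo.Documents`, `FileName`, `Content`) are placeholders, not from the thread:

    ```python
    def export_rows(rows, upload):
        """Upload each (file_name, payload) pair via the given callable.

        rows: iterable of (str, bytes-like); upload: callable(name, bytes).
        Returns the number of rows exported.
        """
        n = 0
        for name, payload in rows:
            upload(name, bytes(payload))
            n += 1
        return n

    def run_migration(sql_conn_str, blob_conn_str, container):
        # Imports are local so the pure logic above stays dependency-free.
        import pyodbc                                    # assumed ODBC driver package
        from azure.storage.blob import BlobServiceClient

        conn = pyodbc.connect(sql_conn_str)
        cur = conn.cursor()
        # Hypothetical table: Documents(FileName NVARCHAR, Content VARBINARY(MAX))
        cur.execute("SELECT FileName, Content FROM dbo.Documents")

        svc = BlobServiceClient.from_connection_string(blob_conn_str)
        cc = svc.get_container_client(container)
        return export_rows(cur, lambda name, data: cc.upload_blob(name, data, overwrite=True))
    ```

    Separating the upload loop from the connection wiring keeps the logic testable without a live SQL Server or storage account.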

    Reference: https://learn.microsoft.com/en-us/azure/data-factory/tutorial-hybrid-copy-portal

    Please note that the process of transforming binary data to PDF format is not directly handled by the Azure Data Factory (ADF) Copy Data activity.

    If you need to transform binary data into PDF format during the migration, you'll need to incorporate an additional step.
    You may implement an Azure Function that takes the binary data as input and outputs a PDF file. Then add an "Azure Function" activity to your ADF pipeline and configure it to call the Azure Function you created.

    Kindly let us know if you have any further queries. I’m happy to assist you further.


    Please do not forget to "Accept the answer" and "up-vote" wherever the information provided helps you, as this can be beneficial to other community members.

