Batch transfer to AWS

Sourav 130 Reputation points
2024-05-26T18:46:10.8933333+00:00

Hello

We need to transfer files in batches from ADLS to an AWS S3 bucket for a SAS application hosted by a third party. How can we transfer the files using ADF over SFTP with AWS Transfer Family?

How do we ensure the data is encrypted in transit if we use the above method?

Thanks!


2 answers

  1. Gowtham CP 6,020 Reputation points Volunteer Moderator
    2024-05-26T19:16:59.34+00:00

    Hello Sourav,

    Thanks for reaching out on Microsoft Q&A!

    To transfer files in batches from Azure Data Lake Storage (ADLS) to an AWS S3 bucket for your SAS application, you have two main options: Azure Data Factory (ADF) with a Copy activity, or Rclone.

    • With ADF, you create a pipeline that orchestrates the transfer, configure the Copy activity with ADLS as the source and an SFTP sink (such as an AWS Transfer Family endpoint fronting the bucket), and schedule the batches with a trigger. Credentials can be kept in Azure Key Vault, and the copy runs over HTTPS/SSH, so data is encrypted in transit.
    • With Rclone, you configure an S3 remote on the command line and run batch copy commands, with options for encryption at rest and in transit; a short sketch is shown below.

    Choose ADF for managed simplicity or Rclone for granular control, and make sure error handling and IAM permissions are configured properly. For more details, refer to the ADF documentation, the Rclone installation guide and S3 remote configuration docs, the Azure Key Vault documentation, and the AWS IAM documentation. If you find this helpful, please accept this answer to close the thread. Thanks!
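    For the Rclone route, here is a minimal sketch that drives a batch copy from Python. The remote names, paths, and bucket below are hypothetical; it assumes rclone is installed and that remotes named adls: and s3: were created beforehand with rclone config:

    ```python
    # Hypothetical sketch: drive an rclone batch copy from Python.
    # Assumes rclone is installed and that remotes "adls:" and "s3:"
    # were configured beforehand with `rclone config`.
    import subprocess

    def rclone_batch_copy(src="adls:container/batch", dst="s3:my-bucket/batch"):
        cmd = [
            "rclone", "copy", src, dst,
            "--transfers", "8",                       # parallel file transfers per batch
            "--checksum",                             # verify each file after copying
            "--s3-server-side-encryption", "AES256",  # encryption at rest in S3
        ]
        # rclone talks to both clouds over HTTPS, so data is TLS-encrypted in transit.
        subprocess.run(cmd, check=True)

    if __name__ == "__main__":
        rclone_batch_copy()
    ```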


  2. AnnuKumari-MSFT 34,556 Reputation points Microsoft Employee Moderator
    2024-05-28T09:12:08.2166667+00:00

    Hi @Sourav,

    Thank you for using the Microsoft Q&A platform and for posting your query here.

    As per my understanding, you want to transfer files from ADLS to an AWS S3 bucket.

    You can consider the following steps to achieve the above requirement:

    • Create a Python script to transfer the files from Blob storage to S3 (a sketch of such a script is shown after the blog link below).
    • Create an Azure Batch account and configure the batch pool.
    • Create an ADF pipeline with a Custom activity, and connect it to Azure Batch to run the data transfer script.

    For detailed steps, kindly go through the following blog post: https://medium.com/litmus7/file-transfer-from-azure-blob-to-aws-s3-step-by-step-guide-9be4b033b8ea
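    The script from the blog is not reproduced here, but a minimal sketch of the same idea, assuming the azure-storage-blob and boto3 SDKs and using placeholder account, container, and bucket names, could look like this:

    ```python
    # Minimal sketch: stream each blob from Azure Blob/ADLS into S3.
    # The connection string, container, and bucket names are placeholders.
    import boto3
    from azure.storage.blob import ContainerClient

    AZURE_CONN_STR = "<azure-storage-connection-string>"
    CONTAINER = "source-container"
    BUCKET = "target-bucket"

    def transfer_blobs_to_s3():
        container = ContainerClient.from_connection_string(AZURE_CONN_STR, CONTAINER)
        s3 = boto3.client("s3")  # AWS credentials come from the environment or an IAM role
        for blob in container.list_blobs():
            data = container.download_blob(blob.name).readall()
            # Both SDKs use HTTPS by default, so the transfer is TLS-encrypted in transit.
            s3.put_object(Bucket=BUCKET, Key=blob.name, Body=data)
            print(f"copied {blob.name}")

    if __name__ == "__main__":
        transfer_blobs_to_s3()
    ```

    In the setup above, this script would typically run on the Azure Batch pool nodes, with the ADF Custom activity triggering it for each batch.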

    Additionally, in case you want to go with the AWS Transfer Family approach, you can set up an SFTP server and add a user with an SSH public key, then use that configuration to create an SFTP connection from ADF that writes directly to the S3 bucket.
    Kindly watch the following video: Easy Step by Step Guide for Beginner Setup AWS Transfer Family - SFTP with S3
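    If you want to smoke-test the Transfer Family endpoint before wiring it into an ADF SFTP linked service, a small Python check using paramiko (the host, user, and key path below are placeholders) might look like this:

    ```python
    # Hypothetical smoke test for an AWS Transfer Family SFTP endpoint.
    # Host, user, and key path are placeholders for your own setup.
    import paramiko

    HOST = "s-xxxxxxxx.server.transfer.us-east-1.amazonaws.com"
    USER = "sftp-user"
    KEY_PATH = "/path/to/private_key.pem"

    def sftp_upload(local_path="sample.csv", remote_path="/target-bucket/sample.csv"):
        key = paramiko.RSAKey.from_private_key_file(KEY_PATH)
        transport = paramiko.Transport((HOST, 22))
        transport.connect(username=USER, pkey=key)  # SSH handshake authenticates the server
        sftp = paramiko.SFTPClient.from_transport(transport)
        sftp.put(local_path, remote_path)           # upload lands in the mapped S3 bucket
        sftp.close()
        transport.close()

    if __name__ == "__main__":
        sftp_upload()
    ```

    Because SFTP runs over SSH, this path also covers the encryption-in-transit part of your question.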

    Thank you.

    Hope it helps.

