How to securely copy data from AWS S3 "Requester Pays" buckets to ADLS using ADF Copy Activity?

✔MojiTMJ 690 Reputation points
2025-05-28T10:26:32.6833333+00:00

I'm facing an issue when trying to copy data from an AWS S3 bucket (with "Requester Pays" enabled) to Azure Data Lake Storage (ADLS) using Azure Data Factory (ADF) Copy Activity.

The problem is that ADF currently doesn’t support setting the x-amz-request-payer header required for accessing "Requester Pays" buckets. As a result, the pipeline fails when trying to read data from the source.

Has anyone found a secure and automated workaround for this scenario? I’m looking for a solution that avoids manual intervention and can be integrated into a production data pipeline.

Any guidance or suggestions would be greatly appreciated.

#aws #s3 #azureDataLakeStorage #azureDataFactory #ADLS #requesterPays

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

1 answer

J N S S Kasyap 3,625 Reputation points Microsoft External Staff Moderator
2025-05-28T11:12:28.3366667+00:00

Hi @✔MojiTMJ

You've hit a real limitation: the ADF Amazon S3 connector currently doesn't allow setting custom request headers such as x-amz-request-payer, which AWS requires on every request to a "Requester Pays" bucket. As a result, the Copy Activity fails when reading from the source.

However, here are some workarounds you can consider:

Use AWS Lambda

You can set up an AWS Lambda function that copies the requested files from S3 (sending the Requester Pays header) into a staging S3 bucket that you control; ADF can then read that bucket with the standard S3 connector, free of the Requester Pays restriction. The function can be triggered as part of your data pipeline; a minimal sketch follows.
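Here is a minimal sketch of such a function in Python with boto3. The bucket names and the event shape are assumptions for illustration; adapt them to your setup.

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket names -- replace with your own.
SOURCE_BUCKET = "vendor-requester-pays-bucket"  # Requester Pays enabled
STAGING_BUCKET = "my-staging-bucket"            # a bucket you control


def lambda_handler(event, context):
    """Copy each requested key into the staging bucket.

    Assumes an event like {"keys": ["path/file1.csv", ...]};
    adapt the shape to whatever triggers this function.
    """
    keys = event.get("keys", [])
    for key in keys:
        # RequestPayer="requester" makes boto3 send the
        # x-amz-request-payer header that ADF cannot set itself.
        s3.copy_object(
            Bucket=STAGING_BUCKET,
            Key=key,
            CopySource={"Bucket": SOURCE_BUCKET, "Key": key},
            RequestPayer="requester",
        )
    return {"copied": len(keys)}
```

Once the objects land in the staging bucket, a regular ADF Copy Activity with the standard Amazon S3 connector can move them to ADLS.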

Use Self-Hosted Integration Runtime

Another method is to use a Self-Hosted Integration Runtime (IR) to access the S3 bucket directly. You would write a small custom script (e.g., in Python) that handles the S3 file transfers with the appropriate Requester Pays flag and uploads the results to ADLS, running on the VM where your Self-Hosted IR is installed. A sketch of such a script is below.
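A sketch of what that script could look like, assuming Python with boto3 and the azure-storage-file-datalake package; all resource names and the credential are placeholders:

```python
import boto3
from azure.storage.filedatalake import DataLakeServiceClient

# Placeholder names -- substitute your own resources and credentials.
S3_BUCKET = "vendor-requester-pays-bucket"
ADLS_ACCOUNT_URL = "https://mystorageaccount.dfs.core.windows.net"
ADLS_FILESYSTEM = "raw"
ADLS_CREDENTIAL = "<account-key-or-token-credential>"

s3 = boto3.client("s3")
adls = DataLakeServiceClient(account_url=ADLS_ACCOUNT_URL, credential=ADLS_CREDENTIAL)
filesystem = adls.get_file_system_client(ADLS_FILESYSTEM)


def transfer(key: str) -> None:
    """Copy one S3 object into ADLS, paying for the request as the requester."""
    # RequestPayer="requester" sets the x-amz-request-payer header.
    obj = s3.get_object(Bucket=S3_BUCKET, Key=key, RequestPayer="requester")
    file_client = filesystem.get_file_client(key)
    file_client.upload_data(obj["Body"].read(), overwrite=True)


if __name__ == "__main__":
    transfer("path/to/file.csv")
```

Note that obj["Body"].read() pulls the whole object into memory; for large files you would want to download to disk or stream in chunks instead.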

Use Azure Functions or Logic Apps

Consider creating an Azure Function that your ADF pipeline calls (for example, with a Web activity or an Azure Function activity), which then handles copying the data from S3 to ADLS. This supports custom logic and can be automated to work seamlessly within your production workflow; see the sketch after this paragraph.
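As an illustration, an HTTP-triggered Azure Function (Python v2 programming model) that an ADF Web activity could call might look like the following; the route, bucket, and credential values are hypothetical, and in production you would pull secrets from Key Vault rather than hard-coding them:

```python
import boto3
import azure.functions as func
from azure.storage.filedatalake import DataLakeServiceClient

app = func.FunctionApp()

# Placeholder resource names -- replace with your own.
S3_BUCKET = "vendor-requester-pays-bucket"
ADLS_ACCOUNT_URL = "https://mystorageaccount.dfs.core.windows.net"


@app.route(route="copy-from-s3", auth_level=func.AuthLevel.FUNCTION)
def copy_from_s3(req: func.HttpRequest) -> func.HttpResponse:
    """Endpoint an ADF Web activity can call with ?key=<object-key>."""
    key = req.params.get("key")
    if not key:
        return func.HttpResponse("Missing 'key' parameter", status_code=400)

    s3 = boto3.client("s3")
    # The Requester Pays header that ADF cannot set on its own:
    obj = s3.get_object(Bucket=S3_BUCKET, Key=key, RequestPayer="requester")

    adls = DataLakeServiceClient(
        account_url=ADLS_ACCOUNT_URL, credential="<credential>"
    )
    file_client = adls.get_file_system_client("raw").get_file_client(key)
    file_client.upload_data(obj["Body"].read(), overwrite=True)

    return func.HttpResponse(f"Copied {key}", status_code=200)
```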

If this answers your query, please click Accept Answer and Yes for "Was this answer helpful". If you have any further questions, do let us know.

