Is there an Airflow operator for transferring files from GCS to Azure Blob Storage?

Nandini T 0 Reputation points
2024-01-30T12:14:38.03+00:00

I want to transfer a file from a GCS bucket to Azure Blob Storage via Apache Airflow. Is there an Airflow operator for that? Any suggestions would be helpful.

Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.

1 answer

  1. Sumarigo-MSFT 47,471 Reputation points Microsoft Employee Moderator
    2024-01-30T14:57:51.5633333+00:00

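    @Nandini T Welcome to the Microsoft Q&A Forum, and thank you for posting your query here! Apache Airflow is an open-source tool that can copy files between GCS and Azure Blob Storage. Note that the transfer operator documented at the following link covers the Azure Blob Storage to GCS direction, which is the reverse of what you asked for: https://airflow.apache.org/docs/apache-airflow-providers-microsoft-azure/stable/operators/azure_blob_to_gcs.html On the Azure side, you can use Azure Data Factory (ADF), which has a Google Cloud Storage connector: https://learn.microsoft.com/en-us/azure/data-factory/connector-google-cloud-storage?tabs=data-factory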

    Apache Airflow Microsoft Azure operators: https://airflow.apache.org/docs/apache-airflow-providers-microsoft-azure/stable/operators/index.html

    Google Transfer Operators: https://airflow.apache.org/docs/apache-airflow-providers-google/stable/operators/transfer/index.html

    Additional information: Copy data from Google Cloud Storage to Azure Storage by using AzCopy

    I have not reproduced the code below, so please treat it as untested. Give it a try and let me know the result.

    Here is an example of how such an operator would be used:
    
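    # Unverified: confirm that this module and operator exist in your installed
    # Airflow version and providers before relying on this import.
    from airflow.contrib.operators.gcs_to_azure_blob_storage_transfer_operator import GCSToAzureBlobStorageTransferOperator
    
    transfer_file = GCSToAzureBlobStorageTransferOperator(
        task_id='transfer_file',
        src_bucket='my-gcs-bucket',             # source GCS bucket
        src_object='path/to/my/file',           # object path inside the bucket
        dst_container='my-azure-container',     # destination Azure Blob container
        dst_blob='path/to/my/file',             # blob path inside the container
        azure_conn_id='my_azure_connection',
        google_cloud_storage_conn_id='my_gcs_connection',
    )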

    In this example, GCSToAzureBlobStorageTransferOperator transfers a file from the my-gcs-bucket bucket in GCS to the my-azure-container container in Azure Blob Storage. The src_object parameter specifies the path to the file in GCS, and the dst_blob parameter specifies the destination path in Azure Blob Storage.
    
    You will need to provide the appropriate connection IDs for the azure_conn_id and google_cloud_storage_conn_id parameters. These connections should be set up in the Airflow Connections UI and should contain the credentials needed to access GCS and Azure Blob Storage.
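
If a ready-made GCS-to-Azure operator is not available in your installation, one possible fallback is to combine the two provider hooks yourself. The following is a minimal, untested sketch under some assumptions: the helper name `copy_gcs_object_to_azure_blob` and the wrapper `transfer_file_task` are hypothetical, the connection IDs are the placeholder ones from the example above, and the object is staged through a local temporary file, so it suits small to medium files only. It relies on `GCSHook.download` and `WasbHook.load_file` from the Google and Microsoft Azure provider packages.

```python
import os
import tempfile


def copy_gcs_object_to_azure_blob(gcs_hook, wasb_hook, src_bucket, src_object,
                                  dst_container, dst_blob):
    """Download one GCS object to a temp file, then upload it as an Azure blob."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Stage the object locally; fall back to a fixed name if the path ends in "/".
        local_path = os.path.join(tmp_dir, os.path.basename(src_object) or "object")
        # GCSHook.download writes the object to `filename` when one is given.
        gcs_hook.download(bucket_name=src_bucket, object_name=src_object,
                          filename=local_path)
        # WasbHook.load_file uploads a local file to the given container and blob.
        wasb_hook.load_file(file_path=local_path, container_name=dst_container,
                            blob_name=dst_blob)
    return f"{dst_container}/{dst_blob}"


def transfer_file_task():
    """Hypothetical task callable; the values are placeholders from the example above."""
    # Imported here so the helper above can be exercised without Airflow installed.
    from airflow.providers.google.cloud.hooks.gcs import GCSHook
    from airflow.providers.microsoft.azure.hooks.wasb import WasbHook

    return copy_gcs_object_to_azure_blob(
        gcs_hook=GCSHook(gcp_conn_id="my_gcs_connection"),
        wasb_hook=WasbHook(wasb_conn_id="my_azure_connection"),
        src_bucket="my-gcs-bucket",
        src_object="path/to/my/file",
        dst_container="my-azure-container",
        dst_blob="path/to/my/file",
    )
```

In a DAG, `transfer_file_task` could be passed as the `python_callable` of a `PythonOperator`. Passing the hooks in as arguments keeps the copy logic separate from Airflow, which makes it easier to check with stand-in objects before running it against real storage.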
    

    If the issue persists, please let me know; I would be glad to work more closely with you on it.

    ---Please do not forget to "Accept the answer" and "up-vote" wherever the information provided helps you; this can be beneficial to other community members.

