Azure Data Factory: Copy from Blob to Blob Container with Hierarchical Namespace Fails with 409 Conflict

Noman Yaqub 0 Reputation points
2024-10-17T21:22:59.98+00:00

I am trying to copy data from one storage account to another. I have a pipeline with an activity that copies data from the source storage account to the sink. The pipeline runs successfully and I can see that the data has been copied.

However, as soon as I try to copy the data to a storage account with hierarchical namespace enabled, the same pipeline and activity fail.

Please note that hierarchical namespace is enabled on the source storage account in both cases. My activity definition is shown below.

{
    "source": {
        "type": "BinarySource",
        "storeSettings": {
            "type": "AzureBlobStorageReadSettings",
            "recursive": true,
            "deleteFilesAfterCompletion": false
        },
        "formatSettings": {
            "type": "BinaryReadSettings"
        }
    },
    "sink": {
        "type": "BinarySink",
        "storeSettings": {
            "type": "AzureBlobStorageWriteSettings",
            "copyBehavior": "PreserveHierarchy"
        }
    },
    "enableStaging": false,
    "preserve": [
        "Attributes"
    ]
}

The error response is shown below.

{
	"dataRead": 7622720,
	"dataWritten": 0,
	"filesRead": 60,
	"filesWritten": 0,
	"sourcePeakConnections": 64,
	"sinkPeakConnections": 11,
	"copyDuration": 67,
	"throughput": 586.363,
	"errors": [
		{
			"Code": 24116,
			"Message": "Failure happened on 'Sink' side. ErrorCode=AzureBlobWriteOperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=,Source=,''Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Blob write operation on path='database' and blobName='db/f1000' failed with error message: The remote server returned an error: (409) Conflict..,Source=Microsoft.DataTransfer.ClientLibrary,''Type=Microsoft.WindowsAzure.Storage.StorageException,Message=The remote server returned an error: (409) Conflict.,Source=Microsoft.WindowsAzure.Storage,StorageExtendedMessage=The requested operation is not allowed in the current state of the entity.\nRequestId:57c45a88-401e-007a-66d8-209b12000000\nTime:2024-10-17T21:09:27.2956505Z,,''Type=System.Net.WebException,Message=The remote server returned an error: (409) Conflict.,Source=System,''Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Blob write operation on path='database' and blobName='db/f1007' failed with error message: The remote server returned an error: (409) Conflict..,Source=Microsoft.DataTransfer.ClientLibrary,''Type=Microsoft.WindowsAzure.Storage.StorageException,Message=The remote server returned an error: (409) Conflict.,Source=Microsoft.WindowsAzure.Storage,StorageExtendedMessage=The requested operation is not allowed in the current state of the entity.\nRequestId:71039799-601e-000f-51d8-20f03e000000\nTime:2024-10-17T21:09:27.3130209Z,,''Type=System.Net.WebException,Message=The remote server returned an error: (409) Conflict.,Source=System,'",
			"EventType": 0,
			"Category": 5,
			"Data": {
				"FailureInitiator": "Sink"
			},
			"MsgId": null,
			"ExceptionType": null,
			"Source": null,
			"StackTrace": null,
			"InnerEventInfos": []
		}
	],
	"effectiveIntegrationRuntime": "my-integration-runtime (East US 2)",
	"usedDataIntegrationUnits": 4,
	"billingReference": {
		"activityType": "DataMovement",
		"billableDuration": [
			{
				"meterType": "ManagedVNetIR",
				"duration": 0.13333333333333333,
				"unit": "DIUHours"
			}
		],
		"totalBillableDuration": [
			{
				"meterType": "AzureIR",
				"duration": 0.13333333333333333,
				"unit": "DIUHours"
			}
		]
	},
	"usedParallelCopies": 32,
	"executionDetails": [
		{
			"source": {
				"type": "AzureBlobStorage"
			},
			"sink": {
				"type": "AzureBlobStorage"
			},
			"status": "Failed",
			"start": "10/17/2024, 11:08:20 PM",
			"duration": 67,
			"usedDataIntegrationUnits": 4,
			"usedParallelCopies": 32,
			"profile": {
				"queue": {
					"status": "Completed",
					"duration": 52
				},
				"transfer": {
					"status": "Completed",
					"duration": 13,
					"details": {
						"listingSource": {
							"type": "AzureBlobStorage",
							"workingDuration": 7
						},
						"readingFromSource": {
							"type": "AzureBlobStorage",
							"workingDuration": 0
						},
						"writingToSink": {
							"type": "AzureBlobStorage",
							"workingDuration": 0
						}
					}
				}
			},
			"detailedDurations": {
				"queuingDuration": 52,
				"transferDuration": 13
			}
		}
	],
	"dataConsistencyVerification": {
		"VerificationResult": "NotVerified"
	},
	"durationInQueue": {
		"integrationRuntimeQueue": 0
	}
}

I cannot understand why I am getting a 409. I don't have any other process besides ADF accessing the sink blob storage account. Am I missing some configuration? Thanks in advance.
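
The "Message" field in the error response above names each blob whose write failed, but the text is dense. As a minimal sketch (the function name `failed_blobs` and the shortened `sample` string are illustrative, not part of ADF), a regular expression modeled on that message text can list the affected path and blob names so they can be inspected on the sink:

```python
import re

# Pattern modeled on the wording of the error message in this thread;
# other ADF versions may word the message differently (assumption).
_FAILED_WRITE = re.compile(r"path='([^']*)' and blobName='([^']*)'")


def failed_blobs(message: str) -> list[tuple[str, str]]:
    """Return unique (path, blobName) pairs in order of first appearance."""
    seen = []
    for pair in _FAILED_WRITE.findall(message):
        if pair not in seen:
            seen.append(pair)
    return seen


# Shortened excerpt of the message from the error response (illustrative).
sample = (
    "Blob write operation on path='database' and blobName='db/f1000' failed "
    "with error message: The remote server returned an error: (409) Conflict.."
    "Blob write operation on path='database' and blobName='db/f1007' failed "
    "with error message: The remote server returned an error: (409) Conflict.."
)

print(failed_blobs(sample))
```

For the excerpt above this yields the two entries `db/f1000` and `db/f1007` under the path `database`.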

Azure Storage
Globally unique resources that provide access to data management services and serve as the parent namespace for the services.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

1 answer

  1. Chandra Boorla 14,585 Reputation points Microsoft External Staff Moderator
    2024-10-18T00:08:22.2733333+00:00

    Hi @Noman Yaqub

    Greetings & Welcome to Microsoft Q&A forum! Thanks for posting your query!

    The issue you are encountering with the 409 error when copying data to a storage account with hierarchical namespace enabled (also known as Azure Data Lake Storage Gen2) typically relates to conflicts or misconfigurations specific to the hierarchical namespace.

    To resolve the issue, try the following troubleshooting steps.

    Check for Existing Files or Directories: Manually inspect the destination storage account to ensure that no files or directories with the same names already exist. Use tools like Azure Storage Explorer or the Azure portal to navigate the destination path.

    Verify Permissions: Ensure that the Azure Data Factory managed identity or the service principal has the necessary permissions on the destination storage account. Specifically, it should have the Storage Blob Data Contributor or Storage Blob Data Owner role.

    Check Path Naming and Hierarchical Namespace Constraints: Ensure that the path and file names adhere to the conventions and constraints of hierarchical namespace-enabled storage accounts. Avoid using special characters that might not be supported.

    Review Activity and Integration Runtime Configuration: Ensure that the integration runtime (IR) being used is properly configured and has network access to both the source and destination storage accounts. Check if the activity is using the correct IR, especially if you have multiple integration runtimes.

    Retry the Operation: Sometimes transient issues can cause such errors.
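
The first check above (existing files or directories with the same names) can be sketched offline against a blob listing. The sketch below assumes that, with hierarchical namespace enabled, a path cannot be both a file and a directory, so a blob named `db` would clash with a blob named `db/f1000`. The function name `find_path_conflicts` and the sample `listing` are hypothetical, and the thread does not confirm that this is the cause of the 409 here.

```python
def find_path_conflicts(blob_names):
    """Return names that appear both as a blob and as a parent directory of another blob."""
    names = set(blob_names)
    conflicts = set()
    for name in names:
        parts = name.split("/")
        # Every proper prefix of the path is a directory this blob needs.
        for i in range(1, len(parts)):
            parent = "/".join(parts[:i])
            if parent in names:
                conflicts.add(parent)
    return sorted(conflicts)


# Hypothetical listing: the "db" entry is invented to illustrate a clash.
listing = ["db", "db/f1000", "db/f1007", "logs/a.txt"]

print(find_path_conflicts(listing))
```

An empty result means no name in the listing doubles as a directory for another entry; a non-empty result points at names worth inspecting in Azure Storage Explorer.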

    For additional information, please refer to this thread, which discusses a similar issue: https://learn.microsoft.com/en-us/answers/questions/794394/azcopy-failing-with-error-409-409-the-requested-op?orderBy=Helpful

    I hope this information helps. Please let us know if you have any further questions.

    Thank you.

