question

kkc8795 asked KranthiPakala-MSFT commented

Data Factory can't access Cosmos DB via private endpoint

I'm setting up Data Factory to move data from Cosmos DB to Blob storage using a data flow. For security reasons, Cosmos DB only allows access via private endpoints. A managed private endpoint has been created in Data Factory and approved for access to Cosmos DB. The connection works with "Test Connection", and I can preview data at each step of the data flow. However, when I debug or trigger the data flow activity in a pipeline, an error says the request is reaching Cosmos DB from a public IP and is blocked by the firewall. I don't understand why a public IP is used even though I'm using a private endpoint to access Cosmos DB. Interestingly, the issue only occurs with data flow: the pipeline works fine if I replace the data flow activity with a copy activity (using the same datasets). Someone posted a similar question on Stack Overflow: https://stackoverflow.com/questions/69630596/data-factory-endpoint-trouble

Any ideas what's going wrong?

azure-data-factory

Hello @kkc8795,

Thanks for the question and for using the MS Q&A platform.

Could you please share the complete error message that you're seeing? If possible, please also share the failed pipeline run ID and the failed activity run ID.

Also, could you please confirm whether you are using the default integration runtime? If so, then in order to use the managed VNet and private endpoint, please enable a managed VNet IR for the data flow and choose that managed VNet IR for the pipeline run. For more info, please refer to the doc: Managed virtual network & managed private endpoints - Azure Data Factory | Microsoft Docs.
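As a rough illustration of the check suggested above, here is a minimal Python sketch that looks at an integration runtime definition (as exported from the Data Factory code view) and reports whether it references a managed virtual network. The runtime name `ManagedVnetIR`, the sample JSON, and the helper `uses_managed_vnet` are assumptions made up for this example; the `managedVirtualNetwork` property is where such a reference is expected to appear, but please compare against your own exported definition.

```python
import json

# Hypothetical integration runtime definition, shaped like the JSON shown
# in the Data Factory code view. The name and values are illustrative only.
IR_JSON = """
{
  "name": "ManagedVnetIR",
  "properties": {
    "type": "Managed",
    "typeProperties": {
      "computeProperties": {"location": "AutoResolve"}
    },
    "managedVirtualNetwork": {
      "type": "ManagedVirtualNetworkReference",
      "referenceName": "default"
    }
  }
}
"""


def uses_managed_vnet(ir_definition: dict) -> bool:
    """Return True if the IR definition references a managed virtual network."""
    props = ir_definition.get("properties", {})
    vnet = props.get("managedVirtualNetwork")
    # A runtime without this block is assumed to run outside the managed VNet.
    return isinstance(vnet, dict) and bool(vnet.get("referenceName"))


if __name__ == "__main__":
    ir = json.loads(IR_JSON)
    print(ir["name"], "uses managed VNet:", uses_managed_vnet(ir))
```

Running the same helper on the definition of the default runtime should return `False` if that runtime was created without a managed virtual network.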

Hope this info helps. Let us know how it goes.

Thank you


kkc8795 replied to KranthiPakala-MSFT ·

199422-image.png



@KranthiPakala-MSFT I pasted the error above. The pipeline run ID is e2e8aafa-5302-465c-886d-09b01566b90e, and the activity run ID is a129ad7a-d6f4-404b-8f37-8239f2c1670b.
I created a new IR with Managed Virtual Network enabled, and this IR is selected to run the data flow. I verified that by checking Settings > Run on (Azure IR) after clicking on the Data flow activity.
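The UI check described above could also be done against the exported JSON. The sketch below is only an illustration under assumptions: the pipeline, activity, and linked service names are invented, and the property paths (`typeProperties.integrationRuntime` on an `ExecuteDataFlow` activity, `properties.connectVia` on a linked service) should be compared with your own definitions. It lists which runtime each top-level data flow activity references and which runtime a linked service connects through, since it may be worth confirming that both point at the managed VNet IR. Nested activities (for example inside a ForEach) are not traversed.

```python
from typing import Optional

# Hypothetical pipeline and linked service definitions; all names are made up.
PIPELINE = {
    "name": "CosmosToBlob",
    "properties": {
        "activities": [
            {
                "name": "MoveData",
                "type": "ExecuteDataFlow",
                "typeProperties": {
                    "integrationRuntime": {
                        "referenceName": "ManagedVnetIR",
                        "type": "IntegrationRuntimeReference",
                    }
                },
            }
        ]
    },
}

COSMOS_LINKED_SERVICE = {
    "name": "CosmosDbLinkedService",
    "properties": {"type": "CosmosDb"},  # no connectVia block in this sample
}


def data_flow_runtimes(pipeline: dict) -> dict:
    """Map each top-level Execute Data Flow activity to its IR name (or None)."""
    result = {}
    for activity in pipeline.get("properties", {}).get("activities", []):
        if activity.get("type") != "ExecuteDataFlow":
            continue
        ir_ref = activity.get("typeProperties", {}).get("integrationRuntime") or {}
        result[activity.get("name")] = ir_ref.get("referenceName")
    return result


def linked_service_runtime(linked_service: dict) -> Optional[str]:
    """Return the IR name in connectVia, or None if no runtime is named."""
    connect_via = linked_service.get("properties", {}).get("connectVia") or {}
    return connect_via.get("referenceName")


if __name__ == "__main__":
    print("Data flow activities:", data_flow_runtimes(PIPELINE))
    print("Linked service IR:", linked_service_runtime(COSMOS_LINKED_SERVICE))
```

A `None` result means the definition names no runtime explicitly, in which case the default runtime would presumably apply; that is the case to look at more closely.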

Any help is appreciated.


0 Answers