Data flow failed in prod but succeeded in dev after moving it with JSON code

Question


Karnati,Venkata Suchendra Reddy,IN-Bangalore 6

Hi team

Data flow purpose: flatten a JSON file that is stored in Data Lake Gen2, convert it to CSV, and push the result to Data Lake Gen2.

When I tested the data flow in dev it worked fine, but I cannot promote it to prod through Git because my ADF has no Git integration, so I moved it by copying the JSON code.

I have tested the data flow in dev and it works properly. In prod I recreated the source and sink datasets and copy-pasted the JSON code; after that I get the error shown in the attachment. How can I solve this as early as possible?

In dev I can preview the data as well; for your understanding I have attached a screenshot.

Kindly help to fix this as soon as possible. It is having a huge impact.

2 answers


Answer 1

Vaibhav Chaudhari 38,916 Volunteer Moderator

In the first screenshot, the error says that you have used a self-hosted IR in the linked services. Is that the case? Did you create these linked services by copying the JSON code from the Dev linked service? Can you paste the JSON code behind both linked services?

I guess you need to remove the self-hosted IR and use the Azure IR instead, then test the linked service connection.
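As a quick way to check which integration runtime a copied linked service points to, the sketch below reads the `connectVia` reference from a linked service definition. The linked service name `ls_adls_gen2`, the IR name `shir-dev`, and the URL are hypothetical placeholders, not values from this thread, and the helper `referenced_ir` is illustrative only. It assumes the usual ADF JSON shape, where an omitted `connectVia` means the default Azure IR (`AutoResolveIntegrationRuntime`) is used.

```python
import json

# Name of the default Azure IR in Azure Data Factory.
DEFAULT_AZURE_IR = "AutoResolveIntegrationRuntime"


def referenced_ir(linked_service: dict) -> str:
    """Return the IR name a linked service definition points to.

    Assumption: when `connectVia` is absent, the default Azure IR is used.
    """
    connect_via = linked_service.get("properties", {}).get("connectVia")
    if not connect_via:
        return DEFAULT_AZURE_IR
    return connect_via.get("referenceName", DEFAULT_AZURE_IR)


# Hypothetical linked service JSON, shaped like a definition copied from Dev.
copied_from_dev = json.loads("""
{
  "name": "ls_adls_gen2",
  "properties": {
    "type": "AzureBlobFS",
    "typeProperties": {"url": "https://example.dfs.core.windows.net"},
    "connectVia": {
      "referenceName": "shir-dev",
      "type": "IntegrationRuntimeReference"
    }
  }
}
""")

ir_name = referenced_ir(copied_from_dev)
needs_review = ir_name != DEFAULT_AZURE_IR
print(f"{copied_from_dev['name']} connects via {ir_name}; review needed: {needs_review}")
```

A non-default name only flags the linked service for review: it could also be a custom Azure IR, so the IR's own definition still has to be checked to see whether it is self-hosted.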

Karnati,Venkata Suchendra Reddy,IN-Bangalore 6 Reputation points

2022-02-07T09:56:22.01+00:00

Hi @Vaibhav Chaudhari

I created a self-hosted IR in dev and used that same IR in dev for the flatten data flow.

Can you tell me why it worked in dev, and why prod does not allow me to use the same self-hosted IR?

Vaibhav Chaudhari 38,916 Reputation points Volunteer Moderator

2022-02-07T10:22:26.187+00:00

In the Dev ADF, can you carefully cross-check whether the linked service being used really has the self-hosted IR?

Optionally, to get unblocked, you can try using the Azure IR in the Prod ADF and see whether the linked service connection succeeds and the dataset returns the data as expected.

Vaibhav Chaudhari 38,916 Reputation points Volunteer Moderator

2022-02-07T10:23:20.313+00:00

Also, can you clarify why a self-hosted IR was created and is being used to connect to ADLS Gen2 via the linked service?

Karnati,Venkata Suchendra Reddy,IN-Bangalore 6 Reputation points

2022-02-07T10:30:41.217+00:00

The self-hosted IR is used for ADLS to increase compute.

Karnati,Venkata Suchendra Reddy,IN-Bangalore 6 Reputation points

2022-02-07T10:32:25.55+00:00

In dev only the self-hosted IR is used, and it worked.

I am now checking with the auto-resolve IR in prod; the test connection succeeded. I still need to validate the data.

Can you confirm whether any issues will come up in prod in the future if I use the auto-resolve IR?

Vaibhav Chaudhari 38,916 Reputation points Volunteer Moderator

2022-02-07T11:03:34.987+00:00

There won't be any issue if you use Azure IR to connect to ADLS Gen2.

In fact, the documentation and the thread below say that a linked service that uses a self-hosted IR can't be used in a data flow.

https://learn.microsoft.com/en-us/azure/data-factory/concepts-integration-runtime#integration-runtime-types

https://stackoverflow.com/questions/64105436/linked-service-with-self-hosted-integration-runtime-is-not-supported-in-data-flo

Answer 2

Hello @Karnati,Venkata Suchendra Reddy,IN-Bangalore ,

Thanks for the question and for using the MS Q&A platform.

My understanding is that you are receiving the above validation errors while using a SHIR in the linked service of your Mapping Data Flow dataset. Please correct me if I'm wrong.

As called out by @Vaibhav Chaudhari, Self-Hosted Integration Runtime is currently not supported in Mapping Data Flows, which is why you are receiving the errors mentioned in your post.

To overcome this issue, you will need to replace the SHIR with Azure IR in your mapping data flows.

The strange thing I would like to reconfirm concerns your DEV Data Factory:

Could you please reconfirm whether you are using a dataset with SHIR in your Mapping Data Flow activity?
If yes, could you share a screenshot of the dataset with SHIR configured and of the mapping data flow in which that dataset (using SHIR) is referenced, so that I can pass this information to the respective product owners?

The reason I would like clarification on this is that the public documentation clearly states that Data Flow activities are executed on their associated Azure integration runtime. The Spark compute utilized by Data Flows is determined by the data flow properties in your Azure IR and is fully managed by the service.

As called out by Vaibhav, the Integration Runtime documentation also calls out the same.

But to unblock, you will have to use the Azure IR, and there should not be any issue in using it, as it is the recommended IR for Mapping Data Flows.
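For a linked service that was recreated in prod by pasting JSON, one way to switch it to the Azure IR is to drop the `connectVia` block before pasting. This is a minimal sketch, assuming that a linked service without `connectVia` falls back to the default Azure IR; the helper `use_default_azure_ir` and the names `ls_adls_gen2` and `shir-dev` are hypothetical, not taken from the poster's factory.

```python
import copy
import json


def use_default_azure_ir(linked_service: dict) -> dict:
    """Return a copy of the definition without its `connectVia` reference.

    Assumption: with no `connectVia`, the default Azure IR is used.
    """
    updated = copy.deepcopy(linked_service)
    updated.get("properties", {}).pop("connectVia", None)
    return updated


# Hypothetical definition copied from Dev, still pointing at a self-hosted IR.
dev_definition = {
    "name": "ls_adls_gen2",
    "properties": {
        "type": "AzureBlobFS",
        "typeProperties": {"url": "https://example.dfs.core.windows.net"},
        "connectVia": {
            "referenceName": "shir-dev",
            "type": "IntegrationRuntimeReference",
        },
    },
}

prod_definition = use_default_azure_ir(dev_definition)
print(json.dumps(prod_definition, indent=2))
```

The function works on a deep copy, so the Dev definition stays untouched. After pasting the result in prod, the linked service connection still needs to be tested there, as suggested above.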

Hope this helps. Please let us know if you have any further queries.
