Pipeline fails on the sink side of a Copy Data activity writing to Parquet

Ruben Dario Reyes Monsalve 41 Reputation points
2021-10-07T20:17:08.02+00:00

I have an issue with a pipeline that works just fine in a different ADF (TEST env) with the same resources (storage account, linked server and integration runtime). In PREPROD, where the only difference is that the IR is shared from the TEST env, it fails with the following error:

Operation on target extract_delta_updates_from_source failed: Failure happened on 'Sink' side. ErrorCode=JreNotFound,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Java Runtime Environment cannot be found on the Self-hosted Integration Runtime machine. It is required for parsing or writing to Parquet/ORC files. Make sure Java Runtime Environment has been installed on the Self-hosted Integration Runtime machine.,Source=Microsoft.DataTransfer.Common,''Type=System.DllNotFoundException,Message=Unable to load DLL 'jvm.dll': The specified module could not be found. (Exception from HRESULT: 0x8007007E),Source=Microsoft.DataTransfer.Richfile.HiveOrcBridge,'

Even though it works in the TEST environment, we double-checked the minimum requirements:

• Confirm that your copy activity is using the correct integration runtime in ADF.
• Double-check that the IR and JRE match bit-wise (e.g., both 64-bit).
• Check that JAVA_HOME is set correctly in the environment variables.
• Check the registry key: HKEY_LOCAL_MACHINE\Software\JavaSoft\Java Runtime Environment should have a Current Version entry that shows the current JRE version.

Any ideas what the issue could be?

Tags: Azure Storage, Azure Blob Storage, Azure Data Factory

Accepted answer
  1. KranthiPakala-MSFT 46,642 Reputation points Microsoft Employee Moderator
    2021-10-11T01:05:13.263+00:00

    Hi @Ruben Dario Reyes Monsalve ,

    Thanks for using Microsoft Q&A forum and posting your query here.

    Usually this error occurs when the Java Runtime Environment is not correctly installed on the referenced self-hosted IR machine. If your Self-hosted Integration Runtime has multiple nodes, I recommend verifying that the JRE is installed correctly on every SHIR machine and that the environment variable is registered correctly. See the additional information below for details:

    For copy empowered by Self-hosted Integration Runtime e.g. between on-premises and cloud data stores, if you are not copying Parquet files as-is, you need to install the 64-bit JRE 8 (Java Runtime Environment) or OpenJDK and Microsoft Visual C++ 2010 Redistributable Package on your IR machine.
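One way to confirm that a given jvm.dll really is 64-bit is to read the machine field of its PE header. The following is a minimal sketch, not part of any SHIR tooling: the function name `pe_machine` is hypothetical, and the path you pass in depends on where Java is installed on your node.

```python
import struct

# Machine field values defined by the PE/COFF format.
MACHINE_TYPES = {
    0x014C: "x86 (32-bit)",
    0x8664: "x64 (64-bit)",
    0xAA64: "ARM64",
}


def pe_machine(path):
    """Return the target architecture recorded in a PE file (DLL or EXE)."""
    with open(path, "rb") as f:
        dos_header = f.read(64)
        if len(dos_header) < 64 or dos_header[:2] != b"MZ":
            raise ValueError("not a PE file: missing MZ header")
        # Offset 0x3C of the DOS header holds the file offset of the PE header.
        (pe_offset,) = struct.unpack_from("<I", dos_header, 0x3C)
        f.seek(pe_offset)
        pe_header = f.read(6)
    if len(pe_header) < 6 or pe_header[:4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The 2-byte machine field follows the 4-byte signature.
    (machine,) = struct.unpack_from("<H", pe_header, 4)
    return MACHINE_TYPES.get(machine, hex(machine))
```

Calling `pe_machine` on the jvm.dll found under your Java home should report a 64-bit architecture when the JRE matches a 64-bit IR.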

    For copy running on Self-hosted IR with Parquet file serialization/deserialization, the service locates the Java runtime by first checking the registry (SOFTWARE\JavaSoft\Java Runtime Environment\{Current Version}\JavaHome) for a JRE; if none is found, it then checks the system variable JAVA_HOME for OpenJDK.

    • To use JRE: The 64-bit IR requires 64-bit JRE. You can find it from here.
    • To use OpenJDK: It's supported since IR version 3.13. Package the jvm.dll with all other required assemblies of OpenJDK onto the Self-hosted IR machine, and set the system environment variable JAVA_HOME accordingly.
    • To install Visual C++ 2010 Redistributable Package: Visual C++ 2010 Redistributable Package is not installed with self-hosted IR installations. You can find it from here.
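The lookup order described above (registry first, then JAVA_HOME) can be mimicked on a node to see which Java home would likely be picked up. This is only a sketch of that order, not the service's actual code. The function names are hypothetical, and the registry value name `CurrentVersion` is an assumption about how the JRE installer records the current version.

```python
import os

# Registry key named in the answer (under HKEY_LOCAL_MACHINE).
JRE_KEY = r"SOFTWARE\JavaSoft\Java Runtime Environment"


def read_jre_home_from_registry():
    """Return the JavaHome value for the current JRE, or None if absent.

    Windows-only; winreg is imported lazily so the module loads elsewhere.
    """
    try:
        import winreg
    except ImportError:
        return None
    # Read the 64-bit registry view even from a 32-bit Python process.
    access = winreg.KEY_READ | winreg.KEY_WOW64_64KEY
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, JRE_KEY, 0, access) as key:
            # Assumption: the value holding the version is named "CurrentVersion".
            version, _ = winreg.QueryValueEx(key, "CurrentVersion")
        sub_key = JRE_KEY + "\\" + str(version)
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, sub_key, 0, access) as key:
            java_home, _ = winreg.QueryValueEx(key, "JavaHome")
        return java_home or None
    except OSError:
        return None


def resolve_java_home(registry_lookup=read_jre_home_from_registry, env=None):
    """Return (source, path) following the documented order, or None."""
    env = os.environ if env is None else env
    from_registry = registry_lookup()
    if from_registry:
        return ("registry", from_registry)
    from_env = env.get("JAVA_HOME")
    if from_env:
        return ("JAVA_HOME", from_env)
    return None


if __name__ == "__main__":
    print(resolve_java_home())
```

Note that JAVA_HOME must be a system variable to be visible to the IR service; running this script as a regular user may also see user-level variables, so treat the result as a hint rather than proof.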

    Please make sure all the prerequisites are met on every machine where the SHIR is installed (that is, on every SHIR node).
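Because the inner exception is "Unable to load DLL 'jvm.dll'", a quick per-node check is to confirm that a jvm.dll actually exists somewhere under the Java home on each SHIR machine. A minimal sketch, assuming you already know the Java home path on the node; the function name `find_jvm_dll` is hypothetical.

```python
from pathlib import Path


def find_jvm_dll(java_home):
    """Return every jvm.dll found under java_home, sorted; empty if none."""
    root = Path(java_home)
    if not root.is_dir():
        return []
    # Search recursively because the subfolder differs between JRE and JDK layouts.
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.name.lower() == "jvm.dll"
    )
```

An empty result on any node means that node cannot load the JVM from that location, even if the other nodes are configured correctly.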

    Hope this info helps. Do let us know if you have any further queries.
