Self-hosted integration runtime consuming memory and not releasing it

Anonymous
2022-07-20T20:42:41.807+00:00

The self-hosted integration runtime (SHIR) that we are using for a copy activity is consuming all of the memory on the machine where the IR is hosted, and it is not releasing it. I was not aware that it would consume the machine's memory; doesn't it use only CPU? Can you please confirm whether the SHIR uses memory and doesn't release it?

Azure Data Factory

2 answers

  1. BhargavaGunnam-MSFT 26,911 Reputation points Microsoft Employee
    2022-07-21T03:49:47.947+00:00

    Hello @Anonymous ,

    Welcome to the Microsoft Q&A platform, and thanks for posting your query.

    Are you seeing any errors when you experience the high memory usage? Here are a few things I would like you to check on your end.

    If your memory and CPU usage are consistently high, it could be because too many pipelines are running on the self-hosted IR. If you only see momentary spikes in memory usage, they could be due to a large volume of concurrent activity runs.

    1) Check the 'limit concurrent jobs' setting; the maximum limit is 96 per node. (A programmatic way to check points 1 and 5 is sketched after this list.)
    2) Check the network throughput.
    3) Check that you have the latest version of the self-hosted IR installed.
    4) If you are running the self-hosted IR on a single node, consider adding a second node.
    5) Check the resource usage and concurrent activity execution on the IR node. Adjust the interval and trigger times of activity runs to avoid too many executions landing on a single IR node at the same time.
    6) Look into adding more memory.
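
    For points 1 and 5, here is a minimal sketch of how to read per-node usage and adjust the limit with the Azure SDK for Python (the azure-mgmt-datafactory and azure-identity packages); the resource names and the new limit of 8 below are placeholders, not recommendations:

    ```python
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import UpdateIntegrationRuntimeNodeRequest

    SUBSCRIPTION_ID = "<subscription-id>"   # placeholder
    RESOURCE_GROUP = "<resource-group>"     # placeholder
    FACTORY_NAME = "<data-factory-name>"    # placeholder
    IR_NAME = "<self-hosted-ir-name>"       # placeholder

    client = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

    # Point 5: per-node resource usage and concurrent activity execution.
    mon = client.integration_runtimes.get_monitoring_data(
        RESOURCE_GROUP, FACTORY_NAME, IR_NAME)
    for node in mon.nodes or []:
        print(node.node_name,
              node.available_memory_in_mb, node.cpu_utilization,
              node.concurrent_jobs_running, node.concurrent_jobs_limit)

    # Point 1: lower the concurrent jobs limit on a busy node ("Node_1" is a
    # placeholder for a node name shown in the monitoring output above).
    client.integration_runtime_nodes.update(
        RESOURCE_GROUP, FACTORY_NAME, IR_NAME, "Node_1",
        UpdateIntegrationRuntimeNodeRequest(concurrent_jobs_limit=8),
    )
    ```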

    Here is the general troubleshooting document for the self-hosted integration runtime.

    When setting the concurrent jobs value, you will need to consider the number of CPU cores and the amount of RAM on the machine; with more cores and more memory, you can set this value higher. You can also scale out by increasing the number of nodes, in which case the concurrent jobs limit becomes the sum across all available nodes.

    For example: if you add 3 nodes to your SHIR and set 'limit concurrent jobs' to 10 on each node, you get a maximum of 30 concurrent jobs.
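
    As a quick sanity check, the arithmetic can be written out in Python; the sizing heuristic below is purely illustrative (an assumption on my part, not an official ADF formula):

    ```python
    # The per-node 'limit concurrent jobs' values simply add up across nodes.
    def aggregate_concurrent_jobs(nodes: int, per_node_limit: int) -> int:
        return nodes * per_node_limit

    # The example above: 3 nodes, each with 'limit concurrent jobs' = 10.
    assert aggregate_concurrent_jobs(nodes=3, per_node_limit=10) == 30

    # Illustrative heuristic only (an assumption, not an official rule):
    # scale with CPU cores, keep headroom in RAM, and never exceed the
    # per-node maximum of 96.
    def suggest_per_node_limit(cpu_cores: int, ram_gb: int) -> int:
        return min(96, max(1, min(cpu_cores * 2, ram_gb // 2)))
    ```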

    There is no specific recommended number here; it all depends on your workload and the CPU/memory of your nodes.

    One more advantage of adding another node is high availability.

    Please let me know if you have any further questions.

    3 people found this answer helpful.

  2. Florent Pousserot 6 Reputation points
    2023-01-12T23:34:39.62+00:00

    On an Azure Data Factory pipeline, we get this error:

    Error: A failure occurred on the 'Source' side. ErrorCode=SystemErrorOutOfMemory,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=A task failed due to insufficient memory

    On the host of the self-hosted integration runtime, this event is logged: Microsoft-Windows-Resource-Exhaustion-Detector

    Windows successfully diagnosed a low virtual memory condition. The following programs consumed the most virtual memory: diawp.exe (9352) consumed 2760433664 bytes, diawp.exe (12808) consumed 2667466752 bytes, and diawp.exe (3468) consumed 2654556160 bytes.

    The ADF pipeline has a copy activity running in a loop over several small files (3 MB max).

    If we check memory usage on the host, physical memory (working set) is at 50%, while virtual memory (commit charge) is at 100%.
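
    To see that gap per process, here is a small sketch using the psutil package (an assumption, not something ADF ships) to compare the two numbers for the diawp.exe workers named in the event above; on Windows, psutil reports the working set as rss and the pagefile-backed commit as vms:

    ```python
    import psutil

    # Compare working set vs. commit charge for the SHIR worker processes.
    # On Windows, psutil maps rss -> WorkingSetSize and vms -> PagefileUsage.
    for proc in psutil.process_iter(["name", "memory_info"]):
        if (proc.info["name"] or "").lower() == "diawp.exe":
            mem = proc.info["memory_info"]
            print(f"pid={proc.pid} "
                  f"working_set={mem.rss / 2**20:.0f} MiB "
                  f"commit={mem.vms / 2**20:.0f} MiB")
    ```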

    The Azure integration runtime monitoring does not show this load; it only shows a drop from 18 to 16 GB... :(


    Why is the commit charge higher than the working set?

    Why does the process need this much memory, and why doesn't it release it more frequently?

    Is it possible to configure the commit/working set ratio?

    Is it possible to set a maximum in order to make the processing more stable?

    What is the recommended page file configuration? (Here it is managed by the system on the C: partition.)

