Hi @Vinodh247
Thanks for posting your query!
As I understand that you are experiencing random timeouts in a Lookup Activity within Azure Data Factory when querying an on-prem SQL database.
It appears that the issue might not be related to Integration Runtime (IR) resource constraints, given the low CPU utilization. Instead, the random timeouts could be due to network instability, database performance issues, or execution plan changes.
Query Analysis
Success Query - SELECT [CubeList].[CubeName] ... WHERE [CubeList].[Available] = 1. Timeout: 02:00:00 (2 hours)
Failure Query - SELECT [CubeList].[CubeName] ... WHERE [CubeList].[Available] = 1. Same structure, but potentially disturbed by underlying causes (as details on specific errors are missing).
Performance Observations
Integration Runtime - Type: Self-hosted, Memory: 38GB, CPU Utilization: Below 4% - implies no CPU bottlenecks.
Execution Metrics - Duration for successful activity, approximately 0.016 hours. Duration for failed activity, not specified, but indicates it may have exceeded the timeout setting.
Here are few troubleshooting steps that might help you:
Check SQL Server Performance & Locking - Run the same query manually in SSMS and monitor execution time. Use sp_who2 or sys.dm_exec_requests to check for blocking queries during failures. Review the SQL Execution Plan to ensure no sudden changes are causing delays.
Enable Detailed Logging -
- In ADF - Enable diagnostic logs and check activity run history.
- In SQL Server - Use Extended Events or SQL Profiler to capture slow queries.
Monitor Network Stability - Run a continuous ping test from the IR machine to the SQL Server to check for packet loss. If using a VPN or firewall, check logs for intermittent disconnections.
Increase Timeout & Optimize Query - Try temporarily increasing the timeout in the Lookup activity. Ensure indexes exist for queried columns and run UPDATE STATISTICS to refresh them.
Check IR Logs for Failures - Look at Self-hosted IR logs to see if there are connection retries or timeout-related errors. If possible, add an additional IR node to test if the issue persists.
I hope this information helps. Please do let us know if you have any further queries.
Kindly consider upvoting the comment if the information provided is helpful. This can assist other community members in resolving similar issues.
Thank you.