Hi Jing Ou,
Thank you for reaching out to Microsoft Q&A. This issue has been resolved.
Starting December 21, transient GetTokenCredential authentication failures in Azure Data Factory (ADF) caused long-running Scope jobs to retry and experience extended execution times. Although the jobs were successfully submitted to ADLA and continued running, ADF failed during token refresh operations. These failures were caused by a regression, combined with VM unresponsiveness and certificate retrieval issues in the Scope pools.
The Batch team reverted the changes that introduced the regression, and no job run failures have been observed since January 3. The issue is now mitigated. A small ADF deployment is planned to ensure the fix is consistently applied across environments. In rare cases, clusters that were already running at the time of the fix may require a restart to fully benefit from the resolution.
Please let us know if you observe any further issues.