One of my WebJobs keeps failing with exit code 1, but no other errors in the log. All the others work fine.

Piercarlo Serena 20 Reputation points
2025-04-02T14:16:26.56+00:00

I have an Azure App Service (Web App - Windows), with a bunch of triggered WebJobs, all written in Node.js. The App Service has 3 deployment slots - 1 for production and 2 for testing/staging purposes.

All of them are working as expected, but for some time one in particular, on the staging slot, has kept failing.

No execution errors, just:

[04/02/2025 14:01:09 > a6c961: SYS INFO] Status changed to Failed
[04/02/2025 14:01:09 > a6c961: SYS ERR ] Job failed due to exit code 1

All the others are still working fine, even the same job on the other two slots.

I tried both scaling up and scaling out the App Service, since at one point during this job's execution CPU usage reaches close to 100%, but with no success.

It is not a timeout error, since the job fails after about 2 minutes (it usually takes 3-4 minutes).

If I run the code locally on my computer, it works as expected, so it cannot be an error at the code level.

Any help is appreciated. Thanks


Accepted answer
  1. Siva Nair 2,420 Reputation points Microsoft External Staff Moderator
    2025-04-04T06:48:46.2633333+00:00

    Hi Piercarlo Serena,

    Thanks for sharing the details below; they will really help other community members. I am including them here so that this answer gives a complete picture of the issue:

    "Cause: it was caused by an updated version of Pino (only on staging, for the moment), our logging library. The process was exiting without flushing the transport.

    Solution: deactivate Pino on production, since it's not even used on the WebJobs dashboard."

    Below are the troubleshooting steps:

    This might be an issue with the execution environment of your WebJob in the staging slot, for example an incorrect Node.js version, missing dependencies, file encoding issues, or stale deployment artifacts.

    Please ensure the Node.js version in your staging slot matches the one in production. Open Kudu Console (https://your-staging-slot.scm.azurewebsites.net) and run node -v. If the version differs from production, enforce the correct one in package.json by adding:

    "engines": {
      "node": "18.x"
    }
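
    You can also have the job itself confirm which runtime it actually executes under; a minimal sketch to drop at the top of the entry point:

    // Log the Node.js version and platform the WebJob actually runs
    // under, so the staging and production slots can be compared
    // directly from the job logs.
    console.log('Node version:', process.version, 'platform:', process.platform);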
    

    Next, manually run the WebJob to capture errors. In the Kudu Debug Console, navigate to the WebJob directory using cd D:\home\site\wwwroot\App_Data\jobs\triggered\YourJobName and execute node index.js. This should reveal any runtime errors that aren't visible in the WebJob logs.

    To get more details on failures, modify your index.js script to log uncaught exceptions and unhandled promise rejections:

    // Log any synchronous error that would otherwise kill the process
    // silently, then exit non-zero so the WebJob is marked as failed.
    process.on('uncaughtException', (err) => {
        console.error('Uncaught Exception:', err);
        process.exit(1);
    });

    // Do the same for promise rejections that are never handled.
    process.on('unhandledRejection', (reason, promise) => {
        console.error('Unhandled Rejection at:', promise, 'reason:', reason);
        process.exit(1);
    });
    

    Restart the WebJob and check the logs again to see if this surfaces any useful error messages.

    Another common issue is missing environment variables. Run set in the Kudu console (CMD) and compare the output with the production slot. If any variables are missing, add them via Azure Portal → Configuration. Similarly, ensure that all dependencies are installed correctly: in Kudu, navigate to the WebJob directory and run npm install --production to reinstall missing or corrupted modules.
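
    To make that comparison repeatable, here is a hedged sketch (the variable names below are examples, not settings from your app) that logs slot-identifying values at startup, so staging and production runs can be diffed straight from the WebJob logs:

    // Hypothetical startup sanity check for the WebJob entry point.
    // Adjust keysToCheck to the app settings your job actually depends on.
    const keysToCheck = ['WEBSITE_SITE_NAME', 'WEBSITE_SLOT_NAME', 'NODE_ENV'];
    for (const key of keysToCheck) {
        console.log(`${key} = ${process.env[key] ?? '(not set)'}`);
    }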

    If the WebJob still fails, consider redeploying it. Delete the WebJob directory in Kudu (for example with rmdir /s /q D:\home\site\wwwroot\App_Data\jobs\triggered\YourJobName from the CMD console), then redeploy the WebJob manually or trigger a fresh deployment via CI/CD.

    Lastly, check if the WebJob is consuming excessive CPU or memory resources. If CPU usage reaches near 100%, visit Azure App Service → Diagnose and Solve Problems to monitor resource utilization. Scaling up (higher pricing tier) or out (more instances) may help in such cases.

    If the issue persists, enable Application Logging in the Azure Portal and analyze the logs under D:\home\LogFiles\Application\ for further details.

    If you need any further assistance, do let me know.

    If the answer is helpful, please click Accept Answer and kindly upvote it so that other people who face a similar issue may benefit from it.


1 additional answer

  1. Piercarlo Serena 20 Reputation points
    2025-04-04T09:18:56.92+00:00

    Ok I found the cause and the solution.

    Cause: it was caused by an updated version of Pino (only on staging, for the moment), our logging library. The process was exiting without flushing the transport.

    Solution: deactivate Pino on production, since it's not even used on the WebJobs dashboard.

    That also explains why it was working on the production slot. It still doesn't explain why only this job is affected while the others are not, but so be it.
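
    If keeping Pino enabled for the WebJobs ever becomes preferable, a possible alternative (a sketch, assuming Pino v7+ and that the unflushed async transport described above is the culprit) is to write through a synchronous destination, so a short-lived process cannot exit before the transport has flushed:

    // Sketch: log synchronously to stdout (fd 1) so nothing is left
    // buffered in a worker-thread transport when the process exits.
    const pino = require('pino');
    const logger = pino(pino.destination({ dest: 1, sync: true }));

    logger.info('job started');
    // ... job work ...
    logger.info('job finished'); // nothing pending to flush on exit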

    Thanks for pointing out where and how to dig deeper into this!

