One of my WebJobs keeps failing with exit code 1, but no other errors in the log. All the others work fine.

Piercarlo Serena 20 Reputation points
2025-04-02T14:16:26.56+00:00

I have an Azure App Service (Web App - Windows), with a bunch of triggered WebJobs, all written in Node.js. The App Service has 3 deployment slots - 1 for production and 2 for testing/staging purposes.

All of them are working as expected, but for some time one in particular, on the staging slot, has kept failing.

No execution errors, just:

[04/02/2025 14:01:09 > a6c961: SYS INFO] Status changed to Failed
[04/02/2025 14:01:09 > a6c961: SYS ERR ] Job failed due to exit code 1

All the others are still working fine, even the same job on the other two slots.

I tried both scaling up and scaling out the App Service, since at one point during this job's execution CPU usage reaches close to 100%, but with no success.

It is not a timeout error, since the job fails after about 2 minutes (it usually takes 3-4 minutes).

If I run the code locally on my computer, it works as expected, so it cannot be an error at the code level.

Any help is appreciated. Thanks


Accepted answer
  1. Siva Nair 2,420 Reputation points Microsoft External Staff Moderator
    2025-04-04T06:48:46.2633333+00:00

    Hi Piercarlo Serena,

    Thanks for sharing the details below; they will really help other community members. I am including them here so that this answer gives a complete picture of the issue:

    "Cause: it was caused by an updated version of Pino (only on staging, for the moment), our logging library. The process was exiting without flushing the transport.

    Solution: deactivate Pino on production, since it's not even used on the WebJobs dashboard."

    Below are the troubleshooting steps:

    This might be an issue with the execution environment of your WebJob in the staging slot, for example an incorrect Node.js version, missing dependencies, file encoding issues, or stale deployment artifacts.

    Please ensure the Node.js version in your staging slot matches the one in production. Open Kudu Console (https://your-staging-slot.scm.azurewebsites.net) and run node -v. If the version differs from production, enforce the correct one in package.json by adding:

    "engines": {
      "node": "18.x"
    }
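
    You can also have the job itself confirm which runtime it actually executes under; a minimal sketch to drop at the top of the entry point:

    // Log the Node.js version and platform the WebJob actually runs
    // under, so the staging and production slots can be compared
    // directly from the job logs.
    console.log('Node version:', process.version, 'platform:', process.platform);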
    

    Next, manually run the WebJob to capture errors. In the Kudu Debug Console, navigate to the WebJob directory using cd D:\home\site\wwwroot\App_Data\jobs\triggered\YourJobName and execute node index.js. This should reveal any runtime errors that aren't visible in the WebJob logs.

    To get more details on failures, modify your index.js script to log uncaught exceptions and unhandled promise rejections:

    // Log any synchronous error that would otherwise kill the process
    // silently, then exit non-zero so the WebJob is marked as failed.
    process.on('uncaughtException', (err) => {
        console.error('Uncaught Exception:', err);
        process.exit(1);
    });

    // Do the same for promise rejections that are never handled.
    process.on('unhandledRejection', (reason, promise) => {
        console.error('Unhandled Rejection at:', promise, 'reason:', reason);
        process.exit(1);
    });
    

    Restart the WebJob and check the logs again to see if this surfaces any useful error messages.

    Another common issue is missing environment variables. Run set in the Kudu console (CMD) and compare the output with the production slot. If any variables are missing, add them via Azure Portal → Configuration. Similarly, ensure that all dependencies are installed correctly: in Kudu, navigate to the WebJob directory and run npm install --production to reinstall missing or corrupted modules.
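
    To make that comparison repeatable, here is a hedged sketch (the variable names below are examples, not settings from your app) that logs slot-identifying values at startup, so staging and production runs can be diffed straight from the WebJob logs:

    // Hypothetical startup sanity check for the WebJob entry point.
    // Adjust keysToCheck to the app settings your job actually depends on.
    const keysToCheck = ['WEBSITE_SITE_NAME', 'WEBSITE_SLOT_NAME', 'NODE_ENV'];
    for (const key of keysToCheck) {
        console.log(`${key} = ${process.env[key] ?? '(not set)'}`);
    }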

    If the WebJob still fails, consider redeploying it. Delete the WebJob directory in Kudu (for example with rmdir /s /q D:\home\site\wwwroot\App_Data\jobs\triggered\YourJobName from the CMD console), then redeploy the WebJob manually or trigger a fresh deployment via CI/CD.

    Lastly, check if the WebJob is consuming excessive CPU or memory resources. If CPU usage reaches near 100%, visit Azure App Service → Diagnose and Solve Problems to monitor resource utilization. Scaling up (higher pricing tier) or out (more instances) may help in such cases.

    If the issue persists, enable Application Logging in the Azure Portal and analyze the logs under D:\home\LogFiles\Application\ for further details.

    If you need any further assistance, do let me know.

    If the answer is helpful, please click Accept Answer and kindly upvote it so that other people who face a similar issue may benefit from it.


1 additional answer

  1. Piercarlo Serena 20 Reputation points
    2025-04-04T09:18:56.92+00:00

    Ok I found the cause and the solution.

    Cause: it was caused by an updated version of Pino (only on staging, for the moment), our logging library. The process was exiting without flushing the transport.

    Solution: deactivate Pino on production, since it's not even used on the WebJobs dashboard.

    That also explains why it was working on the production slot. It still doesn't explain why only this job is affected while the others are not, but so be it.
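
    If keeping Pino enabled for the WebJobs ever becomes preferable, a possible alternative (a sketch, assuming Pino v7+ and that the unflushed async transport described above is the culprit) is to write through a synchronous destination, so a short-lived process cannot exit before the transport has flushed:

    // Sketch: log synchronously to stdout (fd 1) so nothing is left
    // buffered in a worker-thread transport when the process exits.
    const pino = require('pino');
    const logger = pino(pino.destination({ dest: 1, sync: true }));

    logger.info('job started');
    // ... job work ...
    logger.info('job finished'); // nothing pending to flush on exit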

    Thanks for pointing out where and how to dig deeper into this!

