CCP_RESTART on HPC with Linux workstations

Sven 21 Reputation points
2021-10-08T13:41:50.95+00:00

Hi,

Using CCP_RESTART with Windows-based machines on an HPC cluster seems to work fine.
Now that I need to include Linux machines, that setting doesn't do the job.
Working around it with a "sudo reboot" command just makes the machine reboot 3 times.
Afterwards the task is in the "Failed" state.

So the question is: will the reboot/restart feature also be supported for Linux machines soon?
Or, as a workaround: how can I limit the automatic task requeue? Setting "AutoRequeueCount" to e.g. 0 or 1
doesn't seem to make the machine reboot only once.

BR Sven

Azure Virtual Machines
An Azure service that is used to provision Windows and Linux virtual machines.

Answer accepted by question author
  1. vipullag-MSFT 26,522 Reputation points Moderator
    2021-10-14T13:19:06.757+00:00

    @Sven

    Apologies for the delayed response on this.

    CCP_RESTART is supported only on Windows nodes. There is no recent update to support this on Linux nodes.

    The cluster configuration has settings for how many times a job or task is retried after a system failure (3 by default), which matches the three reboots you observed.

    [Image: 140641-image.png]
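As a possible workaround for "reboot only once", the task itself can remember that it has already rebooted the node. The following is a minimal sketch, not an HPC Pack feature: it assumes the scheduler requeues the task after the node goes down (as the three reboots in the question suggest), that the task user can run `sudo reboot` without a password, and that `/var/tmp` survives a reboot. The marker path and the function names are hypothetical.

```python
import os
import subprocess
import time

# Hypothetical marker location. /var/tmp normally persists across reboots,
# whereas /tmp may be cleared.
MARKER = "/var/tmp/hpc_task_reboot_done"


def should_reboot(marker_path):
    """Return True only on the first call for a given marker path.

    The first call creates the marker file and returns True; later calls
    find the marker and return False.
    """
    try:
        # O_EXCL makes the creation fail if the marker already exists.
        fd = os.open(marker_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def run_once_reboot(marker_path=MARKER):
    """Reboot on the first run; on the requeued run, clean up and continue."""
    if should_reboot(marker_path):
        # First run: ask for a reboot (assumes passwordless sudo).
        subprocess.run(["sudo", "reboot"], check=True)
        # Block until the shutdown kills this process, so the task does not
        # report success before the node actually goes down.
        time.sleep(3600)
        return
    # Requeued run after the reboot: remove the marker and carry on.
    os.remove(marker_path)
    print("Reboot already done, continuing with the task.")
```

Calling `run_once_reboot()` at the start of the Linux task would then reboot the node on the first attempt and fall through on the retry, provided the retry count allows at least one requeue.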

    Hope this helps.

