General Question about HA / Site Resilience for SCORCH Deployment

OdgeUK1 1 Reputation point
2021-05-05T16:06:48.44+00:00

Hi All

Experienced SCOM admin here, but SCORCH noob. I'm standing up a small SCORCH infrastructure for the first time and need to consider HA and site resilience.

The design will have an Always On clustered DB and at least two Runbook Servers. Is it possible to place one node of the DB cluster and the secondary Runbook Server at another site? Will performance be an issue if, say, the Site A DB server fails over to the Site B DB server? I know that with SCOM it's really important for the Management Servers and DBs to be co-located, as anything more than 10 ms of latency can impact performance. Is this the case for SCORCH? We are setting it up to run just one Runbook at the moment, which will execute very often (ingesting alerts from SCOM).

If a stretched design like that isn't viable, I'll need two instances of SCORCH, one at each site. I'd prefer to avoid that, as the failover instance, complete with an expensive SQL install, will be redundant 99% of the time.

Also, it looks like the Management Server cannot be made HA. So if you lose your Management Server catastrophically, can you rebuild it and reconnect to your Runbooks via the Designer?

System Center Orchestrator
A family of System Center products that provide an automation platform for orchestrating and integrating both Microsoft and non-Microsoft IT tools.

2 answers

  1. Leon Laude 85,681 Reputation points
    2021-05-05T16:22:10.06+00:00

    Hi @OdgeUK1 ,

    I'm not sure whether a multi-site setup is supported for the Orchestrator database, although it might work, as per the following old TechNet thread:
    Orchestrator Database on SQL AlwaysOn not working correctly

    Like SCOM, Orchestrator also requires low latency between the runbook servers and the Orchestrator database, so I would be very careful when planning a multi-site deployment.
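
One rough way to put a number on the latency concern before committing to a stretched design is to time TCP connections from a runbook server to the SQL listener. The sketch below is plain Python, not an Orchestrator tool. The host name `sql-listener.example.local` is hypothetical, and the 10 ms budget is borrowed from the SCOM rule of thumb in the question, not from a documented Orchestrator limit. A TCP connect takes roughly one network round trip, so treat the result as an approximation only.

```python
import socket
import statistics
import time


def tcp_connect_latency_ms(host, port, samples=5, timeout=2.0):
    """Return the median TCP connect time in milliseconds over `samples` attempts."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        # Open and immediately close a connection; the handshake costs about one round trip.
        with socket.create_connection((host, port), timeout=timeout):
            pass
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def within_budget(latency_ms, budget_ms=10.0):
    """Compare a measured latency against an assumed budget (10 ms is a guess, see above)."""
    return latency_ms <= budget_ms


# Example usage (hypothetical host; 1433 is the default SQL Server port):
#   ms = tcp_connect_latency_ms("sql-listener.example.local", 1433)
#   print(f"median connect time: {ms:.1f} ms, within budget: {within_budget(ms)}")
```

Running this from each site against the listener, once with the primary replica at Site A and once after a test failover to Site B, would give a before/after comparison for the cross-site case.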

    The management server cannot be made highly available, but having multiple runbook servers automatically gives you high availability for your runbooks.
    If the management server goes down or crashes, it can be rebuilt fairly quickly, and the outage does not stop the runbooks from running.

    ----------

    (If the reply was helpful please don't forget to upvote and/or accept as answer, thank you)

    Best regards,
    Leon


  2. Andreas Baumgarten 98,621 Reputation points MVP
    2021-05-05T18:27:40.683+00:00

    Hi @OdgeUK1 ,

    Maybe Azure Automation in combination with a Hybrid Runbook Worker is an option for an HA and site-resilient automation solution?
    https://learn.microsoft.com/en-us/azure/automation/automation-hybrid-runbook-worker

    ----------

    (If the reply was helpful please don't forget to upvote and/or accept as answer, thank you)

    Regards
    Andreas Baumgarten
