SSIS Scale Out worker is not showing up in Worker Agents

 

Understanding of the issue:

Adding the Scale Out Worker to the Master is reported as successful, but the worker does not show up in SQL Server Integration Services Manage Scale Out (ISManager). It is also not added to [SSISDB].[internal].[worker_agents].

The wizard completes the step of adding the SSIS worker without reporting an error.

After it finishes, however, the worker is still missing from both ISManager and worker_agents.

 

The logs for both the master and the worker are in the following locations:

Master: <path>:\Users\[account]\AppData\Local\SSIS\ScaleOut\Master

(Example: C:\Users\SSISScaleOutMaster140\AppData\Local\SSIS\ScaleOut\Master)

 

Worker: <path>:\Users\[account]\AppData\Local\SSIS\ScaleOut\Agent

(Example: C:\Users\SSISScaleOutWorker140\AppData\Local\SSIS\ScaleOut\Agent)
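
To narrow down which of the checks below applies, it can help to search the log folder for the exception names quoted in this article. The following is a minimal sketch, not an official tool: the function name `find_error_lines` and the marker list are illustrative, and it assumes the log files are plain text that can be read as UTF-8.

```python
from pathlib import Path

# Exception names taken from the error samples in this article.
MARKERS = ("EndpointNotFoundException", "SocketException", "WebException")


def find_error_lines(log_dir, markers=MARKERS):
    """Return (file name, line number, text) for each log line containing a marker."""
    hits = []
    for path in sorted(Path(log_dir).iterdir()):
        if not path.is_file():
            continue
        # Assumption: logs are UTF-8 compatible text; undecodable bytes are replaced.
        with path.open("r", encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if any(marker in line for marker in markers):
                    hits.append((path.name, number, line.strip()))
    return hits
```

Calling `find_error_lines` with the Agent folder of the worker service account returns the matching lines, which can then be compared with the error samples in the sections below.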

 

Things to verify for this issue:

  • The port is open

The port used by the Master service must be open in the firewall on the master machine. By default, the port is 8391.

Possible error in the worker error log if the port is not open:

System.ServiceModel.EndpointNotFoundException: There was no endpoint listening at https://Master.domain.com:8391/ClusterManagement/ that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details. ---> System.Net.WebException: Unable to connect to the remote server ---> System.Net.Sockets.SocketException: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond <MasterIP>:8391

   at System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress)

   at System.Net.ServicePoint.ConnectSocketInternal(Boolean connectFailure, Socket s4, Socket s6, Socket& socket, IPAddress& address, ConnectSocketState state, IAsyncResult asyncResult, Exception& exception)

   --- End of inner exception stack trace ---

   at System.Net.HttpWebRequest.GetRequestStream(TransportContext& context)

   at System.Net.HttpWebRequest.GetRequestStream()

   at System.ServiceModel.Channels.HttpOutput.WebRequestHttpOutput.GetOutputStream()
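
A quick way to confirm whether the port is reachable is to attempt a plain TCP connection from the worker machine. The sketch below uses only the Python standard library; the function name `can_connect` is illustrative, and the default port 8391 is the default mentioned above.

```python
import socket


def can_connect(host, port=8391, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Running `can_connect("Master.domain.com")` on the worker (with the real master name) and getting `False` is consistent with the timeout shown in the log above. It only tests reachability of the port, not the certificates.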

 

  • Validate certificates in certificate store

Worker server: The worker machine needs the master certificate in its Trusted Root store and the local worker certificate in its Personal store.

If the master certificate is not present, a copy of it is available on the master machine in the DTS\Binn folder (Example: C:\Program Files\Microsoft SQL Server\140\DTS\Binn\SSISScaleOutMaster.cer)

Copy the certificate from this location and install it on the worker.

 

Master server: The master server needs both the master certificate and the worker certificate in its Trusted Root store.

If the worker certificate is not present, it can be found in the DTS\Binn folder of the worker server (Example: "C:\Program Files\Microsoft SQL Server\140\DTS\Binn\SSISScaleOutWorker.cer")

Note: By default, adding the worker attempts to make the appropriate changes to the worker config file and to install the certificates.

 

  • Validate worker configuration

Worker config file location: \DTS\Binn\WorkerSettings.config

Example: C:\Program Files\Microsoft SQL Server\140\DTS\Binn\WorkerSettings.config

 

Possible error in the worker log if the worker is configured with an incorrect master name:

Error when sending agent heartbeat.

System.ServiceModel.EndpointNotFoundException:

There was no endpoint listening at https://wrongmaster.domain.com:8391/ClusterManagement/ that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details. ---> System.Net.WebException: The remote name could not be resolved: 'wrongmaster.domain.com'

at System.Net.HttpWebRequest.GetRequestStream(TransportContext& context)

at System.Net.HttpWebRequest.GetRequestStream()

at System.ServiceModel.Channels.HttpOutput.WebRequestHttpOutput.GetOutputStream()
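
The "remote name could not be resolved" message points at the host name in the configured master endpoint. As a rough check, the host can be extracted from the endpoint URL and resolved from the worker machine. This is a sketch using the Python standard library; `endpoint_host_port` and `resolve` are illustrative names.

```python
import socket
from urllib.parse import urlsplit


def endpoint_host_port(endpoint):
    """Split an endpoint URL such as https://host:8391 into (host, port)."""
    parts = urlsplit(endpoint)
    return parts.hostname, parts.port


def resolve(host):
    """Return the sorted IP addresses for host, or an empty list if it does not resolve."""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return []
    # Each entry's fifth element is the socket address; its first item is the IP.
    return sorted({info[4][0] for info in infos})
```

An empty list from `resolve` for the host returned by `endpoint_host_port` suggests that the master name in the worker configuration is wrong or not resolvable from the worker.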

 

  • Verify that the master endpoint and certificate thumbprints are mapped correctly

"MasterEndpoint": "https://masterserver.domain.com:8391",
"MasterHttpsCertThumbprint": "<Master Certificate Thumbprint here>",
"WorkerHttpsCertThumbprint": "<Worker Certificate Thumbprint here>",

 

To get the certificate thumbprint:

Double-click the certificate -> Details -> Thumbprint
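
The thumbprint shown in the certificate dialog is the SHA-1 hash of the binary (DER) certificate, so it can also be computed from the exported .cer file and compared with the value in the config file. The following is a minimal sketch; `cert_thumbprint` is an illustrative name, and the function accepts either a DER file or a Base-64 (PEM) export.

```python
import hashlib
import ssl


def cert_thumbprint(cert_bytes):
    """Return the SHA-1 thumbprint (uppercase hex) of a certificate given as DER or PEM bytes."""
    if b"-----BEGIN CERTIFICATE-----" in cert_bytes:
        # Base-64 (PEM) export: convert to the binary DER form before hashing.
        pem_text = cert_bytes.decode("ascii").strip()
        cert_bytes = ssl.PEM_cert_to_DER_cert(pem_text)
    return hashlib.sha1(cert_bytes).hexdigest().upper()
```

For example, reading SSISScaleOutMaster.cer in binary mode and passing the bytes to `cert_thumbprint` yields a 40-character string that should match the master thumbprint value, ignoring case and any spaces the dialog displays.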

Make sure the config file is correct, then restart the worker service.
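
Before restarting the service, the three worker settings can be sanity-checked in a script. The sketch below rests on two assumptions that should be verified against the actual file: that WorkerSettings.config is a JSON document, and that thumbprints are stored as 40 hexadecimal characters without separators. `check_worker_settings` is an illustrative name.

```python
import json
import re

REQUIRED_KEYS = ("MasterEndpoint", "MasterHttpsCertThumbprint", "WorkerHttpsCertThumbprint")


def check_worker_settings(text):
    """Return a list of problems found in the WorkerSettings.config text."""
    settings = json.loads(text)  # Assumption: the config file is valid JSON.
    problems = []
    for key in REQUIRED_KEYS:
        if not settings.get(key):
            problems.append(f"{key} is missing or empty")
    endpoint = settings.get("MasterEndpoint", "")
    if endpoint and not endpoint.lower().startswith("https://"):
        problems.append("MasterEndpoint does not start with https://")
    for key in REQUIRED_KEYS[1:]:
        value = settings.get(key, "")
        # Assumption: a thumbprint is exactly 40 hex characters, no spaces.
        if value and not re.fullmatch(r"[0-9A-Fa-f]{40}", value):
            problems.append(f"{key} is not 40 hex characters (check for spaces or hidden characters)")
    return problems
```

An empty list means only that the values are well formed. Whether they match the installed certificates and the real master name still has to be verified as described above.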

After verifying all of the above settings, add the worker from ISManager again.

If the issue persists, check whether the worker service is running under a domain account. If it is, try changing the account to Local System and add the worker again. After the worker is added, change the account back to the domain account and restart the worker service.

 

Additional information:

If the master service itself is not shown as online, or an error says that the master service is not installed on this server, check the master configuration and map it to the correct instance.

Master config file location:

<path>:\Program Files\Microsoft SQL Server\140\DTS\Binn\MasterSettings.config

"PortNumber": 8391,
"SSLCertThumbprint": "<Master certificate thumbprint>",
"SqlServerName": "<masterserver\\instancename>",
"CleanupCompletedJobsIntervalInMs": 43200000,
"DealWithExpiredTasksIntervalInMs": 300000,
"MasterHeartbeatIntervalInMs": 30000,
"SqlConnectionTimeoutInSecs": 15

Changing the server name in "SqlServerName" maps the master to a different instance.

 

Author:     Chaitra Hegde – Support Engineer, SQL Server BI Developer team, Microsoft

Reviewer:   Krishnakumar Rukmangathan – Support Escalation Engineer, SQL Server BI Developer team, Microsoft