Troubleshoot Scale Out
Applies to: SQL Server, SSIS Integration Runtime in Azure Data Factory
SSIS Scale Out involves communication among the SSIS Catalog database SSISDB, the Scale Out Master service, and the Scale Out Worker service. Sometimes this communication breaks due to configuration mistakes, lack of access permissions, and other reasons. This article helps you to troubleshoot issues with your Scale Out configuration.
To investigate the symptoms you encounter, follow the steps below one by one until your problem is resolved.
Scale Out Master fails
Symptoms
Scale Out Master cannot connect to SSISDB.
Master properties are not shown in Scale Out Manager.
Master properties are not populated in the view [catalog].[master_properties].
Solution
Check whether Scale Out is enabled.
In SSMS, in Object Explorer, right-click SSISDB and check the property "Scale Out feature is enabled".
If the property value is False, enable Scale Out by calling the stored procedure [catalog].[enable_scaleout].
Check whether the SQL Server name specified in the Scale Out Master configuration file is correct, and restart the Scale Out Master service.
Scale Out Worker fails
Symptoms
Scale Out Worker cannot connect to Scale Out Master.
Scale Out Worker does not appear after you add it in Scale Out Manager.
Scale Out Worker does not appear in the view [catalog].[worker_agents].
The Scale Out Worker service is running, but the Scale Out Worker is offline.
Solution
Check the error messages in the Scale Out Worker service log under <drive>:\Users\[account running worker service]\AppData\Local\SSIS\Cluster\Agent.
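As a rough aid for reading that log, the following Python sketch scans log text for the error message fragments quoted in the Symptoms of the sections in this article and reports which section heading each one belongs to. The function names are hypothetical, and the sketch assumes the log files can be read as UTF-8 text; adjust the encoding if your logs differ.

```python
from pathlib import Path

# Message fragments quoted in this article's Symptoms, paired with the
# heading of the section that discusses them.
SYMPTOM_FRAGMENTS = [
    ("There was no endpoint listening", "No endpoint listening"),
    ("Could not establish trust relationship", "Could not establish trust relationship"),
    ("The remote certificate is invalid", "Could not establish trust relationship"),
    ("Could not establish secure channel", "Could not establish secure channel"),
    ("Could not create SSL/TLS secure channel", "Could not establish secure channel"),
    ("The HTTP request was forbidden", "HTTP request forbidden"),
    ("(403) Forbidden", "HTTP request forbidden"),
    ("An error occurred while making the HTTP request", "HTTP request error"),
]


def match_sections(log_text: str) -> list[str]:
    """Return the section headings whose symptom text appears in log_text."""
    found = []
    for fragment, section in SYMPTOM_FRAGMENTS:
        if fragment in log_text and section not in found:
            found.append(section)
    return found


def scan_log_folder(folder: str) -> dict[str, list[str]]:
    """Map each file name in folder to the sections its content points to."""
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            # Assumption: the log is UTF-8 compatible text.
            text = path.read_text(encoding="utf-8", errors="replace")
            sections = match_sections(text)
            if sections:
                results[path.name] = sections
    return results
```

Calling scan_log_folder with the log folder path returns only the files that contain a known symptom, so an empty result suggests the cause is not one of the cases covered here.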
No endpoint listening
Symptoms
"System.ServiceModel.EndpointNotFoundException: There was no endpoint listening at https://[MachineName]:[Port]/ClusterManagement/ that could accept the message."
Solution
Check whether the port number specified in the Scale Out Master service configuration file is correct, and restart the Scale Out Master service.
Check whether the master endpoint specified in the Scale Out Worker service configuration is correct, and restart the Scale Out Worker service.
Check whether the firewall port is open on the Scale Out Master node.
Resolve any other connection issues between the Scale Out Master node and the Scale Out Worker node.
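To narrow down whether the problem is the port, the firewall, or the network path, a basic TCP check run from the Scale Out Worker node can help. The Python sketch below is a minimal example; the function name, host name, and port in the usage comment are placeholders, not values from your configuration.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Hypothetical usage; take the host and port from the master endpoint
# in the Scale Out Worker configuration:
# print(can_connect("master-machine", 8391))
```

A successful connection only shows that something is listening on the port and that the firewall lets the traffic through. It does not validate the certificate or the endpoint path.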
Could not establish trust relationship
Symptoms
"System.ServiceModel.Security.SecurityNegotiationException: Could not establish trust relationship for the SSL/TLS secure channel with authority '[Machine Name]:[Port]'."
"System.Net.WebException: The underlying connection was closed: Could not establish trust relationship for the SSL/TLS secure channel."
"System.Security.Authentication.AuthenticationException: The remote certificate is invalid according to the validation procedure."
Solution
Install the Scale Out Master certificate to the Root certificate store of the local computer on the Scale Out Worker node, if the certificate is not yet installed, and restart the Scale Out Worker service.
Check whether the host name in the master endpoint is included in the CNs of the Scale Out Master certificate. If not, reset the master endpoint in the Scale Out Worker configuration file, and restart the Scale Out Worker service.
Note
If it's not possible to change the host name of the master endpoint due to DNS settings, you have to change the Scale Out Master certificate. See Manage certificates for SSIS Scale Out.
Check whether the master thumbprint specified in the Scale Out Worker configuration matches the thumbprint of the Scale Out Master certificate.
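The host name and thumbprint comparisons above are easy to get wrong by eye, because thumbprints copied from a certificate dialog can carry spaces or invisible characters. The following Python sketch shows one way to compare them; the function names are hypothetical, and the list of CNs is assumed to have been read from the certificate beforehand. The thumbprint that Windows displays is the SHA-1 hash of the DER-encoded certificate, which the last helper computes.

```python
import hashlib
import re
from urllib.parse import urlsplit


def normalize_thumbprint(value: str) -> str:
    """Keep only hexadecimal digits and convert them to upper case."""
    return re.sub(r"[^0-9A-Fa-f]", "", value).upper()


def thumbprints_match(configured: str, actual: str) -> bool:
    """Compare two thumbprints, ignoring case, spaces, and hidden characters."""
    left = normalize_thumbprint(configured)
    right = normalize_thumbprint(actual)
    return left != "" and left == right


def endpoint_host_in_cns(master_endpoint: str, cns: list[str]) -> bool:
    """Check whether the host name in the master endpoint is one of the CNs."""
    host = urlsplit(master_endpoint).hostname  # already lower case
    if host is None:
        return False
    return host in {cn.strip().lower() for cn in cns}


def sha1_thumbprint(der_bytes: bytes) -> str:
    """Compute the thumbprint of a DER-encoded certificate."""
    return hashlib.sha1(der_bytes).hexdigest().upper()
```

For example, endpoint_host_in_cns("https://Master01:8391", ["master01", "master01.contoso.com"]) returns True; the endpoint and CNs here are made-up values.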
Could not establish secure channel
Symptoms
"System.ServiceModel.Security.SecurityNegotiationException: Could not establish secure channel for SSL/TLS with authority '[Machine Name]:[Port]'."
"System.Net.WebException: The request was aborted: Could not create SSL/TLS secure channel."
Solution
Check whether the account running the Scale Out Worker service has access to the Scale Out Worker certificate by running the following command:
winhttpcertcfg.exe -l -c LOCAL_MACHINE\MY -s {CN of the worker certificate}
If the account does not have access, grant access by running the following command, and restart the Scale Out Worker service.
winhttpcertcfg.exe -g -c LOCAL_MACHINE\My -s {CN of the worker certificate} -a {the account running Scale Out Worker service}
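If you script these checks, the two commands can be built as argument lists so that a CN or account name containing spaces does not need manual quoting. This Python sketch only mirrors the flags shown above; the helper names are hypothetical, and running the commands requires a Windows machine with winhttpcertcfg.exe available and sufficient privileges.

```python
import subprocess

# Certificate store used in the commands in this article.
STORE = r"LOCAL_MACHINE\My"


def list_access_command(cert_cn: str) -> list[str]:
    """Arguments that list the accounts with access to the certificate."""
    return ["winhttpcertcfg.exe", "-l", "-c", STORE, "-s", cert_cn]


def grant_access_command(cert_cn: str, account: str) -> list[str]:
    """Arguments that grant the account access to the certificate."""
    return ["winhttpcertcfg.exe", "-g", "-c", STORE, "-s", cert_cn, "-a", account]


def run(command: list[str]) -> str:
    """Run a command and return its standard output as text."""
    completed = subprocess.run(command, capture_output=True, text=True, check=False)
    return completed.stdout
```

For example, run(list_access_command("worker01")) would return the tool's output for a certificate whose CN is the placeholder value worker01.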
HTTP request forbidden
Symptoms
"System.ServiceModel.Security.MessageSecurityException: The HTTP request was forbidden with client authentication scheme 'Anonymous'."
"System.Net.WebException: The remote server returned an error: (403) Forbidden."
Solution
Install the Scale Out Worker certificate to the Root certificate store of the local computer on the Scale Out Master node, if the certificate is not yet installed, and restart the Scale Out Worker service.
Clean up unneeded certificates, especially non-self-signed certificates, in the Root certificate store of the local computer on the Scale Out Master node.
Configure Schannel to no longer send the list of trusted root certification authorities during the TLS/SSL handshake process by adding the following registry entry on the Scale Out Master node.
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL
Value name: SendTrustedIssuerList
Value type: REG_DWORD
Value data: 0 (False)
If it is not possible to clean up all non-self-signed certificates as described in step 2, set the value of the following registry key to 2.
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL
Value name: ClientAuthTrustMode
Value type: REG_DWORD
Value data: 2
Note
If you have non-self-signed certificates in the Root certificate store, client certificate authentication fails. For more info, see Internet Information Services (IIS) 8 may reject client certificate requests with HTTP 403.7 or 403.16 errors.
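The registry values described in this section can also be captured in a .reg file for review before they are applied. The Python sketch below only renders the file content; the function name is hypothetical, and importing the resulting file on the Scale Out Master node requires administrative rights.

```python
# Registry key named in this section.
SCHANNEL_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet"
    r"\Control\SecurityProviders\SCHANNEL"
)


def render_reg_file(values: dict[str, int]) -> str:
    """Render REG_DWORD values under the SCHANNEL key as .reg file text."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{SCHANNEL_KEY}]"]
    for name, data in values.items():
        if not 0 <= data <= 0xFFFFFFFF:
            raise ValueError(f"{name} does not fit in a REG_DWORD")
        # REG_DWORD values are written as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{data:08x}')
    return "\r\n".join(lines) + "\r\n"


# Values taken from the steps above:
# render_reg_file({"SendTrustedIssuerList": 0})
# render_reg_file({"ClientAuthTrustMode": 2})
```

Keeping the two values in separate calls reflects the steps above: ClientAuthTrustMode is only needed when the Root certificate store cannot be cleaned up.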
HTTP request error
Symptoms
"System.ServiceModel.CommunicationException: An error occurred while making the HTTP request to https://[Machine Name]:[Port]/ClusterManagement/. This could be due to the fact that the server certificate is not configured properly with HTTP.SYS in the HTTPS case. This could also be caused by a mismatch of the security binding between the client and the server."
Solution
Check whether the Scale Out Master certificate is bound to the port in the master endpoint correctly on the master node by running the following command:
netsh http show sslcert ipport=0.0.0.0:{Master port}
Check whether the certificate hash displayed matches the Scale Out Master certificate thumbprint. If the binding is not correct, reset the binding by running the following commands, and restart the Scale Out Worker service.
netsh http delete sslcert ipport=0.0.0.0:{Master port}
netsh http add sslcert ipport=0.0.0.0:{Master port} certhash={Master certificate thumbprint} certstorename=Root appid={random guid}
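The comparison between the bound certificate hash and the Scale Out Master thumbprint can be sketched in Python as follows. The helper names are hypothetical, and the parser assumes the English netsh output, in which the hash appears on a line labeled "Certificate Hash". The last helper composes the two rebinding commands with a freshly generated GUID for appid.

```python
import re
import uuid


def _hex_only(value: str) -> str:
    """Strip everything except hexadecimal digits and upper-case the rest."""
    return re.sub(r"[^0-9A-Fa-f]", "", value).upper()


def parse_certificate_hash(netsh_output: str):
    """Extract the certificate hash from 'netsh http show sslcert' output."""
    # Assumption: English output with a "Certificate Hash : <hex>" line.
    match = re.search(r"Certificate Hash\s*:\s*([0-9A-Fa-f]+)", netsh_output)
    return match.group(1).upper() if match else None


def binding_matches(netsh_output: str, master_thumbprint: str) -> bool:
    """Return True if the bound hash equals the master thumbprint."""
    bound = parse_certificate_hash(netsh_output)
    return bound is not None and bound == _hex_only(master_thumbprint)


def rebind_commands(master_port: int, master_thumbprint: str) -> list[str]:
    """Compose the delete and add commands with a random appid GUID."""
    ipport = f"0.0.0.0:{master_port}"
    app_id = "{" + str(uuid.uuid4()) + "}"
    return [
        f"netsh http delete sslcert ipport={ipport}",
        f"netsh http add sslcert ipport={ipport} "
        f"certhash={_hex_only(master_thumbprint)} "
        f"certstorename=Root appid={app_id}",
    ]
```

The commands are returned as text rather than executed, so they can be reviewed and then run from an elevated prompt on the master node.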
Cannot open certificate store
Symptoms
Validation fails when connecting a Scale Out Worker to the Scale Out Master in Scale Out Manager with the error message, "Cannot open certificate store on the machine."
Solution
Run Scale Out Manager as administrator. If you open Scale Out Manager with SSMS, you have to run SSMS as administrator.
Start the Remote Registry service on the computer if it is not running.
Execution doesn't start
Symptoms
Execution in Scale Out does not start.
Solution
Check the status of the computers you selected to run the package in the view [catalog].[worker_agents]. At least one worker must be online and enabled.
No log
Symptoms
Packages run successfully, but no messages are logged.
Solution
Check whether SQL Server Authentication is allowed by the SQL Server instance that hosts SSISDB.
Note
If you have changed the account for Scale Out logging, see Change the Account for Scale Out Logging and verify the connection string used for logging.
Error messages aren't helpful
Symptoms
The error messages in the package execution report are not sufficient for troubleshooting.
Solution
More execution logs can be found under the TasksRootFolder configured in WorkerSettings.config. By default, this folder is <drive>:\Users\[account]\AppData\Local\SSIS\ScaleOut\Tasks. The [account] is the account running the Scale Out Worker service, with default value SSISScaleOutWorker140.
To locate the log for the package execution with [execution ID], execute the following Transact-SQL command to get the [task ID]. Then, find the subfolder name that contains [task ID] under TasksRootFolder.
-- Replace {your execution ID} with the ID of the package execution.
SELECT tasks.[TaskId]
FROM [SSISDB].[internal].[tasks] AS tasks
INNER JOIN [SSISDB].[internal].[executions] AS executions
    ON tasks.[JobId] = executions.[job_id]
WHERE executions.[execution_id] = {your execution ID};
Warning
This query is for troubleshooting purposes only. The internal views referenced in the query are subject to change in the future.
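Once the query returns a task ID, finding the matching subfolder under TasksRootFolder can be sketched in Python as follows. The function name is hypothetical, and the match is a case-insensitive substring test on the folder name, which follows the description above rather than any documented naming scheme.

```python
from pathlib import Path


def find_task_folders(tasks_root: str, task_id: str) -> list[Path]:
    """Return subfolders of tasks_root whose name contains task_id."""
    needle = task_id.strip().lower()
    if not needle:
        return []
    return sorted(
        entry
        for entry in Path(tasks_root).iterdir()
        if entry.is_dir() and needle in entry.name.lower()
    )
```

Pass the TasksRootFolder value from WorkerSettings.config as tasks_root. An empty list means no subfolder name contains the task ID, which may indicate that the execution ran on a different worker.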
Next steps
For more info, see the following articles about setting up and configuring SSIS Scale Out: