Service Fabric upgrade stuck in Upgrading (Updating default service(s))

Jose Fernandez Alameda 1 Reputation point
2020-10-09T10:18:39.45+00:00

Service Fabric Runtime Version: 7.1.458.9590

Environment: Azure 5 nodes cluster

I started a rolling upgrade via Azure DevOps as usual (I am using the regular Service Fabric application deployment template). This time, for no apparent reason, the rolling upgrade got stuck.

The strange thing is that the upgrade duration is still at 00:00:00:

UpgradeDuration                         : 00:00:00
CurrentUpgradeDomainDuration            : 00:00:00

I tried following the recommendations in this post (https://stackoverflow.com/questions/51664608/how-do-i-stop-service-fabric-application-upgrade), particularly the last one:

You can change the Upgrade Domain Timeout and the Upgrade Timeout of a running upgrade by invoking the Update-ServiceFabricApplicationUpgrade command in a Service Fabric Powershell task.

This had no effect; the application is still stuck in the upgrade. I also tried using -ForceRestart 1 and -Force.
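For reference, a minimal sketch of the kind of Update-ServiceFabricApplicationUpgrade call described in the quoted advice. The application name and the timeout values are placeholders, not the ones used in this cluster, and the session is assumed to be already connected with Connect-ServiceFabricCluster.

```powershell
# Hypothetical application name; substitute the real fabric:/ URI.
$appName = 'fabric:/MyApp'

# Shorten the timeouts of the upgrade that is currently in progress.
# The values (in seconds) are illustrative only.
Update-ServiceFabricApplicationUpgrade `
    -ApplicationName $appName `
    -UpgradeDomainTimeoutSec 600 `
    -UpgradeTimeoutSec 1200 `
    -ForceRestart $true `
    -Force

# Check whether the new settings show up on the running upgrade.
Get-ServiceFabricApplicationUpgrade -ApplicationName $appName
```

These timeouts are only evaluated while the upgrade is moving through upgrade domains, so they may not help when, as here, no upgrade domain has been started yet.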

This is very unfortunate: I had to manually delete two services that are removed in the version being rolled out, which has left my application broken. Is there any way I can bring those two services back up even though a rolling upgrade is in progress? One of them is a stateless service and the other is stateful.

I would appreciate help as this is a very critical bug.

This is the whole output I get from `Get-ServiceFabricApplicationUpgrade`:

cmdlet Get-ServiceFabricApplicationUpgrade at command pipeline position 1
Supply values for the following parameters:
ApplicationName: fabric:/DHCloud


ApplicationName                         : fabric:/XXXX
ApplicationTypeName                     : XXXX
TargetApplicationTypeVersion            : 1.0.1+.20201008.1
ApplicationParameters                   : XXXX
StartTimestampUtc                       : 08/10/2020 14:32:35
UpgradeState                            : RollingForwardInProgress
UpgradeStatusDetails                    : Updating default service(s)
UpgradeDuration                         : 00:00:00
CurrentUpgradeDomainDuration            : 00:00:00
NextUpgradeDomain                       : 
UpgradeDomainsStatus                    : {}
UpgradeKind                             : Rolling
RollingUpgradeMode                      : Monitored
FailureAction                           : Rollback
ForceRestart                            : True
UpgradeReplicaSetCheckTimeout           : 49710.06:28:15
HealthCheckWaitDuration                 : 00:00:00
HealthCheckStableDuration               : 00:02:00
HealthCheckRetryTimeout                 : 00:10:00
UpgradeDomainTimeout                    : 00:11:40
UpgradeTimeout                          : 00:11:40
ConsiderWarningAsError                  : False
MaxPercentUnhealthyPartitionsPerService : 0
MaxPercentUnhealthyReplicasPerPartition : 0
MaxPercentUnhealthyServices             : 0
MaxPercentUnhealthyDeployedApplications : 0
ServiceTypeHealthPolicyMap              : None
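As a reading aid for the timeout fields above, the following sketch converts them to seconds using plain .NET TimeSpan parsing (no cluster connection needed). The interpretation in the comments is an assumption based on the arithmetic, not something the output states.

```powershell
# UpgradeDomainTimeout / UpgradeTimeout as printed above.
$domainTimeout = [TimeSpan]::Parse('00:11:40')
$domainTimeout.TotalSeconds          # 700

# UpgradeReplicaSetCheckTimeout as printed above.
$replicaCheck = [TimeSpan]::Parse('49710.06:28:15')
$replicaCheck.TotalSeconds           # 4294967295

# 4294967295 is the largest unsigned 32-bit value, which suggests
# (assumption) that no explicit replica-set check timeout was set.
$replicaCheck.TotalSeconds -eq [uint32]::MaxValue   # True
```

Note that UpgradeStartTimestampUtc is more than 11 minutes 40 seconds in the past while UpgradeDuration is still 00:00:00, so the 700-second upgrade timeout does not appear to have started counting.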

Update 1:

Reading through the Service Fabric PowerShell documentation, I found the following command, which managed to cancel the stuck upgrade:

Start-ServiceFabricApplicationRollback -ApplicationName XXXX

Still, after rolling back the upgrade, I end up in the same situation when I try to upgrade the application again.
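A minimal sketch of the rollback followed by a status check, so the next upgrade attempt is only started once the rollback has actually finished. The application name and the 10-second polling interval are placeholders, and a connected session (Connect-ServiceFabricCluster) is assumed.

```powershell
# Hypothetical application name; substitute the real fabric:/ URI.
$appName = 'fabric:/MyApp'

# Ask the cluster to roll the stuck upgrade back.
Start-ServiceFabricApplicationRollback -ApplicationName $appName

# Poll until the upgrade is no longer in an "...InProgress" state,
# e.g. RollingBackInProgress -> RollingBackCompleted.
do {
    Start-Sleep -Seconds 10
    $upgrade = Get-ServiceFabricApplicationUpgrade -ApplicationName $appName
    Write-Host "$($upgrade.UpgradeState): $($upgrade.UpgradeStatusDetails)"
} while ($upgrade.UpgradeState -like '*InProgress')
```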

Update 2:

I would like to add more information:

The upgrade only gets stuck with one version in particular. That version removes two services (one stateless and one stateful, which I have to delete manually) and adds a new Docker service.

Upgrading to another version that neither adds nor removes any services goes through.

I could deploy the stuck version on a single node cluster with no issues.

Azure Service Fabric

1 answer

  1. prmanhas-MSFT 17,891 Reputation points Microsoft Employee
    2020-10-14T07:06:12.523+00:00

    @Jose Fernandez Alameda Firstly, apologies for the delay in responding on this and any inconvenience this issue may have caused.

    Could you try following the article below, which briefly explains how default services are upgraded:

    https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-application-upgrade#upgrade-default-services
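
    The linked section describes a cluster setting, EnableDefaultServicesUpgrade (in the ClusterManager section), that has to be true before an application upgrade will update or delete default services. As a rough sketch, and assuming the Az.ServiceFabric module is installed, it could be set on an Azure-hosted cluster like this; the resource group and cluster names are placeholders, and whether this setting is what blocks the upgrade in this thread is not confirmed.

    ```powershell
    # Hypothetical resource group and cluster names.
    $resourceGroup = 'my-resource-group'
    $clusterName   = 'my-sf-cluster'

    # Enable update and deletion of default services during application upgrades.
    # Changing a fabric setting triggers a cluster configuration upgrade.
    Set-AzServiceFabricSetting `
        -ResourceGroupName $resourceGroup `
        -Name $clusterName `
        -Section 'ClusterManager' `
        -Parameter 'EnableDefaultServicesUpgrade' `
        -Value 'true'
    ```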

    Hope it helps!!!

    Please 'Accept as answer' if it helped, so that it can help others in the community looking for help on similar topics

    1 person found this answer helpful.