Configure stateful reliable services

2025-04-13

There are two sets of configuration settings for reliable services. One set is global for all reliable services in the cluster while the other set is specific to a particular reliable service.

Global Configuration

The global reliable service configuration is specified in the cluster manifest under the KtlLogger section. It lets you configure the location and size of the shared log, plus the global memory limits used by the logger. The cluster manifest is a single XML file that holds settings and configurations that apply to all nodes and services in the cluster. The file is typically called ClusterManifest.xml. You can view the cluster manifest for your cluster by using the Get-ServiceFabricClusterManifest PowerShell command.

Configuration names

Name	Unit	Default value	Remarks
WriteBufferMemoryPoolMinimumInKB	Kilobytes	8388608	Minimum number of KB to allocate in kernel mode for the logger write buffer memory pool. This memory pool is used for caching state information before it is written to disk.
WriteBufferMemoryPoolMaximumInKB	Kilobytes	No limit	Maximum size to which the logger write buffer memory pool can grow.
SharedLogId	GUID	""	Specifies a unique GUID that identifies the default shared log file used by all reliable services on all nodes in the cluster that do not specify the SharedLogId in their service-specific configuration. If SharedLogId is specified, SharedLogPath must also be specified.
SharedLogPath	Fully qualified path name	""	Specifies the fully qualified path of the shared log file used by all reliable services on all nodes in the cluster that do not specify the SharedLogPath in their service-specific configuration. If SharedLogPath is specified, SharedLogId must also be specified.
SharedLogSizeInMB	Megabytes	8192	Specifies the number of MB of disk space to statically allocate for the shared log. The value must be 2048 or larger.

In an Azure ARM or on-premises JSON template, the example below shows how to change the size of the shared transaction log that is created to back any reliable collections for stateful services.

"fabricSettings": [{
    "name": "KtlLogger",
    "parameters": [{
        "name": "SharedLogSizeInMB",
        "value": "4096"
    }]
}]

Sample local developer cluster manifest section

If you want to change this on your local development environment, you need to edit the local clustermanifest.xml file.

   <Section Name="KtlLogger">
     <Parameter Name="SharedLogSizeInMB" Value="4096"/>
     <Parameter Name="WriteBufferMemoryPoolMinimumInKB" Value="8192" />
     <Parameter Name="WriteBufferMemoryPoolMaximumInKB" Value="8192" />
     <Parameter Name="SharedLogId" Value="{7668BB54-FE9C-48ed-81AC-FF89E60ED2EF}"/>
     <Parameter Name="SharedLogPath" Value="f:\SharedLog.Log"/>
   </Section>

Remarks

The logger has a global pool of memory, allocated from nonpaged kernel memory, that is available to all reliable services on a node for caching state data before it is written to the dedicated log associated with the reliable service replica. The pool size is controlled by the WriteBufferMemoryPoolMinimumInKB and WriteBufferMemoryPoolMaximumInKB settings. WriteBufferMemoryPoolMinimumInKB specifies both the initial size of this memory pool and the lowest size to which the pool may shrink. WriteBufferMemoryPoolMaximumInKB is the highest size to which the pool may grow. Each reliable service replica that is opened may increase the size of the memory pool by a system-determined amount, up to WriteBufferMemoryPoolMaximumInKB. If demand for memory from the pool exceeds what is available, memory requests are delayed until memory becomes available. Therefore, if the write buffer memory pool is too small for a particular configuration, performance may suffer.

The SharedLogId and SharedLogPath settings are always used together to define the GUID and location of the default shared log for all nodes in the cluster. The default shared log is used for all reliable services that do not specify these settings in the settings.xml for the specific service. For best performance, place shared log files on disks that are used solely for the shared log file, to reduce contention.

SharedLogSizeInMB specifies the amount of disk space to preallocate for the default shared log on all nodes. SharedLogId and SharedLogPath do not need to be specified in order to specify SharedLogSizeInMB.
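As a sketch of the points above, the following KtlLogger section sets only the shared log size and the write buffer pool limits, leaving SharedLogId and SharedLogPath unset. The numeric values are illustrative assumptions, not recommendations.

```xml
<Section Name="KtlLogger">
  <!-- Preallocate 8192 MB for the default shared log (must be 2048 or larger). -->
  <Parameter Name="SharedLogSizeInMB" Value="8192" />
  <!-- Illustrative values: the pool starts at 16384 KB (16 MB) and never shrinks below it. -->
  <Parameter Name="WriteBufferMemoryPoolMinimumInKB" Value="16384" />
  <!-- Illustrative values: the pool may grow to at most 65536 KB (64 MB). -->
  <Parameter Name="WriteBufferMemoryPoolMaximumInKB" Value="65536" />
  <!-- SharedLogId and SharedLogPath are omitted on purpose; if one is set, the other must be set too. -->
</Section>
```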

Service Specific Configuration

You can modify stateful Reliable Services' default configurations by using the configuration package (Config) or the service implementation (code).

Config - Configuration via the config package is accomplished by changing the Settings.xml file that is generated in the Microsoft Visual Studio package root under the Config folder for each service in the application.
Code - Configuration via code is accomplished by creating a ReliableStateManager using a ReliableStateManagerConfiguration object with the appropriate options set.

By default, the Azure Service Fabric runtime looks for predefined section names in the Settings.xml file and consumes the configuration values while creating the underlying runtime components.

Note

Do not delete the section names of the following configurations in the Settings.xml file that is generated in the Visual Studio solution unless you plan to configure your service via code. Renaming the config package or section names will require a code change when configuring the ReliableStateManager.

Replicator security configuration

Replicator security configurations are used to secure the communication channel that is used during replication. This means that services cannot see each other's replication traffic, ensuring that the data that is made highly available is also secure. By default, an empty security configuration section disables replication security.

Important

On Linux nodes, certificates must be PEM-formatted. To learn more about locating and configuring certificates for Linux, see Configure certificates on Linux.

Default section name

ReplicatorSecurityConfig

Note

To change this section name, override the replicatorSecuritySectionName parameter to the ReliableStateManagerConfiguration constructor when creating the ReliableStateManager for this service.

Replicator configuration

Replicator configurations configure the replicator that is responsible for making the stateful Reliable Service's state highly reliable by replicating and persisting the state locally. The default configuration is generated by the Visual Studio template and should suffice. This section describes additional configurations that are available for tuning the replicator.

Default section name

ReplicatorConfig

Note

To change this section name, override the replicatorSettingsSectionName parameter to the ReliableStateManagerConfiguration constructor when creating the ReliableStateManager for this service.

Configuration names

Name	Unit	Default value	Remarks
BatchAcknowledgementInterval	Seconds	0.015	Time period for which the replicator at the secondary waits after receiving an operation before sending back an acknowledgement to the primary. Any other acknowledgements for operations processed within this interval are sent as one response.
ReplicatorEndpoint	N/A	No default (required parameter)	IP address and port that the primary/secondary replicator uses to communicate with other replicators in the replica set. This should reference a TCP resource endpoint in the service manifest. Refer to Service manifest resources to read more about defining endpoint resources in a service manifest.
MaxPrimaryReplicationQueueSize	Number of operations	8192	Maximum number of operations in the primary queue. An operation is freed up after the primary replicator receives an acknowledgement from all the secondary replicators. This value must be greater than 64 and a power of 2.
MaxSecondaryReplicationQueueSize	Number of operations	16384	Maximum number of operations in the secondary queue. An operation is freed up after making its state highly available through persistence. This value must be greater than 64 and a power of 2.
CheckpointThresholdInMB	MB	50	Amount of log file space after which the state is checkpointed.
MaxRecordSizeInKB	KB	1024	Largest record size that the replicator may write to the log. This value must be a multiple of 4 and greater than 16.
MinLogSizeInMB	MB	0 (system determined)	Minimum size of the transactional log. The log is not allowed to truncate to a size below this setting. 0 indicates that the replicator determines the minimum log size. Increasing this value increases the possibility of doing partial copies and incremental backups, because relevant log records are less likely to have been truncated.
TruncationThresholdFactor	Factor	2	Determines the log size at which truncation is triggered. The truncation threshold is MinLogSizeInMB multiplied by TruncationThresholdFactor. TruncationThresholdFactor must be greater than 1. MinLogSizeInMB * TruncationThresholdFactor must be less than MaxStreamSizeInMB.
ThrottlingThresholdFactor	Factor	4	Determines the log size at which the replica starts being throttled. The throttling threshold (in MB) is Max((MinLogSizeInMB * ThrottlingThresholdFactor),(CheckpointThresholdInMB * ThrottlingThresholdFactor)). The throttling threshold (in MB) must be greater than the truncation threshold (in MB). The truncation threshold (in MB) must be less than MaxStreamSizeInMB.
MaxAccumulatedBackupLogSizeInMB	MB	800	Maximum accumulated size (in MB) of backup logs in a given backup log chain. An incremental backup request fails if it would generate a backup log that causes the accumulated backup logs since the relevant full backup to exceed this size. In such cases, the user is required to take a full backup.
SharedLogId	GUID	""	Specifies a unique GUID that identifies the shared log file used with this replica. Typically, services should not use this setting. However, if SharedLogId is specified, SharedLogPath must also be specified.
SharedLogPath	Fully qualified path name	""	Specifies the fully qualified path where the shared log file for this replica will be created. Typically, services should not use this setting. However, if SharedLogPath is specified, SharedLogId must also be specified.
SlowApiMonitoringDuration	Seconds	300	Sets the monitoring interval for managed API calls, for example a user-provided backup callback function. After the interval has passed, a warning health report is sent to the Health Manager.
LogTruncationIntervalSeconds	Seconds	0	Configurable interval at which log truncation is initiated on each replica. It ensures that the log is also truncated based on time instead of just log size. This setting also forces a purge of deleted entries in a reliable dictionary, so it can be used to ensure that deleted items are purged in a timely manner.
EnableStableReads	Boolean	False	Enabling stable reads restricts secondary replicas to returning values that have been quorum-acked.
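The threshold rules in the table can be worked through with a small ReplicatorConfig sketch. The values below are illustrative assumptions chosen to keep the arithmetic simple. MaxStreamSizeInMB is referenced by the table but not listed in it, so it is left unset here; the resulting truncation threshold must still be smaller than whatever value applies.

```xml
<?xml version="1.0" encoding="utf-8"?>
<Settings xmlns="http://schemas.microsoft.com/2011/01/fabric">
   <Section Name="ReplicatorConfig">
      <Parameter Name="ReplicatorEndpoint" Value="ReplicatorEndpoint" />
      <!-- Queue sizes must be powers of 2 and greater than 64. -->
      <Parameter Name="MaxPrimaryReplicationQueueSize" Value="4096" />
      <Parameter Name="MaxSecondaryReplicationQueueSize" Value="8192" />
      <!-- Illustrative log sizing (assumed values, not recommendations). -->
      <Parameter Name="MinLogSizeInMB" Value="100" />
      <Parameter Name="CheckpointThresholdInMB" Value="50" />
      <!-- Truncation threshold = 100 * 2 = 200 MB. -->
      <Parameter Name="TruncationThresholdFactor" Value="2" />
      <!-- Throttling threshold = Max(100 * 4, 50 * 4) = 400 MB, which is above 200 MB as required. -->
      <Parameter Name="ThrottlingThresholdFactor" Value="4" />
      <!-- Also start truncation on a timer, here once an hour. -->
      <Parameter Name="LogTruncationIntervalSeconds" Value="3600" />
   </Section>
</Settings>
```

With these numbers, truncation starts at 200 MB of log and throttling at 400 MB, so the two thresholds stay in the order the table requires.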

Sample configuration via code

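using System;
using System.Fabric;
using Microsoft.ServiceFabric.Data;
using Microsoft.ServiceFabric.Services.Runtime;

class Program
{
    /// <summary>
    /// This is the entry point of the service host process.
    /// </summary>
    static void Main()
    {
        // Register the service type and hand the service a ReliableStateManager
        // that is built with custom replicator settings.
        ServiceRuntime.RegisterServiceAsync("HelloWorldStatefulType",
            context => new HelloWorldStateful(context,
                new ReliableStateManager(context,
                    new ReliableStateManagerConfiguration(
                        new ReliableStateManagerReplicatorSettings()
                        {
                            RetryInterval = TimeSpan.FromSeconds(3)
                        }
                    )))).GetAwaiter().GetResult();
    }
}

// The service accepts the preconfigured state manager through its constructor.
class HelloWorldStateful : StatefulService
{
    public HelloWorldStateful(StatefulServiceContext context, IReliableStateManagerReplica stateManager)
        : base(context, stateManager)
    { }
    ...
}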

Sample configuration file

<?xml version="1.0" encoding="utf-8"?>
<Settings xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.microsoft.com/2011/01/fabric">
   <Section Name="ReplicatorConfig">
      <Parameter Name="ReplicatorEndpoint" Value="ReplicatorEndpoint" />
      <Parameter Name="BatchAcknowledgementInterval" Value="0.05"/>
      <Parameter Name="CheckpointThresholdInMB" Value="512" />
   </Section>
   <Section Name="ReplicatorSecurityConfig">
      <Parameter Name="CredentialType" Value="X509" />
      <Parameter Name="FindType" Value="FindByThumbprint" />
      <Parameter Name="FindValue" Value="9d c9 06 b1 69 dc 4f af fd 16 97 ac 78 1e 80 67 90 74 9d 2f" />
      <Parameter Name="StoreLocation" Value="LocalMachine" />
      <Parameter Name="StoreName" Value="My" />
      <Parameter Name="ProtectionLevel" Value="EncryptAndSign" />
      <Parameter Name="AllowedCommonNames" Value="My-Test-SAN1-Alice,My-Test-SAN1-Bob" />
   </Section>
</Settings>

Remarks

BatchAcknowledgementInterval controls replication latency. A value of "0" results in the lowest possible latency at the cost of throughput, because more acknowledgement messages must be sent and processed, each containing fewer acknowledgements. The larger the value of BatchAcknowledgementInterval, the higher the overall replication throughput, at the cost of higher operation latency. This translates directly to the latency of transaction commits.
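A latency-oriented variant of the ReplicatorConfig section might look like the sketch below. It only illustrates the trade-off described above; whether "0" suits a given service is an assumption to verify under load.

```xml
<Section Name="ReplicatorConfig">
   <Parameter Name="ReplicatorEndpoint" Value="ReplicatorEndpoint" />
   <!-- 0 gives the lowest possible latency, at the cost of throughput. -->
   <!-- A larger value, such as 0.05, favors throughput over commit latency. -->
   <Parameter Name="BatchAcknowledgementInterval" Value="0" />
</Section>
```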

The value for CheckpointThresholdInMB controls the amount of disk space that the replicator can use to store state information in the replica's dedicated log file. Increasing this to a higher value than the default could result in faster reconfiguration times when a new replica is added to the set. This is because more operation history is available in the log, which allows a partial state transfer. It can also potentially increase the recovery time of a replica after a crash.

The MaxRecordSizeInKB setting defines the maximum size of a record that the replicator can write to the log file. In most cases, the default record size of 1024 KB is optimal. However, if the service causes larger data items to be part of the state information, this value might need to be increased. There is little benefit in setting MaxRecordSizeInKB below 1024, because smaller records use only the space they need. We expect that this value would need to be changed in only rare cases.
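For a service whose state includes data items larger than the default record size, the override could be sketched as follows. The value 2048 is an illustrative assumption that satisfies the constraints stated in the table (a multiple of 4 and greater than 16); other limits not covered here may also apply.

```xml
<Section Name="ReplicatorConfig">
   <Parameter Name="ReplicatorEndpoint" Value="ReplicatorEndpoint" />
   <!-- Raise the largest writable record from the default 1024 KB to 2048 KB. -->
   <Parameter Name="MaxRecordSizeInKB" Value="2048" />
</Section>
```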

The SharedLogId and SharedLogPath settings are always used together to make a service use a shared log that is separate from the default shared log for the node. For best efficiency, as many services as possible should specify the same shared log. Shared log files should be placed on disks that are used solely for the shared log file, to reduce head movement contention. We expect that these settings would need to be changed in only rare cases.
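In the rare case where a service does need its own shared log, both settings go together in the ReplicatorConfig section. In this sketch the GUID and the path are hypothetical placeholders, and the braced GUID format is assumed to match the cluster manifest sample.

```xml
<Section Name="ReplicatorConfig">
   <Parameter Name="ReplicatorEndpoint" Value="ReplicatorEndpoint" />
   <!-- Placeholder GUID: generate a unique one for a real deployment. -->
   <Parameter Name="SharedLogId" Value="{3F2504E0-4F89-41D3-9A0C-0305E82C3301}" />
   <!-- Placeholder path: ideally on a disk used only for the shared log. -->
   <Parameter Name="SharedLogPath" Value="g:\ServiceSharedLog.Log" />
</Section>
```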
