Stretching cluster across multiple sites doubt

Question

Stretching cluster across multiple sites doubt

ovonemty 56

I am reading this summary from the book 'Pro SQL Server 2019 Administration' from Carter.

In the section for "AlwaysOn Failover Clustering", he mentions:

A few questions:

Why he says "With SQL Server–level HA technologies, write operations occur on the primary database and then again on all secondary databases, before the commit on the primary completes" if in the exact previous line he says "written to disk once".

Then, why he says "Even though it is possible to stretch a cluster across multiple sites, this involves SAN replication, which means that a cluster is normally configured within a single site."

If a cluster is scretched across multiple sites and this inolves SAN replication, then.... Is this still clustering?! I thought the point of clustering was having only one disk....

If a cluster is scretched across multiple sites and this inolves SAN replication, how come this means that a cluster is configured within a single site?!?! it sound like a contradiction... I guess that if the cluster is stretched across multiple sites then cluster is configured across multiple sites, by definition....

5 answers

Your answer

Answer 1

Why he says "With SQL Server–level HA technologies, write operations occur on the primary database and then again on all secondary databases, before the commit on the primary completes" if in the exact previous line he says "written to disk once".
My comment: If you are running a Failover Cluster Instance (FCI), that's not a SQL Server-level High-Availability solution, instead it's a Windows Cluster solution.
So in the paragraph you have screenshotted, the author is referring to the HA-solutions available from within SQL Server (often in combination with a Windows cluster, as is the case for Availability Groups).

Then, why he says "Even though it is possible to stretch a cluster across multiple sites, this involves SAN replication, which means that a cluster is normally configured within a single site.
My comment: This refers to the HA-solutions within SQL Server, like mirroring, Availability Groups, not FCI.

If a cluster is scretched across multiple sites and this inolves SAN replication, then.... Is this still clustering?! I thought the point of clustering was having only one disk....
My comment: In my opinion, only having one disk is the weak part of FCI solution. I know some IT-operations provider claim they do "realtime" SAN replication to a secondary site. I don't agree that this is a good enough solution, as there's no way to replication on SAN level in a way that is 100% for database operations. But yes, it is still clustering - if a SAN is replicated to a secondary storage appliance, that's nothing that the clustere even needs to be aware of, it's in another infrastructure tier than the cluster.

*If a cluster is scretched across multiple sites and this inolves SAN replication, how come this means that a cluster is configured within a single site?!?! it sound like a contradiction... I guess that if the cluster is stretched across multiple sites then cluster is configured across multiple sites, by definition.... *
Not sure about this one, it's hard to say anything about that single sentence without the context (the context being the rest of the chapter or even the rest of the book 😊)

Answer 2

Why he says "With SQL Server–level HA technologies, write operations occur on the primary database and then again on all secondary databases, before the commit on the primary completes"

To understand the author completely you need to show us complete paragraph or may be complete page on the topic he is covering. When he says it writes once then he is pointing to SQL Server Failover Cluster Instance (FCI) but when he is referring to quote above he is talking about availability groups. In AG data is written on primary and all secondary replicas while in FCI due to shared storage data is written at one point ( that is SAN).

Then, why he says "Even though it is possible to stretch a cluster across multiple sites, this involves SAN replication, which means that a cluster is normally configured within a single site."

WSFC can be configured both at with nodes at same site and with nodes at different geographical location. That totally depends on your requirement. In case of stretched cluster you need to sync storage and make it look like it is shared with nodes at both the location.

I thought the point of clustering was having only one disk....

Normal SQL Server FCI has one shared disk but in case if you are trying for Stretch cluster you have to rely on storage replication technology so that to "make nodes believe that they are seeing the same storage". This can only be achieved by making sure upto date data is on both the sides.

I guess that if the cluster is stretched across multiple sites then cluster is configured across multiple sites, by definition....

Some people do call it multi site cluster, I do not see anything wrong in that but. WSFC has grown a lot since it was created and now can be configured in many various configurations one of which is multi site.

Answer 3

Erland Sommarskog 121.9K MVP Volunteer Moderator

I can't say that I am impressed by that passage. The Always On moniker gathers both FCI and Availability Groups under the same umbrella, so saying that only AGs is "SQL Server-level HA" is dubious. Then again, I don't know what definitions the author may have introduced prior to this paragraph.

write operations occur on the primary database and then again on all secondary databases, before the commit on the primary completes

Not really. There are two things that needs to be clarified here:

Write operation needs to be written to the transaction log of the secondaries, but not to the data file.
This applies only secondaries on synchronous mode.

Magnus Ahlkvist 76 Reputation points MVP

2020-10-19T21:55:25.247+00:00

I’m not sure I read it the same way as you do. I make a distinction between FCI, which is mainly a solution controlled outside of Sql Server while Availability Groups and mirroring are controlled largely by Sql Server. That’s the distinction I think the author is trying to make.
And for the rest: Yes, it’s overly simplified to a level that even makes it incorrect. I have as example never setup synchronous replicas in secondary datacenters, because the latency has always been too big an issue, while it has worked fine with synchronous secondary replicas within the same data Center as the primary replica.
Shashank Singh 6,251 Reputation points

2020-10-20T03:52:41.71+00:00

@Erland Sommarskog Even I am not impressed but unless I read some pages and try to get what author is saying it would not be fair to comment. That is why I just tried to clear the doubts. Hope author sees this thread and give his share of thoughts

Answer 4

Hi @maca128-1653,

Why he says "With SQL Server–level HA technologies, write operations occur on the primary database and then again on all secondary databases, before the commit >on the primary completes" if in the exact previous line he says "written to disk once".

"written to disk once" is for clustering.

"With SQL Server–level HA technologies, write operations occur on the primary database and then again on all secondary databases, before the commit on the primary completes"

Actually,the availability mode determines whether the primary replica waits to commit transactions on a database until a given secondary replica has written the transaction log records to disk (hardened the log).

To understanding this sentence, we need to understanding how sql server achieve HA, the next is one AG working steps:
① The logwiter of the primary replica records the log information modified by the transaction into a log buffer in the memory, and then writes it to the physical log file (log solidification);

② The logscanner of the primary replica reads the log block from the cache or log file, and then sends it to the log block decoder of AlwaysON;

Note: The decoder will search the logs for operations that require special processing, such as file stream operations, file growth, etc.

③ The primary replica transmits the log block to the secondary replica via the network;

④ and ⑤

After the secondary replica receives the log block, logwiter records the log information modified by the transaction into a log buffer in memory, and then writes it to the physical log file (log solidification). In addition, if the secondary replica is in synchronization available mode, after the log is solidified, it must also feed back information to the primary replica. The primary replica can submit the transaction after receiving the message that the secondary replica has completed solidification. If the secondary replica is in asynchronous available mode or the primary replica is in asynchronous mode, the primary replica submits the transaction and Whether it has nothing to do with whether the secondary replica has completed log curing;

⑤ Redo thread reinterprets the transactions recorded in the log on the secondary replica. The redo thread will communicate with the primary replica at regular intervals to inform it of its own work progress. The primary replica can know how far the data gap between the two sides is.

For the sentence : ""With SQL Server–level HA technologies, write operations occur on the primary database and then again on all secondary databases, before the commit on the primary completes"" , it may means the first three steps of its working.

Then, why he says "Even though it is possible to stretch a cluster across multiple sites, this involves SAN replication, which means that a cluster is normally configured >within a single site."

"Even though it is possible to strech a cluster across multiple sites..."-"multiple sites" in this sentence means physical site,they are geographically dispersed.

"...this involves SAN replication, which means that a cluster is normally configured within a single site"
A multi-subnet cluster does not share data storage among all nodes, so it is necessary to perform hardware storage-level data replication between multiple subnets, so that multiple copies of available data are obtained.
So the single site means they are need data replications, and it is one logical concept.

If a cluster is scretched across multiple sites and this inolves SAN replication, then.... Is this still clustering?! I thought the point of clustering was having only one disk....

Yes.It is.

It is one cluster or not, not determined by how many disks they use, it determined by whether they can ahcieve the function of cluter, wheter it is suitable for the definition.

Quote from this doc.: sql-server-failover-clustering

SQL Server failover clusters are made of group of servers that run cluster enabled applications in a special way to minimize downtime. A failover is a process that happens if one node crashes, or becomes unavailable and the other one takes over and restarts the application automatically without human intervention

Quote from this doc.: sql-server-multi-subnet-clustering-sql-server

A SQL Server multi-subnet failover cluster is a configuration where each failover cluster node is connected to a different subnet or different set of subnets. These subnets can be in the same location or in geographically dispersed sites. Clustering across geographically dispersed sites is sometimes referred to as stretch clusters. As there is no shared storage that all the nodes can access, data should be replicated between the data storage on the multiple subnets. With data replication, there is more than one copy of the data available. Therefore, a multi-subnet failover cluster provides a disaster recovery solution in addition to high availability.

If a cluster is scretched across multiple sites and this inolves SAN replication, how come this means that a cluster is configured within a single site?!?! it sound like a contradiction... I guess that if the cluster is stretched across multiple sites then cluster is configured across multiple sites, by definition....

The single site comes from the author's understanding, I guess it is just one logical concept.

BR,
Mia

If the answer is helpful, please click "Accept Answer" and upvote it.

Answer 5

m 4,276

Hi @maca128-1653,

Is the reply helpful?

BR,
Mia

If the answer is helpful, please click "Accept Answer" and upvote it.

Share via

Stretching cluster across multiple sites doubt

5 answers

Your answer