Mirroring and Bandwidth

I just completed some testing to see how much network bandwidth between SQL nodes affects synchronous mirroring performance. The SQL mirroring white paper indicates that bandwidth does not matter much above 10Mbps, but that may lead you to the wrong conclusion and set you up for a world of hurt.

My test consisted of an automated routine that repeatedly uploads a large document to a document library and then deletes it. I can specify the amount of churn (insert/update/delete) to generate per minute of the test routine. (I'll see if I can get this on CodePlex soon.) While it does not represent a full server workload, it can create massive amounts of churn on my SQL servers, which makes it a great way to test mirroring and log shipping throughput. In my particular test, I have a single collaboration portal site that contains the document library to which I upload and then delete a 49MB PowerPoint file. I ran the test routine under the following scenarios:

  1. No mirroring
  2. Mirroring with 10Mbps Full Duplex
  3. Mirroring with 100Mbps Full Duplex
  4. Mirroring with 1000Mbps Full Duplex

For some background, my farm is configured with two SQL2K5 Enterprise servers with all the farm databases mirrored. Each SQL server has a single 2.33GHz quad-core processor and 16GB of RAM. The data files for all databases share a single RAID 5 array with 5 spindles. The log files for all databases share a single RAID 10 array with 4 spindles. I use an alias to get my SharePoint servers talking to the appropriate principal SQL server. I have two web servers and an index server. Each has a single 2.33GHz quad-core processor and 8GB of RAM. The OS and application files sit on their own RAID 1 drives with two spindles each. All timer jobs are disabled and indexing is not scheduled. The web servers connect to the SQL servers via a 100Mbps connection.

I started testing with mirroring over the 1000Mbps connection. I averaged about 345MB/min, or ~5.75MB/sec. Keep in mind that this is not the maximum throughput, only what I could push through a single web server and its 100Mbps connection. I could push additional throughput by using more web servers and/or upgrading connectivity. I have successfully pushed more than 2000MB/min in the past.

The next test I ran was mirroring over the 100Mbps connection. Interestingly enough, throughput was again 345MB/min; the difference this time is that the pipe between my SQL servers was nearly completely utilized. Where I could probably push 345MB/min through five servers simultaneously over the 1000Mbps connection, I can now only push a quarter of that. Very interesting.
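To put the 345MB/min figure next to the link speeds, here is a rough back-of-the-envelope sketch in Python. The function names are my own for illustration, and it assumes decimal units (1MB = 8Mb). It counts only the document payload, so treat the result as a lower bound: the transaction log traffic actually sent to the mirror was evidently larger, since the 100Mbps pipe was observed to be nearly saturated.

```python
def mb_per_min_to_mbps(mb_per_min: float) -> float:
    """Convert megabytes per minute to megabits per second (1 byte = 8 bits)."""
    return mb_per_min * 8 / 60


def payload_share(mb_per_min: float, link_mbps: float) -> float:
    """Fraction of a link's nominal capacity taken by the payload alone."""
    return mb_per_min_to_mbps(mb_per_min) / link_mbps


if __name__ == "__main__":
    payload = 345.0  # MB/min, the average measured in the test
    for link in (10, 100, 1000):  # Mbps, the three mirrored scenarios
        share = payload_share(payload, link)
        print(f"{link:>4} Mbps link: payload alone is {share:.1%} of capacity")
```

Even by this conservative measure, the payload is several times what a 10Mbps link can carry, while on the 1000Mbps link it is a small fraction of capacity.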

Next, I tested over the 10Mbps connection. Basically, it's bad. I could only perform a couple of tests before my mirror would fall behind. Sometimes, I couldn't even finish an upload in 5 minutes.

Finally, I tested with no mirroring. This test went as expected. I was able to generate about 380MB/min, or about 6.3MB/sec. That is roughly the same as mirroring over the 1000Mbps connection.
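For a sense of what "roughly the same" means in numbers, this small Python sketch compares the unmirrored baseline with the mirrored result. The helper names are mine, and the inputs are the rounded averages quoted in this post, so the output is only as precise as those figures.

```python
def mb_per_sec(mb_per_min: float) -> float:
    """Convert megabytes per minute to megabytes per second."""
    return mb_per_min / 60


def relative_drop(baseline: float, measured: float) -> float:
    """Fractional throughput loss of `measured` relative to `baseline`."""
    return (baseline - measured) / baseline


if __name__ == "__main__":
    no_mirroring = 380.0  # MB/min, unmirrored baseline
    mirrored = 345.0      # MB/min, mirrored over 100Mbps or 1000Mbps
    print(f"baseline: {mb_per_sec(no_mirroring):.2f} MB/sec")
    print(f"mirrored: {mb_per_sec(mirrored):.2f} MB/sec")
    print(f"drop with mirroring: {relative_drop(no_mirroring, mirrored):.1%}")
```

With these inputs, 380MB/min is about 6.3MB per second and the mirrored runs come in a little under 10% below the baseline.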

Impressions - Mirroring performance and throughput are highly dependent on the bandwidth available between SQL mirrors. If you need to optimize throughput between mirrors, deploy at least Gig/E connectivity between your SQL servers. Better yet, dedicate a Gig/E connection just for SQL traffic between nodes. You can do this by deploying a private connection between servers, configuring a specific IP for your database mirroring endpoint, and configuring name resolution for the nodes (via hosts file) to resolve to the private IP. Perhaps, after reviewing this, you are worried about mirroring between datacenters. Remember, this test shows synchronous mirroring performance only. Asynchronous mirroring is much more performant over low-bandwidth links. (Not that asynchronous mirroring is supported for SharePoint.) Also, bandwidth is getting relatively cheap. It doesn't cost too much to span a 10Gig link across 2 closely located datacenters.

Next time, I'm going to talk about the effects of latency on your throughput.