Microsoft Windows: Make the Move to DFS
If you’re still using an older file and folder replication solution, it’s high time you moved over to the Distributed File System.
The Distributed File System (DFS) has been around since the days of Windows NT. It comes in a variety of configurations and options and is available in standalone and domain-based configurations. DFS is a popular and effective technology that provides redundant file and folder replication between remote servers. You can organize shares under a common namespace so users can connect without needing the name of the server that hosts the DFS share.
Unfortunately, there has never been a comprehensive DFS best practices document. Here’s a summary of the best practices used, learned and recommended over the years. New information is regularly posted to the Microsoft Web site, so check there for updates.
The terms DFS and Distributed File System refer to the legacy namespace product available in Windows 2000, Windows 2003 and Windows 2003 R2, and as a legacy product in Windows 2008. DFS used the problematic File Replication Service (FRS) for the replication engine.
With Windows 2003 R2, Microsoft introduced a new DFS namespace product along with a much-improved replication engine. Here the term “legacy DFS” refers to the older DFS product in Windows 2000 Server, Windows Server 2003 and Windows Server 2008. The new DFS namespace is called DFS-N, and the new replication engine is called DFS-Replication (DFS-R).
Legacy Windows Server 2003 DFS/FRS
The legacy DFS in Windows 2000 and Windows 2003 used a cumbersome and confusing administrator’s console and terminology. The FRS was also problematic. Windows 2003 attempted to mitigate some of those issues, but was unable to actually fix them. So Microsoft delivered a completely new replication engine, DFS-R, for Windows 2003 R2 and Windows 2008.
With Windows 2003 now out of mainstream support by Microsoft, you really need to migrate to the new DFS/DFS-R available in Windows 2003 R2 and Windows 2008, if you haven’t already. Here are some potential problems and best practices associated with the legacy DFS and FRS.
FRS detects changes via the New Technology File System (NTFS) journal. This is modified when a change is made to a file or folder in the file system. Unfortunately, FRS can’t detect whether or not that change requires replication.
Applications that scan files, like antivirus and disk defragmentation tools, typically modify the security descriptor of the files. This triggers a change in the NTFS journal, which in turn triggers FRS to replicate the files even though their contents haven’t changed. Updates made to FRS in Windows 2003 minimized those problems, but did not fix them. They included:
- Suppressing excessive replication: When FRS determines that certain files are being replicated frequently, it logs an event and suppresses replication for those files. This prevents the staging areas from filling up and stopping FRS, but you could unwittingly delete valid files.
- FRS over-filling staging area: When the staging area gets to 90 percent full, old files are deleted until the directory is only 60 percent full. While this prevents FRS shutdown, it might delete needed updates.
- Preventing seeding data: FRS can make it impossible to proactively seed data on multiple servers to avoid replicating large amounts of data over the WAN. The workaround is to copy small amounts of data until you’ve copied it all.
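The staging-area cleanup described above behaves like a high/low watermark policy. The following is a minimal Python sketch of that idea, not FRS code: the function name, the tuple layout and the choice of last-access time to decide which files count as "old" are all assumptions made for illustration.

```python
from typing import List, Tuple

def trim_staging(files: List[Tuple[str, int, float]], capacity: int,
                 high: float = 0.90, low: float = 0.60) -> List[str]:
    """Return the names of staged files a watermark cleanup would delete.

    files holds (name, size_bytes, last_access_time) tuples. This shape is
    hypothetical and is not an FRS data structure.
    """
    used = sum(size for _, size, _ in files)
    removed: List[str] = []
    if used < high * capacity:
        return removed  # below the 90 percent mark: nothing is deleted
    # Delete the oldest files first until usage drops to 60 percent or less.
    for name, size, _ in sorted(files, key=lambda f: f[2]):
        if used <= low * capacity:
            break
        used -= size
        removed.append(name)
    return removed
```

Note that the cleanup has no notion of whether a staged file has already reached every partner, which is how a needed update could be lost.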
The best practices for managing and using legacy DFS/FRS revolve around the central concept that keeping dynamically changing data on DFS shares is inherently a bad idea. FRS is easily overwhelmed with large numbers of files. It also has a hard time replicating frequently changing data. For example, you shouldn’t use it to house My Documents for user profiles.
Other best practices for handling legacy DFS/FRS include:
- When first populating DFS shares for a series of target servers, seed the data on a single share and let it replicate. Do this in small quantities. Adding large numbers of files in multiple shares at the same time will make it difficult for FRS to catch up. If the data exists on multiple DFS servers, add and replicate data from one server at a time. After the initial seeding, FRS only has to replicate changes.
- Ensure that your antivirus, defragmentation and other programs that scan files and folders are “FRS-aware.” Most well-known programs have this feature, which prevents unnecessary replication of files due to scanning.
- Create multiple root targets on multiple machines for data redundancy. Root targets contain configuration data.
- Provide data redundancy by creating multiple targets for DFS links. This ensures the same data continuously replicates to multiple targets. If one server is down, users will be directed to another. DFS uses the site awareness of Active Directory to direct clients to the DFS servers closest to them.
- DFS data replication isn’t required, but is recommended for data redundancy. Without replication, DFS provides only a common namespace for the shares.
- Do not host DFS links or root targets on domain controllers (DCs). SYSVOL uses the DFS service on DCs and you can’t disable it there, so it’s easier to isolate replication issues if SYSVOL and DFS shares aren’t on the same server.
- Configure one-way FRS replication between link targets in a hub-and-spoke configuration for controlling and managing data. Data created on spoke targets won’t replicate to the hub.
FRS and Legacy DFS Limitations
FRS replicates the entire file, even if only a few bytes have changed. There’s an approximate limit of 65GB per share that DFS/FRS can effectively replicate. Exceeding this limit will result in inconsistency and poor performance. Other noted limitations include:
- You can have only one DFS root per server on Windows Server 2003 Standard Edition. There’s no limit with the Enterprise version. DFS service startup time increases with the number of DFS roots.
- There’s a limit of 5,000 links per domain-based DFS namespace. More links will cause performance degradation when you make changes to the DFS configuration.
- There’s a limit of 260 characters in the DFS path. Exceeding this will prevent applications from accessing DFS data. You can access data by explicitly mapping to a drive letter.
- You can’t configure domain-based DFS on clustered nodes. Use standalone DFS only.
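The numeric limits above lend themselves to a simple pre-deployment check. Here is a minimal Python sketch under stated assumptions: the function name and its inputs are hypothetical, the caller has to supply the path list, link count and share size from its own inventory, and the 65GB figure is the approximate guideline quoted above rather than a hard cap.

```python
from typing import Iterable, List

MAX_DFS_PATH_CHARS = 260               # DFS path length limit
MAX_LINKS_PER_NAMESPACE = 5000         # links per domain-based namespace
APPROX_MAX_SHARE_BYTES = 65 * 1024**3  # approximate 65GB per share

def legacy_dfs_warnings(paths: Iterable[str], link_count: int,
                        share_bytes: int) -> List[str]:
    """Return one warning string per legacy DFS/FRS limit that is exceeded."""
    warnings: List[str] = []
    for path in paths:
        if len(path) > MAX_DFS_PATH_CHARS:
            warnings.append(
                f"path is {len(path)} characters "
                f"(limit {MAX_DFS_PATH_CHARS}): {path[:40]}...")
    if link_count > MAX_LINKS_PER_NAMESPACE:
        warnings.append(
            f"{link_count} links exceeds {MAX_LINKS_PER_NAMESPACE} per namespace")
    if share_bytes > APPROX_MAX_SHARE_BYTES:
        warnings.append("share is larger than the approximate 65GB FRS guideline")
    return warnings
```

A call such as `legacy_dfs_warnings(all_paths, 1200, 40 * 1024**3)` returns an empty list when nothing is over the limits.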
For multiple-domain DFS configurations:
- Root targets for a domain-based DFS root must be in the same domain. However, link targets can exist in other domains.
- Clients can access DFS servers in trusted domains.
- When accessing link targets in other domains from the client, use Fully Qualified Domain Names (FQDNs) for link targets (see Microsoft Knowledge Base article 244380 for more information).
- FRS can be used to replicate on a DFS link whose targets are in different (trusted) domains. (This requires enterprise admin rights.)
For further reference, see the DFS FAQ.
Windows Server 2003 R2 and Windows Server 2008 DFS-N and DFS-R
The new DFS-N and DFS-R in Windows 2003 R2, Windows Server 2008 and Windows Server 2008 R2 have significant improvements over the legacy DFS and FRS products. DFS-R replicates on a block-level basis, only replicating changes made to a file, rather than the whole file.
For example, if you changed a title on a PowerPoint slide in a 3MB file, FRS under legacy DFS would replicate the entire 3MB file. DFS-R replicates only the changed portion. This makes a huge difference for both network and disk performance. It also improves how quickly users see changes replicated. DFS-R can handle large amounts of data and efficiently replicate data that changes frequently.
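To make the difference concrete, here is a minimal Python sketch that compares two versions of a file in fixed-size blocks and counts how many bytes would need to be sent. It illustrates the general block-level idea only and is not DFS-R's actual algorithm. The function names and the 4KB block size are assumptions, and a fixed-block comparison like this one handles in-place edits well but not insertions that shift the rest of the file.

```python
import hashlib
from typing import List

def changed_blocks(old: bytes, new: bytes, block_size: int = 4096) -> List[int]:
    """Return indices of fixed-size blocks in `new` that differ from `old`."""
    def digests(data: bytes) -> List[bytes]:
        return [hashlib.sha256(data[i:i + block_size]).digest()
                for i in range(0, len(data), block_size)]
    old_d, new_d = digests(old), digests(new)
    # A block counts as changed if it is new or its hash no longer matches.
    return [i for i, d in enumerate(new_d)
            if i >= len(old_d) or d != old_d[i]]

def bytes_to_send(old: bytes, new: bytes, block_size: int = 4096) -> int:
    """Total size of the changed blocks (illustrative, ignores protocol overhead)."""
    return sum(len(new[i * block_size:(i + 1) * block_size])
               for i in changed_blocks(old, new, block_size))
```

With a 3MB file in which ten bytes are overwritten inside the first block, this sketch sends a single 4KB block, whereas a whole-file approach sends all 3MB.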
DFS-R is available only in Windows Server 2003 R2 and later. In Windows Server 2003 R2 you can use it only to replicate DFS data, but in Windows Server 2008 and Windows Server 2008 R2 you can replicate both DFS and SYSVOL data. To use DFS-R for DFS data replication, only your DFS servers must be running Windows Server 2003 R2, Windows Server 2008 or Windows Server 2008 R2. You won’t have to upgrade the DCs.
Installing the new DFS/DFS-R in a Windows Server 2003 domain will require a schema change (see the list of DFS-R FAQs for more details):
- The schema change required to install the new DFS/DFS-R in a Windows Server 2003 domain will likely require some level of approval, so plan ahead.
- You can effectively use replication groups to replicate data from branch sites to file servers in the hub site, where you can easily store it on large SAN disks. In this type of scenario, make sure the new data is only added at the remote site. If an existing file is modified at the core (hub) site, it will replicate back to the remote sites and overwrite the file there.
- Take advantage of DFS-R for SYSVOL replication in Windows Server 2008 and Windows Server 2008 R2, especially in large domains with numerous Group Policies deployed. This requires migration, as FRS is the default replication engine for Windows Server 2008 domains.
- Refer to the TechNet blog post by the Microsoft Directory Services team, “DFS-R SYSVOL Migration FAQ,” for instructions and tips on migrating SYSVOL to DFS-R.
Apply hotfixes 972105, 969688, 978326, 959114 and 978994 prior to SYSVOL migration to DFS-R. Then proceed as follows:
- Migrate legacy DFS shares to DFS-N and DFS-R as Windows Server 2008 R2 begins to deprecate legacy DFS and FRS. Both will eventually go away.
- Design the replication topology for replication groups prior to deployment. There are many options for DFS-R topology that weren’t available in DFS/FRS. Ensure the replication method suits your file-deployment design.
- Monitor the state of DFS-R replication. System Center Operations Manager has a management pack for DFS replication monitoring. There may be third-party tools as well. The old Ultrasound and Sonar tools don’t work with DFS-R.
DFS-R provides more robust and efficient replication and handles dynamic data quite well, but it’s important to understand the scalability limitations for DFS-R when planning a DFS infrastructure. You can define replication groups independently of DFS namespace configuration—one is not dependent on the other. You will be, however, subject to the following limitations:
- Each server can be a member of up to 256 replication groups.
- Each replication group can contain up to 256 replicated folders.
- Each server can have up to 256 connections (for example, 128 incoming connections and 128 outgoing connections).
- On each server, the number of replication groups multiplied by the number of replicated folders multiplied by the number of simultaneously active connections must be 1,024 or fewer.
- A replication group can contain up to 256 members.
- A volume can contain up to 8 million replicated files, and a server can contain up to 1TB of replicated files.
- The maximum tested file size is 64GB.
- DFS-R can’t communicate with FRS.
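The limits above can be checked against a planned server configuration before deployment. The following Python sketch is one way to do that under stated assumptions: the `ServerPlan` class and `dfsr_violations` function are hypothetical names, and the product rule is computed from the folders-per-group figure you supply, which is one reading of "the number of replicated folders."

```python
from dataclasses import dataclass
from typing import List

MAX_GROUPS_PER_SERVER = 256
MAX_FOLDERS_PER_GROUP = 256
MAX_CONNECTIONS_PER_SERVER = 256
MAX_ACTIVE_PRODUCT = 1024          # groups x folders x active connections
MAX_FILES_PER_VOLUME = 8_000_000
MAX_BYTES_PER_SERVER = 1024**4     # 1TB of replicated files

@dataclass
class ServerPlan:
    """Planned DFS-R load for one server (hypothetical structure)."""
    replication_groups: int
    folders_per_group: int
    connections: int
    active_connections: int
    files_on_volume: int = 0
    replicated_bytes: int = 0

def dfsr_violations(plan: ServerPlan) -> List[str]:
    """Return a description of every limit the plan exceeds."""
    problems: List[str] = []
    if plan.replication_groups > MAX_GROUPS_PER_SERVER:
        problems.append("more than 256 replication groups on the server")
    if plan.folders_per_group > MAX_FOLDERS_PER_GROUP:
        problems.append("more than 256 replicated folders in a group")
    if plan.connections > MAX_CONNECTIONS_PER_SERVER:
        problems.append("more than 256 connections on the server")
    product = (plan.replication_groups * plan.folders_per_group
               * plan.active_connections)
    if product > MAX_ACTIVE_PRODUCT:
        problems.append(
            f"groups x folders x active connections = {product} "
            f"(limit {MAX_ACTIVE_PRODUCT})")
    if plan.files_on_volume > MAX_FILES_PER_VOLUME:
        problems.append("more than 8 million replicated files on the volume")
    if plan.replicated_bytes > MAX_BYTES_PER_SERVER:
        problems.append("more than 1TB of replicated files on the server")
    return problems
```

The product rule is usually the first limit a plan hits: 4 groups with 16 folders each and 16 active connections lands exactly on 1,024, so one more active connection puts the server over.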
For more details, see the Microsoft TechCenter page on this issue. There’s also an excellent list of FAQs.
Overall, the recommendation is simple: Get off of FRS—seriously. It’s old technology that Microsoft threw in the dumpster years ago. Bite the bullet and migrate all DFS shares (Windows Server 2003 R2 and newer) and SYSVOL replicas (Windows Server 2008 and newer) to DFS-R.
Take advantage of the robust performance improvements and spend your time doing more productive things. With the deprecation of legacy DFS and FRS in Windows Server 2008 R2, Microsoft is sending a message that it’s time to move to better technology. There are really no downsides.
Gary L. Olsen is a systems software engineer in the Hewlett-Packard Co. Worldwide Technical Expert Center for HP Services in Atlanta, Ga. He’s worked in the IT industry since 1981. Olsen is a Microsoft MVP for Directory Services and president of the Atlanta Active Directory Users Group. He’s the author of “Windows 2000: Active Directory Design and Deployment” (New Riders, 2000) and coauthor of “Windows 2003 on HP ProLiant Servers” (Prentice Hall, 2004).