Core component of SQL Server for storing, processing, and securing data
Database replication in SQL Server is a set of technologies for copying and distributing data and database objects from one database to another and then synchronizing them to maintain consistency.
Key concepts
- Purpose of replication
- Distribute data to different locations (branch offices, remote sites, mobile users).
- Improve scalability and availability by offloading reporting or read workloads.
- Integrate data from multiple sites or heterogeneous systems.
- Support occasionally connected or low-bandwidth scenarios.
- Core roles
- Publisher: The database that makes data available for replication.
- Distributor: The server/database that stores replication metadata and history, and manages the flow of data from Publisher to Subscribers.
- Subscriber: The database that receives replicated data.
- Main replication types
Transactional replication
- Typical use: Server-to-server scenarios that require high throughput and near real-time data movement.
- Data flow: Primarily one-way (Publisher → Subscriber).
- Common scenarios:
- Scale out read workloads (reporting subscribers).
- Offload batch processing.
- Data warehousing and reporting.
- Integrating data from multiple sites.
- Behavior:
- Changes at the Publisher are captured from the transaction log and applied to Subscribers in order.
- Subscribers are usually read-only in classic configurations.
Merge replication
- Typical use: Mobile or distributed applications where Subscribers can be offline and make changes independently.
- Data flow: Bi-directional; both Publisher and Subscribers can change data.
- Common scenarios:
- Mobile users synchronizing when connected.
- Point-of-sale (POS) systems.
- Multi-site environments with possible data conflicts.
- Behavior:
- Each node tracks changes.
- During synchronization, changes are merged and conflicts are detected and resolved according to defined rules.
Snapshot replication
- Typical use: When data changes infrequently, or when a complete refresh is acceptable.
- Data flow: Periodic full copies of data.
- Common scenarios:
- Initial data load for transactional or merge replication.
- Small reference tables that can be fully refreshed.
- Behavior:
- Takes a point-in-time snapshot of data and schema and applies it to Subscribers.
- Bi-directional and multi-site considerations
- Standard transactional replication is one-way. For true bi-directional updates between writable nodes, designs such as bi-directional transactional replication or merge replication are used.
- Bi-directional or multi-master designs require careful planning for:
- Conflict detection and resolution.
- Identity range management.
- Application logic that can tolerate eventual consistency.
- Replication with other SQL Server features
Replication and database mirroring
- Mirroring keeps a second copy of a database by replaying transaction logs from the principal to the mirror.
- Replication can be combined with mirroring for higher availability of a publication database, with these key points:
- The principal and mirror must share a Distributor; a remote Distributor is recommended for better fault tolerance.
- Supported scenarios:
- Mirroring the publication database for merge replication.
- Mirroring the publication database for transactional replication with read-only Subscribers or queued updating Subscribers.
- Not supported:
- Immediate updating Subscribers.
- Oracle Publishers.
- Peer-to-peer topologies.
- Republishing.
- The distribution database itself cannot be mirrored.
- Objects outside the database (logins, jobs, linked servers, etc.) are not mirrored and must be copied manually if needed on the mirror.
- Replication security
- Replication uses SQL Server security principals and roles to control access.
- Typical tasks include:
- Viewing and modifying replication security settings.
- Managing logins in the Publication Access List (PAL) to control who can subscribe.
- Configuration Manager database replication (different but related concept)
- In Microsoft Configuration Manager (ConfigMgr), “database replication” refers to how site databases share data using SQL Server features:
- Uses SQL Server change tracking and SQL Server Service Broker (SSB) over TCP port 4022 by default.
- Automatically established between sites in a hierarchy (e.g., CAS and primary sites).
- Data is grouped into replication groups with different schedules and priorities.
- Data types:
- Global data: Admin-created objects (deployments, updates, collections, RBAC scopes) replicated to all sites.
- Site data: Operational data (hardware inventory, status messages, alerts, collection results) replicated to the CAS.
- Local data: Stays at the site; not replicated.
- Controls and tuning:
- Database replication links: Configure schedules, thresholds, and whether some data is accessed via distributed views instead of being replicated.
- Site database replication controls: SSB port, reinitialization thresholds, compression of replicated data.
- Database replication thresholds: Define when a link is reported as degraded or failed based on consecutive failed attempts per replication group.
- Monitoring: Use the Database Replication node and Replication Link Analyzer to view status and troubleshoot.
- When to choose replication vs other technologies
- Use SQL Server replication when:
- There is a need to selectively replicate tables or subsets of data.
- There is a requirement for disconnected or occasionally connected Subscribers.
- There is a need for multi-site data distribution with flexible topologies.
- Consider other technologies (e.g., Always On Availability Groups, log shipping, mirroring) when:
- The primary goal is database-level high availability and disaster recovery rather than data distribution.
References: