Programming language used to interact with SQL Server databases
Data partitioning in SQL Server (MSSQL) is primarily implemented through partitioned tables and indexes. It horizontally splits a large table or index into multiple partitions that can be stored on one or more filegroups, while still behaving as a single logical object to queries and updates.
Key concepts
- Partitioned tables and indexes
- Data is divided horizontally so that groups of rows are mapped into individual partitions.
- All partitions of a single table or index must reside in the same database.
- Even though data is physically split, the table or index is treated as a single logical entity for queries and updates.
- When multiple files exist in a filegroup, data within that filegroup is spread across files using the proportional fill algorithm.
- Partition function
- A partition function defines how rows are mapped to partitions based on a partitioning column.
- It specifies boundary points and a range type (RANGE LEFT or RANGE RIGHT).
- Example behavior (RANGE RIGHT):
- The leftmost partition has the minimum value of the data type as its lower limit.
- The rightmost partition has the maximum value of the data type as its upper limit.
- For a date column
datecolwith a RIGHT range, a boundary at2022-02-01means:- Partition 1:
datecol < 2022-02-01 12:00AM - Partition 2:
datecol >= 2022-02-01 12:00AMand< 2022-03-01 12:00AM
- Partition 1:
- More examples are available in
CREATE PARTITION FUNCTION.
- Partition scheme
- A partition scheme maps each partition defined by a partition function to one or more filegroups.
- It is a database object that determines where each partition physically resides.
- Reasons to use multiple filegroups in a partition scheme:
- Tiered storage: place older, less frequently accessed partitions on slower, cheaper storage; keep hot data on faster storage.
- Backup/restore flexibility: each filegroup can be backed up and restored independently, allowing:
- Skipping repeated backups of static partitions.
- Shorter restore times when only some partitions need to be restored.
- All other partitioning benefits apply regardless of whether one or multiple filegroups are used.
- Table and index metadata (SMO)
- In SQL Server Management Objects (SMO):
-
Databasecontains collections ofPartitionFunctionandPartitionSchemeobjects. - Each
TableandIndexspecifies its partition scheme via thePartitionSchemeproperty and the partitioning columns viaPartitionSchemeParameterCollection.
-
- This allows programmatic creation and management of partitioning in C#, PowerShell, etc.
- Example scenario (date-based partitioning)
- A common pattern is to partition a transactional table by date to support archiving and performance:
- Create a partition function that divides data by date ranges.
- Create a partition scheme that maps those partitions to filegroups.
- Apply the scheme to a table such as
TransactionHistory. - Older partitions can be switched out to an archive table (for example,
TransactionHistoryArchive).
- SMO examples show how to do this in Visual C# or PowerShell for the AdventureWorks sample database.
- Benefits of partitioning
Manageability
- Faster data movement and maintenance:
- Can transfer or access subsets of data quickly while maintaining integrity.
- Operations like loading data from OLTP to OLAP can be much faster when working at partition granularity.
- Targeted maintenance:
- Compress data in specific partitions.
- Rebuild indexes for one or more partitions instead of the whole index.
- Truncate data in a single partition.
- Switch partitions in and out of tables (for archiving or rolling windows).
Performance 3. Query performance improvements:
- The optimizer can use partition elimination when the query predicate includes the partitioning column.
- Equijoin queries between partitioned tables can be faster when they share the same partitioning columns.
- Concurrency and locking:
- Lock escalation can occur at the partition level instead of the table level.
- To enable this, set
LOCK_ESCALATION = AUTOon the table usingALTER TABLE. - This can reduce lock contention on large tables.
- Choosing a partitioning strategy
- Strategy depends on data distribution, growth, and query patterns.
- For ledger-like tables:
- Date-based partitioning is often used to exclude historical data from most queries.
- For tables with state columns (for example, status fields) and skewed distributions:
- Partitioning on the state field can be effective.
- The SQL optimizer relies on partition elimination, so the partition key should appear in WHERE predicates for best effect.
- Partitioning and Business Central (AL extensions)
- When using SQL Server partitioning under Business Central:
- Once an AL extension is published and synchronized, primary keys should not be changed; changing PKs is a breaking schema change.
- This limits partitioning options for Microsoft-provided extensions (system/base apps).
- Partitioning is primarily feasible for:
- New custom extensions designed with partitioning in mind.
- Existing extensions whose primary keys already include the intended partitioning fields.
- Therefore, when designing new tables, include partitioning fields in the primary key from the start if partitioning is anticipated.
- For CAL-based tables in C/SIDE (Business Central 14 and earlier), this constraint does not apply.
- Partitioning in Analysis Services (multidimensional)
- In SQL Server Analysis Services (multidimensional cubes):
- Every measure group has at least one partition.
- Additional partitions can be created to split cube data (for example, by year or quarter).
- Only the current partition (for example, current quarter) needs processing when new data arrives, improving processing performance.
- Partitions can later be merged (for example, quarterly partitions into a yearly partition).
- Partitions are not visible to end users but are managed by administrators.
- Each partition is stored in its own set of files and can reside on different drives or servers.
- Partition schemes in cubes:
- Multi-table partition scheme: each partition is based on a separate table (for example, one table per month).
- Single-table partition scheme: one table, but each partition uses a filtered query (for example, WHERE clause per month).
- Storage settings per partition control:
- Where data and aggregations are stored (local or remote Analysis Services instance).
- Storage mode and proactive caching.
- Operational considerations
- Partition count and design:
- Over-partitioning can add complexity without benefit; design partitions aligned with maintenance and query needs.
- Filegroup planning:
- Use multiple filegroups when tiered storage or filegroup-level backup/restore is required.
- Application design:
- Ensure queries include the partition key in predicates to benefit from partition elimination.
References: