Elastic tables for developers
Dataverse elastic tables are powered by Azure Cosmos DB. They automatically scale horizontally to handle large amounts of data and high levels of throughput with low latency. Elastic tables are suitable for applications that have unpredictable, spiky, or rapidly growing workloads.
This article focuses on information that developers need to know about using elastic tables. For more information about the capabilities of elastic tables and what is supported, go to Create and edit elastic tables.
When to use elastic tables
The specific requirements of your data and your application determine whether you should use elastic tables or standard tables.
Use elastic tables in these situations:
- Your data might be unstructured or semi-structured, or your data model might constantly be changing.
- You need horizontal scaling to handle workload growth over time or bursty workload at a given point.
- You must handle a high volume of read and write requests.
Use standard tables in these situations:
- Your application requires strong data consistency.
- Your application requires relational modeling and needs transactional capability across tables or during plug-in execution.
- Your application requires complex joins.
A combination of elastic and standard tables might be appropriate for your data and your application.
For more information about differences in the modeling of elastic tables, go to:
Partitioning and horizontal scaling
Elastic tables use Azure Cosmos DB partitioning to scale individual tables to meet the performance requirements of your application. All elastic tables contain a system-defined Partition Id string column. This column has the schema name PartitionId
and the logical name partitionid
.
Azure Cosmos DB ensures that the rows in a table are divided into distinct subsets, based on the value of the partitionid
column for each row. These subsets are called logical partitions.
Important
To get the best performance that is available with elastic tables, you must choose and consistently apply a partitioning strategy. If you don't set a partitionid
value for each row, the value will remain null, and you won't be able to change it later.
If you use a custom partitionid
value, the primary key value doesn't have a unique constraint. In other words, you can create multiple records that have the same primary key but different partitionid
values. It's important to understand that unique references for elastic tables are a combination of the primary key and the partitionid
value.
Choosing a partitionid value
The partitionid
value that you should use depends on the nature of your data. A logical partition in an elastic table consists of a set of rows that have the same partitionid
value. For example, in a table that contains data about different products, you can use the product category as the partitionid
value for the table. In this case, groups of items that have specific values for the product category, such as Clothing
, Books
, Electronic Appliances
, and Pet supplies
, form distinct logical partitions.
Dataverse transparently and automatically manages logical partitions that are associated with a table. There is no limit on the number of logical partitions that you can have in a table. In addition, there is no risk that a logical partition will be deleted if its underlying rows are deleted.
For all elastic tables, the partitionid
column should meet these criteria:
- The values in it don't change. After a row is created that has a
partitionid
value, you can't change it. - It has a high cardinality value. In other words, the property should have a wide range of possible values. Each logical partition can store 20 gigabytes (GB) of data. Therefore, by choosing a
partitionid
value that has a wide range of possible values, you ensure that the table can scale without reaching limits for any specific logical partition. - It spreads data as evenly as possible among all logical partitions.
- No values are larger than 1,024 bytes.
- No values contain slashes (/), angle brackets (<, >), asterisks (*), percent signs (%), ampersands (&), colons (:), backslashes (\), question marks (?), or plus signs (+). These characters aren't supported for alternate keys.
If a partitionid
value isn't specified for a row, Dataverse uses the primary key value as the default partitionid
value. For write-heavy tables of any size, or for cases where rows are mostly retrieved by using the primary key, the primary key is a great choice for the partitionid
column.
Consistency level
Elastic table supports strong consistency during a logical session. A logical session is a connection between a client and Dataverse. When a client performs a write operation on an elastic table, it receives a session token that uniquely identifies the logical session. To have strong consistency, you must maintain the logical session context by including the session token with all subsequent requests.
Session tokens ensure that all the read operations that are performed during the same logical session context return the most recent write that was made during that logical session. In other words, session tokens ensure that reads always honor the read-your-writes and write-follows-reads guarantees during a logical session. If a different logical session performs a write operation, other logical sessions might not immediately detect those changes.
The session token is available as an x-ms-session-token
value in the response of all write operations. To retrieve the most up-to-date row, you must include this value when you retrieve data.
- If you're using the SDK, use the
SessionToken
optional parameter. - If you're using Web API, use the
MSCRM.SessionToken
request header.
If you retrieve a record without a session token, the recently applied changes might not be applied. Instead, they might be returned in subsequent requests.
Learn more about using the session token.
Transactional behavior
Elastic tables don't support multi-record transactions. For a single request execution, multiple write operations that occur during the same synchronous plug-in stage or during different synchronous plug-in stages aren't transactional with each other.
For example, you have a synchronous plug-in step that is registered on the PostOperation
stage of the Create
message on an elastic table. In this case, an error that occurs in the plug-in does not roll back the record that is created in Dataverse. You should always avoid intentionally canceling any operation by throwing InvalidPluginExecutionException in the PreOperation
or PostOperation
synchronous stage. If the error is thrown after the Main
operation, the request returns an error, but the data operation succeeds. Any write operations that are started in the PreOperation
stage succeed.
However, you should always apply validation rules in a plug-in that is registered for the PreValidation
synchronous stage. Validation is the purpose of this stage. Even when you use elastic tables, the request returns an error, and the data operation won't begin. Learn more about the event execution pipeline.
Elastic tables also don't support grouping requests in a single database transaction that uses the SDK ExecuteTransactionRequest class or in a Web API $batch
operation changeset. Currently, these operations succeed but aren't atomic. In the future, an error will be thrown.
Elastic tables don't support deep insert as standard tables do. You will get this error: Cannot create related entities. Create has to be called individually for each entity.
To learn more about multi-record transactions, go to:
- Execute messages in a single database transaction
- Change sets
- Web API deep insert
- SDK for .NET deep insert
- Known issue: No error is returned when elastic table data operations are grouped in a transaction
Expire data by using Time to live
Dataverse automatically includes a Time to live integer column with elastic tables. This column has the schema name TTLInSeconds
and the logical name ttlinseconds
.
When a value is set in this column, it defines the amount of time, in seconds, before the row expires and is automatically deleted from database. If no value is set, the record persists indefinitely, just like standard tables.
Scenario
The examples in related articles use this scenario.
Contoso operates a large number of Internet of Things (IoT) devices that the company has deployed all across the world. Contoso must store and query large amounts of sensor data that is emitted from the IoT devices so that it can monitor their health and gather other insights.
To store and query the large volume of IoT data, Contoso creates an elastic table that is named contoso_SensorData
. It uses a string column that is named contoso_DeviceId
as the partitionid
value for each row that corresponds to a device. Because each contoso_DeviceId
value is unique to a device, and Contoso performs queries mostly in context of a given contoso_DeviceId
value, it serves as a good partition strategy for the whole dataset.
Related articles:
- Create elastic tables using code
- Use elastic tables using code
- Query JSON columns in elastic tables
- Elastic table sample code
Known issues
The following known issues with elastic tables should be addressed before this feature becomes generally available.
No error is returned when elastic table data operations are grouped in a transaction
Dataverse doesn't return an error when you group data operations by using the SDK ExecuteTransactionRequest class or a Web API $batch
operation changeset. Although these data operations are completed, no transaction is applied. Because no transaction can be applied, these operations should fail and return an error.
No x-ms-session-token value is returned for delete operations
Dataverse doesn't return the x-ms-session-token
value for delete operations. Learn more about how this value is used to managed data consistency.
The partitionId optional parameter isn't available for all messages
Whenever a record that uses a custom partitionid
value must be identified, such as for Retrieve
, Update
, or Upsert
operations, you need a way to reference the partitionid
value. In this case, you can use the alternate key to reference the record. If you prefer, you should also be able to use the partitionId
optional parameter style. Currently, only Retrieve
and Delete
operations support using the partitionId
optional parameter. Learn more about using the partitionId parameter.
Records synchronized with Synapse Link for Dataverse not deleted from data lake when TTL expires
When time to live (TTL) is used on a row, the row will get deleted from the elastic table when TTL has expired. If it's synchronized to a data lake using Synapse Link for Dataverse before TTL expiry, it won't be deleted from the data lake.
Frequently asked questions
This section will include any frequently asked questions (FAQ). If you have a question that isn't addressed in the documentation, select the This Page button in the Feedback section at the bottom of the page. You must have a GitHub account to submit questions in this way.
Next steps
See also
Use elastic tables using code
Query JSON columns in elastic tables
Bulk operation messages
Elastic table sample code
Partitioning and horizontal scaling in Azure Cosmos DB