Configure Azure Cosmos DB for NoSQL in a copy activity

This article outlines how to use the copy activity in a data pipeline to copy data from and to Azure Cosmos DB for NoSQL.

Supported configuration

For the configuration of each tab under the copy activity, see the corresponding section below.

General

Refer to the General settings guidance to configure the General settings tab.

Source

The following properties are supported for Azure Cosmos DB for NoSQL under the Source tab of a copy activity.

Screenshot showing source tab.

The following three properties are required:

  • Data store type: Select External.
  • Connection: Select an Azure Cosmos DB for NoSQL connection from the connection list. If no connection exists, then create a new Azure Cosmos DB for NoSQL connection by selecting New.
  • Container: Select the container that you want to use. Select Edit to enter the container name manually.

Under Advanced, you can specify the following fields:

  • Use query: Choose either Table or Query. The following list describes each option.

    • Table: Reads data from the container that you specified in Container.

    • Query: Specifies the Azure Cosmos DB query to read data.

      Screenshot showing query.

  • Page size: The number of documents per page of the query result. The default is -1, which means the service uses a dynamic page size of up to 1000.

  • Detect datetime: Specifies whether to detect datetime values from the string values in the documents. Allowed values are true (default) and false.

  • Preferred regions: The preferred list of regions to connect to when retrieving data from Azure Cosmos DB. Select one preferred region from the drop-down list after selecting New.

    Screenshot showing preferred regions.

  • Additional columns: Add additional data columns to store source files' relative path or static value. Expression is supported for the latter. For more information, go to Add additional columns during copy.

    Screenshot showing additional columns.
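
For readers who inspect the pipeline JSON, the source options above can be sketched as a small Python dictionary. The names pageSize, detectDatetime, preferredRegions, and additionalColumns come from the JSON script property column in the table summary. The `query` key, the flat layout, the name/value shape of an additional column, the region names, and the sample query are assumptions made for illustration, not the exact schema the service produces.

```python
import json

# Hypothetical sketch of the source options as a flat dictionary.
# pageSize, detectDatetime, preferredRegions, and additionalColumns are the
# JSON script property names from the table summary; everything else
# (the "query" key, the layout, the sample values) is an assumption.
source_settings = {
    "query": "SELECT c.id, c.name FROM c WHERE c.status = 'active'",  # sample query
    "pageSize": -1,                     # -1: service-side dynamic page size, up to 1000
    "detectDatetime": True,             # true is the documented default
    "preferredRegions": ["West US"],    # example region name
    "additionalColumns": [
        {"name": "load_source", "value": "cosmos-nosql"},  # static value column
    ],
}

# Serialize to JSON to see how the settings might look in a pipeline definition.
print(json.dumps(source_settings, indent=2))
```

Compare the printed result with the JSON that your own pipeline shows before relying on any key that is marked as an assumption.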

Destination

The following properties are supported for Azure Cosmos DB for NoSQL under the Destination tab of a copy activity.

Screenshot showing destination tab.

The following three properties are required:

  • Data store type: Select External.
  • Connection: Select an Azure Cosmos DB for NoSQL connection from the connection list.
  • Container: Select the container that you want to use. Select Edit to enter the container name manually.

Under Advanced, you can specify the following fields:

  • Write behavior: Describes how to write data to Azure Cosmos DB. You can choose Add dynamic content, Insert, or Upsert from the drop-down list.

    • Add dynamic content: Open the Add dynamic content pane. This opens the expression builder where you can build expressions from supported system variables, activity output, functions, and user-specified variables or parameters. For information about the expression language, go to Expressions and functions.

    • Insert: Choose this option to write the source data as new documents.

    • Upsert: The behavior of upsert is to replace the document if a document with the same ID already exists; otherwise, insert the document.

      Screenshot showing write behavior.

  • Write batch timeout: The wait time for the batch insert operation to complete before it times out. The allowed value is a timespan, for example 00:30:00 (30 minutes).

  • Write batch size: Specify the number of rows to write to the destination container per batch. The allowed value is an integer (number of rows). By default, the service dynamically determines the appropriate batch size based on the row size.

  • Max concurrent connections: The upper limit of concurrent connections established to the data store during the activity run. Specify a value only when you want to limit concurrent connections.

  • Disable performance metrics analytics: This setting controls the collection of metrics, such as DTU, DWU, and RU, that are used for copy performance optimization and recommendations. If you're concerned about this behavior, select this option to turn off metrics collection.
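
The difference between Insert and Upsert can be illustrated with a minimal in-memory sketch, in which a plain dictionary keyed by document ID stands in for the destination container. The upsert function follows the behavior described above. How the service handles an insert whose ID already exists is not stated here, so the rejection in `insert` is an assumption, and the sketch ignores partition keys.

```python
class DuplicateIdError(Exception):
    """Raised in this sketch when an insert targets an ID that already exists."""


def insert(container: dict, document: dict) -> None:
    # Assumption: an insert of an existing ID is rejected rather than overwritten.
    if document["id"] in container:
        raise DuplicateIdError(document["id"])
    container[document["id"]] = document


def upsert(container: dict, document: dict) -> None:
    # Replace the document if the ID already exists; otherwise insert it.
    container[document["id"]] = document


# A plain dict keyed by document ID stands in for the destination container.
container = {}
insert(container, {"id": "1", "status": "new"})
upsert(container, {"id": "1", "status": "updated"})  # replaces document "1"
upsert(container, {"id": "2", "status": "new"})      # inserts document "2"

print(container["1"]["status"])  # updated
print(sorted(container))         # ['1', '2']
```

Upsert is the safer choice when a pipeline may be rerun against the same data, because a repeated ID overwrites the earlier document instead of depending on how duplicates are handled.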

Mapping

For the Mapping tab configuration, go to Configure your mappings under mapping tab.

Settings

For the Settings tab configuration, go to Configure your other settings under settings tab.

Table summary

The following tables contain more information about the copy activity in Azure Cosmos DB for NoSQL.

Source table

| Name | Description | Value | Required | JSON script property |
|---|---|---|---|---|
| Data store type | Your data store type. | External | Yes | / |
| Connection | Your connection to the source data store. | <your connection> | Yes | connection |
| Container | The container of your source data. | <container of your source> | Yes | container, fileName |
| Use query | Choose Table or Query. | Table, Query | No | type |
| Page size | The number of documents per page of the query result. The default is -1, which means the service uses a dynamic page size of up to 1000. | <your page size> | No | pageSize |
| Detect datetime | Whether to detect datetime values from the string values in the documents. | Selected or unselected | No | detectDatetime |
| Preferred regions | The preferred list of regions to connect to when retrieving data from Azure Cosmos DB. Select one preferred region from the drop-down list after selecting New. | <your preferred regions> | No | preferredRegions |
| Additional columns | Add additional data columns to store source files' relative path or static value. Expression is supported for the latter. For more information, go to Add additional columns during copy. | <your additional columns> | No | additionalColumns |

Destination table

| Name | Description | Value | Required | JSON script property |
|---|---|---|---|---|
| Data store type | Your data store type. | External | Yes | / |
| Connection | Your connection to the destination data store. | <your connection> | Yes | connection |
| Container | The container of your destination data. | <container of your destination> | Yes | container, fileName |
| Write behavior | Describes how to write data to Azure Cosmos DB. Allowed values: insert and upsert. The behavior of upsert is to replace the document if a document with the same ID already exists; otherwise, insert the document. | Add dynamic content, Insert, Upsert | No | writeBehavior |
| Write batch timeout | The wait time for the batch insert operation to complete before it times out. The allowed value is a timespan, for example 00:30:00 (30 minutes). | timespan | No | writeBatchTimeout |
| Write batch size | The number of rows to write to the destination container per batch. The allowed value is an integer (number of rows). By default, the service dynamically determines the appropriate batch size based on the row size. | <number of rows> (integer) | No | writeBatchSize |
| Max concurrent connections | The upper limit of concurrent connections established to the data store during the activity run. Specify a value only when you want to limit concurrent connections. | <max concurrent connections> | No | maxConcurrentConnections |
| Disable performance metrics analytics | This setting controls the collection of metrics, such as DTU, DWU, and RU, that are used for copy performance optimization and recommendations. If you're concerned about this behavior, select this option to turn off metrics collection. | Selected or unselected | No | disableMetricsCollection |
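
As a rough illustration of the destination properties listed above, the following Python sketch collects them in a dictionary and checks that the write batch timeout is a valid hh:mm:ss timespan. The property names come from the JSON script property column. The flat layout, the sample values, and the `parse_timespan` helper are hypothetical and are not part of the service.

```python
import json
from datetime import timedelta


def parse_timespan(value: str) -> timedelta:
    """Parse an hh:mm:ss timespan such as "00:30:00" (hypothetical helper)."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


# Hypothetical sketch of the destination options as a flat dictionary.
# The keys are the JSON script property names from the destination table;
# the values are examples only.
destination_settings = {
    "writeBehavior": "upsert",          # allowed values: insert, upsert
    "writeBatchTimeout": "00:30:00",    # timespan: 30 minutes
    "writeBatchSize": 1000,             # example; omit to let the service decide
    "maxConcurrentConnections": 4,      # example; set only to limit connections
    "disableMetricsCollection": False,  # metrics collection stays on
}

timeout = parse_timespan(destination_settings["writeBatchTimeout"])
print(timeout.total_seconds())  # 1800.0
print(json.dumps(destination_settings, indent=2))
```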