APPLIES TO:
Azure Data Factory
Azure Synapse Analytics
Tip
Try out Data Factory in Microsoft Fabric, an all-in-one analytics solution for enterprises. Microsoft Fabric covers everything from data movement to data science, real-time analytics, business intelligence, and reporting. Learn how to start a new trial for free!
This article outlines how to use Copy Activity in Azure Data Factory and Synapse Analytics pipelines to copy data from and to Azure Cosmos DB for MongoDB. The article builds on Copy Activity, which presents a general overview of Copy Activity.
Note
This connector only supports copying data to and from Azure Cosmos DB for MongoDB. For Azure Cosmos DB for NoSQL, refer to the Azure Cosmos DB for NoSQL connector. Other API types are not currently supported.
This Azure Cosmos DB for MongoDB connector is supported for the following capabilities:
Supported capabilities | IR | Managed private endpoint |
---|---|---|
Copy activity (source/sink) | ① ② | ✓ |
① Azure integration runtime ② Self-hosted integration runtime
You can copy data from Azure Cosmos DB for MongoDB to any supported sink data store, or copy data from any supported source data store to Azure Cosmos DB for MongoDB. For a list of data stores that Copy Activity supports as sources and sinks, see Supported data stores and formats.
You can use the Azure Cosmos DB for MongoDB connector to:
To perform the Copy activity with a pipeline, you can use one of the following tools or SDKs:
Use the following steps to create a linked service to Azure Cosmos DB for MongoDB in the Azure portal UI.
Browse to the Manage tab in your Azure Data Factory or Synapse workspace and select Linked Services, then click New:
Search for Azure Cosmos DB for MongoDB and select that connector.
Configure the service details, test the connection, and create the new linked service.
The following sections provide details about properties you can use to define Data Factory entities that are specific to Azure Cosmos DB for MongoDB.
The following properties are supported for the Azure Cosmos DB for MongoDB linked service:
Property | Description | Required |
---|---|---|
type | The type property must be set to CosmosDbMongoDbApi. | Yes |
connectionString | Specify the connection string for your Azure Cosmos DB for MongoDB account. You can find it in the Azure portal -> your Azure Cosmos DB blade -> primary or secondary connection string. For server version 3.2, the string pattern is mongodb://<cosmosdb-name>:<password>@<cosmosdb-name>.documents.azure.com:10255/?ssl=true&replicaSet=globaldb . For server versions 3.6+, the string pattern is mongodb://<cosmosdb-name>:<password>@<cosmosdb-name>.mongo.cosmos.azure.com:10255/?ssl=true&replicaSet=globaldb&retrywrites=false&maxIdleTimeMS=120000&appName=@<cosmosdb-name>@ . You can also put the password in Azure Key Vault and pull the password configuration out of the connection string. Refer to Store credentials in Azure Key Vault for more details. | Yes |
database | Name of the database that you want to access. | Yes |
isServerVersionAbove32 | Specify whether the server version is above 3.2. Allowed values are true and false (default). This determines which driver the service uses. | Yes |
connectVia | The Integration Runtime to use to connect to the data store. You can use the Azure Integration Runtime or a self-hosted integration runtime (if your data store is located in a private network). If this property isn't specified, the default Azure Integration Runtime is used. | No |
Example
{
    "name": "CosmosDbMongoDBAPILinkedService",
    "properties": {
        "type": "CosmosDbMongoDbApi",
        "typeProperties": {
            "connectionString": "mongodb://<cosmosdb-name>:<password>@<cosmosdb-name>.documents.azure.com:10255/?ssl=true&replicaSet=globaldb",
            "database": "myDatabase",
            "isServerVersionAbove32": "false"
        },
        "connectVia": {
            "referenceName": "<name of Integration Runtime>",
            "type": "IntegrationRuntimeReference"
        }
    }
}
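As a rough illustration of how the two connection string patterns in the table relate to the linked service definition, the following Python sketch fills in either pattern and assembles a payload shaped like the example above. The helper names (build_connection_string, build_linked_service) and the sample account values are hypothetical; the string form of isServerVersionAbove32 simply mirrors the example, and the password is inserted as given, so copying the real string from the Azure portal remains the safer route.

```python
import json


def build_connection_string(account: str, password: str, above_32: bool) -> str:
    """Fill in one of the two connection string patterns from the property table."""
    if above_32:
        # Pattern documented for 3.6+ server versions.
        return (
            f"mongodb://{account}:{password}@{account}.mongo.cosmos.azure.com:10255/"
            "?ssl=true&replicaSet=globaldb&retrywrites=false"
            f"&maxIdleTimeMS=120000&appName=@{account}@"
        )
    # Pattern documented for the 3.2 server version.
    return (
        f"mongodb://{account}:{password}@{account}.documents.azure.com:10255/"
        "?ssl=true&replicaSet=globaldb"
    )


def build_linked_service(name: str, account: str, password: str,
                         database: str, above_32: bool) -> dict:
    """Assemble a linked service payload shaped like the example above.

    connectVia is omitted, so the default Azure Integration Runtime applies.
    """
    return {
        "name": name,
        "properties": {
            "type": "CosmosDbMongoDbApi",
            "typeProperties": {
                "connectionString": build_connection_string(account, password, above_32),
                "database": database,
                # Assumption: kept as a string to mirror the example above.
                "isServerVersionAbove32": "true" if above_32 else "false",
            },
        },
    }


# Hypothetical sample values, for illustration only.
linked_service = build_linked_service(
    "CosmosDbMongoDBAPILinkedService", "mycosmos", "<password>", "myDatabase", True
)
print(json.dumps(linked_service, indent=4))
```

Keeping the version flag and the host name in one function makes it harder for the two to drift apart, which is the main point of the sketch.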
For a full list of sections and properties that are available for defining datasets, see Datasets and linked services. The following properties are supported for the Azure Cosmos DB for MongoDB dataset:
Property | Description | Required |
---|---|---|
type | The type property of the dataset must be set to CosmosDbMongoDbApiCollection. | Yes |
collectionName | The name of the Azure Cosmos DB collection. | Yes |
Example
{
    "name": "CosmosDbMongoDBAPIDataset",
    "properties": {
        "type": "CosmosDbMongoDbApiCollection",
        "typeProperties": {
            "collectionName": "<collection name>"
        },
        "schema": [],
        "linkedServiceName": {
            "referenceName": "<Azure Cosmos DB for MongoDB linked service name>",
            "type": "LinkedServiceReference"
        }
    }
}
This section provides a list of properties that the Azure Cosmos DB for MongoDB source and sink support.
For a full list of sections and properties that are available for defining activities, see Pipelines.
The following properties are supported in the Copy Activity source section:
Property | Description | Required |
---|---|---|
type | The type property of the copy activity source must be set to CosmosDbMongoDbApiSource. | Yes |
filter | Specifies selection filter using query operators. To return all documents in a collection, omit this parameter or pass an empty document ({}). | No |
cursorMethods.project | Specifies the fields to return in the documents for projection. To return all fields in the matching documents, omit this parameter. | No |
cursorMethods.sort | Specifies the order in which the query returns matching documents. Refer to cursor.sort(). | No |
cursorMethods.limit | Specifies the maximum number of documents the server returns. Refer to cursor.limit(). | No |
cursorMethods.skip | Specifies the number of documents to skip and from where MongoDB begins to return results. Refer to cursor.skip(). | No |
batchSize | Specifies the number of documents to return in each batch of the response from the MongoDB instance. In most cases, modifying the batch size will not affect the user or the application. Azure Cosmos DB limits each batch to 40 MB, measured as the combined size of the batchSize documents in it, so decrease this value if your documents are large. | No (the default is 100) |
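The batchSize guidance above comes down to simple arithmetic: the number of documents per batch times the document size has to stay under the 40 MB limit. The Python sketch below estimates a value from the size of the largest expected document. The function name max_batch_size is hypothetical, and treating 40 MB as 40,000,000 bytes is a conservative assumption, since the source does not say whether the limit is decimal or binary.

```python
def max_batch_size(largest_doc_bytes: int,
                   limit_bytes: int = 40_000_000,
                   default_batch_size: int = 100) -> int:
    """Estimate a batchSize that keeps one batch under the per-batch size limit.

    Assumption: limit_bytes treats the documented 40 MB as 40,000,000 bytes.
    The result never exceeds the documented default of 100, so this helper
    only ever suggests lowering batchSize, never raising it.
    """
    if largest_doc_bytes <= 0:
        raise ValueError("largest_doc_bytes must be positive")
    fits = limit_bytes // largest_doc_bytes
    # Always return at least 1 so the copy can still make progress.
    return max(1, min(default_batch_size, fits))


# Documents of about 1 MB each: only 40 fit in a batch, so lower batchSize.
print(max_batch_size(1_000_000))
# Documents of about 2 KB each: the default of 100 is already well within the limit.
print(max_batch_size(2_000))
```

Sizing against the largest document rather than the average is deliberate here: a batch made up of unusually large documents is the case that would hit the limit.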
Tip
ADF supports consuming BSON documents in Strict mode. Make sure your filter query is in Strict mode instead of Shell mode. For details, see the MongoDB manual.
Example
"activities":[
    {
        "name": "CopyFromCosmosDBMongoDBAPI",
        "type": "Copy",
        "inputs": [
            {
                "referenceName": "<Azure Cosmos DB for MongoDB input dataset name>",
                "type": "DatasetReference"
            }
        ],
        "outputs": [
            {
                "referenceName": "<output dataset name>",
                "type": "DatasetReference"
            }
        ],
        "typeProperties": {
            "source": {
                "type": "CosmosDbMongoDbApiSource",
                "filter": "{datetimeData: {$gte: ISODate(\"2018-12-11T00:00:00.000Z\"),$lt: ISODate(\"2018-12-12T00:00:00.000Z\")}, _id: ObjectId(\"5acd7c3d0000000000000000\") }",
                "cursorMethods": {
                    "project": "{ _id : 1, name : 1, age: 1, datetimeData: 1 }",
                    "sort": "{ age : 1 }",
                    "skip": 3,
                    "limit": 3
                }
            },
            "sink": {
                "type": "<sink type>"
            }
        }
    }
]
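The filter in the example above is written with shell-style helpers (ISODate, ObjectId), while the tip recommends Strict mode. As a hedged sketch, the same filter can be expressed with the MongoDB Extended JSON strict-mode wrappers $date and $oid and serialized with Python's json module. The helper names strict_date and strict_oid are hypothetical, and whether the service accepts this exact string is an assumption to verify against your own pipeline.

```python
import json
import re


def strict_date(iso_timestamp: str) -> dict:
    """Wrap an ISO-8601 timestamp in the strict-mode $date form."""
    return {"$date": iso_timestamp}


def strict_oid(hex_id: str) -> dict:
    """Wrap a 24-character hex string in the strict-mode $oid form."""
    if not re.fullmatch(r"[0-9a-fA-F]{24}", hex_id):
        raise ValueError("an ObjectId must be 24 hexadecimal characters")
    return {"$oid": hex_id}


# Same conditions as the filter in the example above, in strict mode.
filter_doc = {
    "datetimeData": {
        "$gte": strict_date("2018-12-11T00:00:00.000Z"),
        "$lt": strict_date("2018-12-12T00:00:00.000Z"),
    },
    "_id": strict_oid("5acd7c3d0000000000000000"),
}

# Source section of the copy activity; filter, project, and sort are passed as strings.
source = {
    "type": "CosmosDbMongoDbApiSource",
    "filter": json.dumps(filter_doc),
    "cursorMethods": {
        "project": json.dumps({"_id": 1, "name": 1, "age": 1, "datetimeData": 1}),
        "sort": json.dumps({"age": 1}),
        "skip": 3,
        "limit": 3,
    },
}

print(json.dumps(source, indent=4))
```

Building the filter as a dictionary and serializing it last avoids hand-escaping quotes inside the JSON string.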
The following properties are supported in the Copy Activity sink section:
Property | Description | Required |
---|---|---|
type | The type property of the Copy Activity sink must be set to CosmosDbMongoDbApiSink. | Yes |
writeBehavior | Describes how to write data to Azure Cosmos DB. Allowed values: insert and upsert. The behavior of upsert is to replace the document if a document with the same _id already exists; otherwise, insert the document. Note: The service automatically generates an _id for a document if an _id isn't specified either in the original document or by column mapping. This means that, for upsert to work as expected, you must ensure your document has an ID. | No (the default is insert) |
writeBatchSize | The writeBatchSize property controls the size of documents to write in each batch. You can try increasing the value of writeBatchSize to improve performance, or decreasing it if your documents are large. | No (the default is 10,000) |
writeBatchTimeout | The wait time for the batch insert operation to finish before it times out. The allowed value is a timespan. | No (the default is 00:30:00 - 30 minutes) |
Tip
To import JSON documents as-is, refer to Import or export JSON documents section; to copy from tabular-shaped data, refer to Schema mapping.
Example
"activities":[
    {
        "name": "CopyToCosmosDBMongoDBAPI",
        "type": "Copy",
        "inputs": [
            {
                "referenceName": "<input dataset name>",
                "type": "DatasetReference"
            }
        ],
        "outputs": [
            {
                "referenceName": "<Document DB output dataset name>",
                "type": "DatasetReference"
            }
        ],
        "typeProperties": {
            "source": {
                "type": "<source type>"
            },
            "sink": {
                "type": "CosmosDbMongoDbApiSink",
                "writeBehavior": "upsert"
            }
        }
    }
]
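The sink example above sets only writeBehavior. To show how the remaining sink properties from the table fit together, here is a minimal Python sketch that assembles a sink section with the documented defaults spelled out. The function name build_sink is hypothetical, and the validation it performs is an illustration rather than a description of what the service checks.

```python
import json


def build_sink(write_behavior: str = "insert",
               write_batch_size: int = 10000,
               write_batch_timeout: str = "00:30:00") -> dict:
    """Assemble a copy activity sink section using the documented defaults."""
    # The table above lists insert and upsert as the only allowed values.
    if write_behavior not in ("insert", "upsert"):
        raise ValueError("writeBehavior must be 'insert' or 'upsert'")
    if write_batch_size <= 0:
        raise ValueError("writeBatchSize must be positive")
    return {
        "type": "CosmosDbMongoDbApiSink",
        "writeBehavior": write_behavior,
        "writeBatchSize": write_batch_size,
        # Timespan string; 00:30:00 is the documented 30-minute default.
        "writeBatchTimeout": write_batch_timeout,
    }


# Upsert with a smaller batch, as the table suggests when documents are large.
print(json.dumps(build_sink("upsert", write_batch_size=1000), indent=4))
```

Remember that upsert only replaces existing documents when the incoming document carries an _id, as the note in the table explains.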
You can use this Azure Cosmos DB connector to easily:
To achieve schema-agnostic copy:
To copy data from Azure Cosmos DB for MongoDB to a tabular sink or vice versa, refer to schema mapping.
Specifically for writing into Azure Cosmos DB, make sure you populate Azure Cosmos DB with the right object ID from your source data. For example, if you have an "id" column in a SQL database table and want to use its value as the document ID in MongoDB for insert/upsert, you need to map that column to _id.$oid in the schema mapping, following the MongoDB strict mode definition.
After the copy activity runs, the following BSON ObjectId is generated in the sink:
{
    "_id": ObjectId("592e07800000000000000000")
}
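To make the _id.$oid target more concrete, the Python sketch below shows the strict-mode document shape that such a mapping aims for, starting from a tabular row with an "id" column. The function name to_strict_document and the sample row are hypothetical; this illustrates the document shape only and is not a description of how the service performs the mapping internally.

```python
import json
import re


def to_strict_document(row: dict, id_column: str = "id") -> dict:
    """Move a row's id column into the strict-mode _id.$oid position."""
    value = str(row[id_column])
    # An ObjectId is written as 24 hexadecimal characters.
    if not re.fullmatch(r"[0-9a-fA-F]{24}", value):
        raise ValueError("id value is not a 24-character hex ObjectId")
    document = {key: val for key, val in row.items() if key != id_column}
    document["_id"] = {"$oid": value}
    return document


# Hypothetical source row; the id matches the ObjectId shown above.
row = {"id": "592e07800000000000000000", "name": "sample"}
print(json.dumps(to_strict_document(row), indent=4))
```

The strict-mode form {"$oid": "..."} corresponds to the ObjectId("...") shell notation shown above.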
For a list of data stores that Copy Activity supports as sources and sinks, see supported data stores.