Configure MongoDB Atlas in a copy activity
This article outlines how to use the copy activity in a data pipeline to copy data from and to MongoDB Atlas.
For the configuration of each tab under copy activity, go to the following sections respectively.
Refer to the General settings guidance to configure the General settings tab.
Go to Source tab to configure your copy activity source. See the following content for the detailed configuration.
The following properties are required:
- Data store type: Select External.
- Connection: Select a MongoDB Atlas connection from the connection list. If no connection exists, then create a new MongoDB Atlas connection by selecting New.
- Database: Select your database from the drop-down list.
- Collection name: Specify the name of the collection in the MongoDB Atlas database. You can select the collection from the drop-down list or select Edit to enter it manually.
Under Advanced, you can specify the following fields:
Filter: Specifies the selection filter using query operators. To return all documents in a collection, omit this parameter or pass an empty document ({}).
Cursor methods: Select + New to specify how the underlying query is executed. The available cursor methods are:
- project: Specifies the fields to return in the documents for projection. To return all fields in the matching documents, omit this parameter.
- sort: Specifies the order in which the query returns matching documents. Refer to cursor.sort().
- limit: Specifies the maximum number of documents the server returns. Refer to cursor.limit().
- skip: Specifies the number of documents to skip and from where MongoDB Atlas begins to return results. Refer to cursor.skip().
Batch size: Specifies the number of documents to return in each batch of the response from the MongoDB Atlas instance. In most cases, modifying the batch size will not affect the user or the application.
Additional columns: Add additional data columns to store source files' relative path or static value. Expression is supported for the latter.
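To make the interplay of Filter and the cursor methods concrete, here is a minimal pure-Python sketch that mimics them on an in-memory list. It is not how the connector is implemented. The helper `run_query` and the sample `orders` documents are hypothetical, the filter is simplified to exact-match equality rather than full query operators, and the projection is simplified to a list of field names. The order of operations (sort, then skip, then limit) follows MongoDB's documented cursor behavior.

```python
from typing import Any, Optional

Document = dict[str, Any]


def run_query(
    docs: list[Document],
    filter_doc: Optional[Document] = None,
    project: Optional[list[str]] = None,
    sort: Optional[list[tuple[str, int]]] = None,
    skip: int = 0,
    limit: int = 0,
) -> list[Document]:
    """Hypothetical stand-in for a filtered query with cursor methods."""
    # Filter: an omitted or empty filter ({}) matches every document.
    wanted = filter_doc or {}
    result = [d for d in docs if all(d.get(k) == v for k, v in wanted.items())]

    # sort: apply keys from last to first so the first key takes precedence
    # (Python's sort is stable). Direction 1 is ascending, -1 is descending.
    for field, direction in reversed(sort or []):
        result.sort(key=lambda d: d[field], reverse=(direction == -1))

    # skip, then limit. A limit of 0 means "no limit".
    result = result[skip:]
    if limit > 0:
        result = result[:limit]

    # project: keep only the listed fields (plus _id); omit to keep all fields.
    if project is not None:
        keep = set(project) | {"_id"}
        result = [{k: v for k, v in d.items() if k in keep} for d in result]
    return result


orders = [
    {"_id": 1, "status": "open", "total": 30},
    {"_id": 2, "status": "closed", "total": 10},
    {"_id": 3, "status": "open", "total": 20},
    {"_id": 4, "status": "open", "total": 50},
]

page = run_query(
    orders,
    filter_doc={"status": "open"},
    project=["total"],
    sort=[("total", -1)],
    skip=1,
    limit=2,
)
print(page)
```

In this sketch, the three open orders are sorted by `total` in descending order, the first is skipped, and the next two are returned with only `_id` and `total`.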
Go to Destination tab to configure your copy activity destination. See the following content for the detailed configuration.
The following properties are required:
- Data store type: Select External.
- Connection: Select a MongoDB Atlas connection from the connection list. If no connection exists, then create a new MongoDB Atlas connection by selecting New.
- Database: Select your database from the drop-down list.
- Collection name: Specify the name of the collection in the MongoDB Atlas database. You can select the collection from the drop-down list or select Edit to enter it manually.
Under Advanced, you can specify the following fields:
Write behavior: Describes how to write data to MongoDB Atlas. Allowed values: Insert and Upsert.
The behavior of Upsert is to replace the document if a document with the same _id already exists; otherwise, insert the document.
Note: The service automatically generates an _id for a document if an _id isn't specified either in the original document or by column mapping. This means that you must ensure that, for Upsert to work as expected, your document has an ID.
Write batch timeout: Specify the wait time for the batch insert operation to finish before it times out. The allowed value is timespan.
Write batch size: This property controls the size of documents to write in each batch. You can try increasing the value to improve performance, and decreasing it if your documents are large.
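The difference between Insert and Upsert, and the reason an explicit _id matters, can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the service's implementation: the helper `write_batch` is hypothetical, a plain dictionary keyed by _id stands in for the collection, and `uuid` stands in for whatever identifier the service generates. The duplicate-key failure on Insert mirrors MongoDB's own behavior for a repeated _id; how the copy activity reports it is not covered here.

```python
import uuid
from typing import Any

Document = dict[str, Any]


def write_batch(
    collection: dict[Any, Document],
    batch: list[Document],
    write_behavior: str = "insert",
) -> None:
    """Hypothetical model of writing a batch with Insert or Upsert."""
    for original in batch:
        doc = dict(original)  # copy so the caller's document is not mutated
        # Stand-in for the service generating an _id when none is supplied.
        doc.setdefault("_id", uuid.uuid4().hex)
        key = doc["_id"]
        if write_behavior == "upsert":
            # Replace the whole document if the _id exists, otherwise insert.
            collection[key] = doc
        elif write_behavior == "insert":
            if key in collection:
                raise KeyError(f"duplicate _id: {key!r}")
            collection[key] = doc
        else:
            raise ValueError(f"unknown write behavior: {write_behavior!r}")


target: dict[Any, Document] = {}
write_batch(target, [{"_id": "a", "qty": 1}], "upsert")
write_batch(target, [{"_id": "a", "price": 5}], "upsert")  # replaces "a"
write_batch(target, [{"price": 7}], "upsert")  # no _id: a new one is generated
write_batch(target, [{"price": 7}], "upsert")  # same data, yet another new _id
print(target["a"])
print(len(target))
```

In this model the second write replaces document "a" entirely (the `qty` field is gone), while the two documents without an _id never match anything and are stored as separate documents, leaving three in total.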
For Mapping tab configuration, see Configure your mappings under mapping tab. Mapping is not supported when both source and destination are hierarchical data.
For Settings tab configuration, go to Configure your other settings under settings tab.
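The following tables contain more information about the copy activity in MongoDB Atlas. The first table describes the source properties and the second describes the destination properties.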
Name | Description | Value | Required | JSON script property |
---|---|---|---|---|
Data store type | Your data store type. | External | Yes | / |
Connection | Your connection to the source data store. | < your MongoDB Atlas connection > | Yes | connection |
Database | Your database that you use as source. | < your database > | Yes | database |
Collection name | Name of the collection in the MongoDB Atlas database. | < your collection > | Yes | collection |
Filter | The selection filter using query operators. To return all documents in a collection, omit this parameter or pass an empty document ({}). | < your selection filter > | No | filter |
Cursor methods | The way that the underlying query is executed. | • project • sort • limit • skip | No | cursorMethods: • project • sort • limit • skip |
Batch size | The number of documents to return in each batch of the response from the MongoDB Atlas instance. | < your batch size > (the default is 100) | No | batchSize |
Additional columns | Add additional data columns to store source files' relative path or static value. Expression is supported for the latter. | • Name • Value | No | additionalColumns: • name • value |
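
As a rough illustration of how the source JSON script properties above fit together, the following Python sketch collects them in a dictionary and prints the result as JSON. The property names come from the table; everything else is an assumption: the flat layout (the actual pipeline JSON may nest these properties differently), the placeholder values, the column name `load_tag`, the choice to write the filter and projection as strings, and the helper `missing_required`.

```python
import json
from typing import Any

# ASSUMPTION: a flat layout for illustration only; the real pipeline JSON
# may nest these properties under other objects.
source_settings: dict[str, Any] = {
    "connection": "<your MongoDB Atlas connection>",
    "database": "<your database>",
    "collection": "<your collection>",
    "filter": "{}",  # empty document: return all documents
    "cursorMethods": {
        "project": '{ "name": 1, "age": 1 }',  # hypothetical field names
        "sort": '{ "age": 1 }',
        "limit": 100,
        "skip": 0,
    },
    "batchSize": 100,  # the documented default
    "additionalColumns": [{"name": "load_tag", "value": "atlas-copy"}],
}

REQUIRED_SOURCE_PROPERTIES = ("connection", "database", "collection")


def missing_required(settings: dict[str, Any]) -> list[str]:
    """Return the required property names that are absent or empty."""
    return [k for k in REQUIRED_SOURCE_PROPERTIES if not settings.get(k)]


print(missing_required(source_settings))
print(json.dumps(source_settings, indent=2))
```

Only the three required properties need to be present; the remaining entries can be dropped from the dictionary to fall back on the defaults listed in the table.
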
Name | Description | Value | Required | JSON script property |
---|---|---|---|---|
Data store type | Your data store type. | External | Yes | / |
Connection | Your connection to the destination data store. | < your MongoDB Atlas connection > | Yes | connection |
Database | Your database that you use as destination. | < your database > | Yes | database |
Collection name | Name of the collection in the MongoDB Atlas database. | < your collection > | Yes | collection |
Write behavior | Describes how to write data to MongoDB Atlas. Allowed values: Insert and Upsert. The behavior of Upsert is to replace the document if a document with the same _id already exists; otherwise, insert the document. Note: The service automatically generates an _id for a document if an _id isn't specified either in the original document or by column mapping. This means that you must ensure that, for Upsert to work as expected, your document has an ID. | • Insert (default) • Upsert | No | writeBehavior: • insert • upsert |
Write batch timeout | The wait time for the batch insert operation to finish before it times out. | timespan (the default is 00:30:00 - 30 minutes) | No | writeBatchTimeout |
Write batch size | Controls the size of documents to write in each batch. You can try increasing this value to improve performance, and decreasing it if your documents are large. | < your write batch size > | No | writeBatchSize |
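
The destination properties can be checked before use with a small helper, sketched here in Python. The property names, the allowed `writeBehavior` values, and the `00:30:00` default come from the table. The flat dictionary layout, the `writeBatchSize` value of 1000, and the helpers `parse_timespan` and `validate_destination` are assumptions made for illustration. The parser handles only the `hh:mm:ss` form shown in the table.

```python
from datetime import timedelta
from typing import Any


def parse_timespan(value: str) -> timedelta:
    """Parse an hh:mm:ss timespan string (other timespan forms not handled)."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def validate_destination(settings: dict[str, Any]) -> timedelta:
    """Check the write settings and return the batch timeout as a timedelta."""
    behavior = settings.get("writeBehavior", "insert")  # Insert is the default
    if behavior not in ("insert", "upsert"):
        raise ValueError(f"writeBehavior must be insert or upsert, got {behavior!r}")
    return parse_timespan(settings.get("writeBatchTimeout", "00:30:00"))


# ASSUMPTION: flat layout and illustrative values, not a real pipeline payload.
destination_settings: dict[str, Any] = {
    "connection": "<your MongoDB Atlas connection>",
    "database": "<your database>",
    "collection": "<your collection>",
    "writeBehavior": "upsert",
    "writeBatchTimeout": "00:30:00",
    "writeBatchSize": 1000,  # hypothetical value; tune to your document size
}

timeout = validate_destination(destination_settings)
print(timeout.total_seconds())
```

Here the default timeout of `00:30:00` parses to 1800 seconds, and an unsupported `writeBehavior` value is rejected before any write is attempted.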