Quickstart: Azure Cosmos DB for MongoDB for Python with MongoDB driver

APPLIES TO: MongoDB

Get started with MongoDB to create databases, collections, and docs within your Azure Cosmos DB resource. Follow these steps to deploy a minimal solution to your environment using the Azure Developer CLI.

API for MongoDB reference documentation | pymongo package | Azure Developer CLI

Prerequisites

Setting up

Deploy this project's development container to your environment. Then, use the Azure Developer CLI (azd) to create an Azure Cosmos DB for MongoDB account and deploy a containerized sample application. The sample application uses the client library to manage, create, read, and query sample data.

Open in GitHub Codespaces

Open in Dev Container

Important

GitHub accounts include an entitlement of storage and core hours at no cost. For more information, see included storage and core hours for GitHub accounts.

  1. Open a terminal in the root directory of the project.

  2. Authenticate to the Azure Developer CLI using azd auth login. Follow the steps specified by the tool to authenticate to the CLI using your preferred Azure credentials.

    azd auth login
    
  3. Use azd init to initialize the project.

    azd init --template cosmos-db-mongodb-python-quickstart
    

    Note

    This quickstart uses the azure-samples/cosmos-db-mongodb-python-quickstart template GitHub repository. The Azure Developer CLI automatically clones this project to your machine if it is not already there.

  4. During initialization, configure a unique environment name.

    Tip

    The environment name will also be used as the target resource group name. For this quickstart, consider using msdocs-cosmos-db.

  5. Deploy the Azure Cosmos DB account using azd up. The Bicep templates also deploy a sample web application.

    azd up
    
  6. During the provisioning process, select your subscription and desired location. Wait for the provisioning process to complete. The process can take approximately five minutes.

  7. Once the provisioning of your Azure resources is done, a URL to the running web application is included in the output.

    Deploying services (azd deploy)
    
      (✓) Done: Deploying service web
    - Endpoint: <https://[container-app-sub-domain].azurecontainerapps.io>
    
    SUCCESS: Your application was provisioned and deployed to Azure in 5 minutes 0 seconds.
    
  8. Use the URL in the console to navigate to your web application in the browser. Observe the output of the running app.

    Screenshot of the running web application.


Install the client library

  1. Create a requirements.txt file in your app directory that lists the PyMongo and python-dotenv packages.

    # requirements.txt
    pymongo
    python-dotenv
    
  2. Create a virtual environment and install the packages.

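    # py -3 uses the global Python interpreter on Windows. On macOS or Linux, use python3 -m venv .venv.
    py -3 -m venv .venv
    # Activation path for Git Bash on Windows. On macOS or Linux, use source .venv/bin/activate.
    source .venv/Scripts/activate
    pip install -r requirements.txt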
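
To confirm that the active virtual environment can import both packages, a quick check along these lines may help. This is a minimal sketch that uses only the standard library; the helper name missing_modules is hypothetical and isn't part of the sample. Note that the python-dotenv package is imported under the module name dotenv.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the top-level module names that can't be found in this environment."""
    return [name for name in module_names if find_spec(name) is None]


# python-dotenv installs a module named "dotenv", not "python_dotenv".
missing = missing_modules(["pymongo", "dotenv"])
if missing:
    print("Missing modules: {}. Run pip install -r requirements.txt.".format(", ".join(missing)))
else:
    print("All required modules are importable.")
```

If the check reports a missing module, make sure the virtual environment is activated in the same terminal where you ran pip.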
    

Object model

Let's look at the hierarchy of resources in the API for MongoDB and the object model that's used to create and access these resources. Azure Cosmos DB creates resources in a hierarchy that consists of accounts, databases, collections, and documents.

Diagram of the Azure Cosmos DB hierarchy including accounts, databases, collections, and docs.

Hierarchical diagram showing an Azure Cosmos DB account at the top. The account has two child database shards. One of the database shards includes two child collection shards. The other database shard includes a single child collection shard. That single collection shard has three child doc shards.

Each type of resource is represented by a Python class. Here are the most common classes:

  • MongoClient - The first step when working with PyMongo is to create a MongoClient to connect to Azure Cosmos DB's API for MongoDB. The client object is used to configure and execute requests against the service.

  • Database - Azure Cosmos DB's API for MongoDB can support one or more independent databases.

  • Collection - A database can contain one or more collections. A collection is a group of documents stored in MongoDB, and can be thought of as roughly the equivalent of a table in a relational database.

  • Document - A document is a set of key-value pairs. Documents have dynamic schema. Dynamic schema means that documents in the same collection don't need to have the same set of fields or structure. And common fields in a collection's documents may hold different types of data.
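
As a rough illustration of dynamic schema, the following sketch compares two plain Python dictionaries of the kind PyMongo accepts as documents. Both documents and their field values are hypothetical and aren't part of the sample data; no database connection is involved.

```python
# Two hypothetical documents that could be stored in the same collection.
surfboard = {
    "name": "Yamba Surfboard",
    "category": "gear-surf-surfboards",
    "quantity": 12,
    "sale": False,
}
wetsuit = {
    "name": "Kiama Wetsuit",
    "category": "gear-surf-wetsuits",
    "quantity": "backordered",  # same field name, different type
    "sizes": ["S", "M", "L"],   # field that the other document lacks
}

# Fields that both documents share, and fields only one of them has.
shared_fields = sorted(set(surfboard) & set(wetsuit))
only_in_wetsuit = sorted(set(wetsuit) - set(surfboard))

print(shared_fields)    # ['category', 'name', 'quantity']
print(only_in_wetsuit)  # ['sizes']
print(type(surfboard["quantity"]).__name__, type(wetsuit["quantity"]).__name__)  # int str
```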

To learn more about the hierarchy of entities, see the Azure Cosmos DB resource model article.

Code examples

The sample code described in this article creates a database named adventureworks with a collection named products. The products collection is designed to contain product details such as name, category, quantity, and a sale indicator. Each product also contains a unique identifier. The complete sample code is at https://github.com/Azure-Samples/azure-cosmos-db-mongodb-python-getting-started/tree/main/001-quickstart/.

The steps below use an unsharded database and show a synchronous application built with the PyMongo driver. For asynchronous applications, use the Motor driver.

Authenticate the client

  1. In the project directory, create a run.py file. In your editor, add import statements to reference the packages you'll use, including the PyMongo and python-dotenv packages.

    import os
    import sys
    from random import randint
    
    import pymongo
    from dotenv import load_dotenv
    
  2. Get the connection information from the environment variable defined in an .env file.

    load_dotenv()
    CONNECTION_STRING = os.environ.get("COSMOS_CONNECTION_STRING")
    
  3. Define constants you'll use in the code.

    DB_NAME = "adventureworks"
    COLLECTION_NAME = "products"
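
Because os.environ.get returns None when the variable is missing, a typo in the .env file only surfaces later as a connection error. One way to fail early is sketched below. The helper name get_connection_string is hypothetical and isn't part of the sample; the variable name matches the one used in step 2.

```python
import os


def get_connection_string(var_name="COSMOS_CONNECTION_STRING"):
    """Return the connection string from the environment, or raise a clear error."""
    value = os.environ.get(var_name)
    if not value:
        raise RuntimeError(
            "Environment variable {} is not set. "
            "Add it to your .env file or export it in your shell.".format(var_name)
        )
    return value
```

In run.py, you could call this helper after load_dotenv() instead of reading os.environ directly.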
    

Connect to Azure Cosmos DB's API for MongoDB

Use the MongoClient object to connect to your Azure Cosmos DB for MongoDB resource. The MongoClient constructor returns a client object, which you then use to get references to databases.

client = pymongo.MongoClient(CONNECTION_STRING)
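
PyMongo connects lazily, so constructing a MongoClient doesn't by itself prove that the connection string works; errors appear on the first operation. The following sketch sends a ping command to surface connection problems early. The helper name can_reach_server is hypothetical, and the sketch assumes the service accepts the ping command.

```python
def can_reach_server(client):
    """Return True if a ping round trip succeeds, False otherwise."""
    try:
        # client.admin.command("ping") forces a round trip to the server.
        client.admin.command("ping")
    except Exception as exc:  # broad on purpose in this sketch
        print("Could not reach the server: {}".format(exc))
        return False
    return True


# Usage with the client created above:
# if not can_reach_server(client):
#     sys.exit("Check COSMOS_CONNECTION_STRING and your network settings.")
```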

Get database

Check if the database exists with list_database_names method. If the database doesn't exist, use the create database extension command to create it with a specified provisioned throughput.

# Create database if it doesn't exist
db = client[DB_NAME]
if DB_NAME not in client.list_database_names():
    # Create a database with 400 RU throughput that can be shared across
    # the DB's collections
    db.command({"customAction": "CreateDatabase", "offerThroughput": 400})
    print("Created db '{}' with shared throughput.\n".format(DB_NAME))
else:
    print("Using database: '{}'.\n".format(DB_NAME))

Get collection

Check if the collection exists with the list_collection_names method. If the collection doesn't exist, use the create collection extension command to create it.

# Create collection if it doesn't exist
collection = db[COLLECTION_NAME]
if COLLECTION_NAME not in db.list_collection_names():
    # Creates an unsharded collection that uses the DB's shared throughput
    db.command(
        {"customAction": "CreateCollection", "collection": COLLECTION_NAME}
    )
    print("Created collection '{}'.\n".format(COLLECTION_NAME))
else:
    print("Using collection: '{}'.\n".format(COLLECTION_NAME))

Create an index

Create an index using the update collection extension command. You can also set the index in the create collection extension command. In this example, create the index on the name property so that you can later sort by product name with the cursor class sort method.

indexes = [
    {"key": {"_id": 1}, "name": "_id_1"},
    # Ascending (1) index on the name field
    {"key": {"name": 1}, "name": "name_1"},
]
]
db.command(
    {
        "customAction": "UpdateCollection",
        "collection": COLLECTION_NAME,
        "indexes": indexes,
    }
)
print("Indexes are: {}\n".format(sorted(collection.index_information())))
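
As an alternative to the extension command, the standard PyMongo create_index method can create the same single-field index. This is a minimal sketch, assuming your account accepts index creation through the driver; the helper name create_name_index is hypothetical.

```python
def create_name_index(collection):
    """Create an ascending index on the name field and return the index name."""
    # 1 is the value of pymongo.ASCENDING.
    return collection.create_index([("name", 1)])


# Usage with the collection from the previous step:
# index_name = create_name_index(collection)
# print("Created index: {}".format(index_name))
```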

Create a document

Create a document with the product properties for the adventureworks database:

  • A category property. This property can be used as the logical partition key.
  • A name property.
  • An inventory quantity property.
  • A sale property, indicating whether the product is on sale.

# Create a new document and upsert (create or replace) it in the collection
product = {
    "category": "gear-surf-surfboards",
    "name": "Yamba Surfboard-{}".format(randint(50, 5000)),
    "quantity": 1,
    "sale": False,
}
result = collection.update_one(
    {"name": product["name"]}, {"$set": product}, upsert=True
)
print("Upserted document with _id {}\n".format(result.upserted_id))

Create a document in the collection by calling the collection-level operation update_one. In this example, you upsert instead of creating a new document. Upsert isn't strictly necessary here because the product name is random. However, it's a good practice to upsert in case you run the code more than once and the product name is the same.

The result of the update_one operation contains the _id field value that you can use in subsequent operations. The _id property was created automatically.
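
Note that upserted_id is set only when the operation inserts a new document. If the filter matches an existing document, upserted_id is None and matched_count is nonzero. The sketch below distinguishes the two cases; the helper name describe_upsert is hypothetical.

```python
def describe_upsert(result):
    """Summarize the UpdateResult returned by update_one(..., upsert=True)."""
    if result.upserted_id is not None:
        # No document matched the filter, so a new one was inserted.
        return "inserted new document with _id {}".format(result.upserted_id)
    # The filter matched, so an existing document was updated in place.
    return "updated {} existing document(s)".format(result.matched_count)


# Usage with the result from the update_one call above:
# print(describe_upsert(result))
```

When the document already exists, look it up by the same filter you passed to update_one rather than by result.upserted_id.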

Get a document

Use the find_one method to get a document.

doc = collection.find_one({"_id": result.upserted_id})
print("Found a document with _id {}: {}\n".format(result.upserted_id, doc))

In Azure Cosmos DB, you can perform a less-expensive point read operation by using both the unique identifier (_id) and a partition key.
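
A lookup of that shape could be sketched as follows. It assumes a collection that is sharded on the category field, which the unsharded collection in this quickstart is not; the helper name find_by_id_and_category is hypothetical.

```python
def find_by_id_and_category(collection, doc_id, category):
    """Look up one document by _id plus the field assumed to be the shard key."""
    return collection.find_one({"_id": doc_id, "category": category})


# Usage, assuming category is the shard key of the collection:
# doc = find_by_id_and_category(collection, result.upserted_id, "gear-surf-surfboards")
```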

Query documents

After you insert a doc, you can run a query to get all docs that match a specific filter. This example finds all docs that match a specific category: gear-surf-surfboards. Once the query is defined, call Collection.find to get a Cursor result, and then call sort on the cursor to order the results by name.

# Query for documents in the collection
print("Products with category 'gear-surf-surfboards':\n")
all_products_query = {"category": "gear-surf-surfboards"}
for doc in collection.find(all_products_query).sort(
    "name", pymongo.ASCENDING
):
    print("Found a product with _id {}: {}\n".format(doc["_id"], doc))
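
If you only need some fields, Collection.find also accepts a projection as its second argument. The following sketch returns just the product names for a category, sorted by name. The helper name product_names is hypothetical and isn't part of the sample.

```python
def product_names(collection, category):
    """Return the product names in a category, sorted ascending by name."""
    cursor = collection.find(
        {"category": category},
        {"name": 1, "_id": 0},  # projection: include name, exclude _id
    ).sort("name", 1)  # 1 is the value of pymongo.ASCENDING
    return [doc["name"] for doc in cursor]


# Usage with the collection from the earlier steps:
# print(product_names(collection, "gear-surf-surfboards"))
```

As with the full query, sorting on name depends on the index on that field.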

Troubleshooting:

  • If you get an error such as The index path corresponding to the specified order-by item is excluded., make sure you created the index.

Run the code

This app creates an API for MongoDB database and collection, creates a document, and then reads the same document back. Finally, it runs a query that returns the documents that match a specified product category. At each step, the app prints information about what it did to the console.

To run the app, use a terminal to navigate to the application directory and run the application.

python run.py

The output of the app should be similar to this example:


Created db 'adventureworks' with shared throughput.

Created collection 'products'.

Indexes are: ['_id_', 'name_1']

Upserted document with _id <ID>

Found a document with _id <ID>:
{'_id': <ID>,
'category': 'gear-surf-surfboards',
'name': 'Yamba Surfboard-50',
'quantity': 1,
'sale': False}

Products with category 'gear-surf-surfboards':

Found a product with _id <ID>:
{'_id': ObjectId('<ID>'),
'name': 'Yamba Surfboard-386',
'category': 'gear-surf-surfboards',
'quantity': 1,
'sale': False}

Clean up resources

When you no longer need the Azure Cosmos DB for MongoDB account, you can delete the corresponding resource group.

Use the az group delete command to delete the resource group.

az group delete --name $resourceGroupName