Quickstart: Azure Cosmos DB for Apache Gremlin library for Python
APPLIES TO: Gremlin
Azure Cosmos DB for Apache Gremlin is a fully managed graph database service implementing the popular Apache TinkerPop, a graph computing framework using the Gremlin query language. The API for Gremlin gives you a low-friction way to get started using Gremlin with a service that can grow and scale out as much as you need with minimal management.
In this quickstart, you use the gremlinpython library to connect to a newly created Azure Cosmos DB for Gremlin account.
Library source code | Package (PyPI)
Prerequisites
- An Azure account with an active subscription.
  - No Azure subscription? Sign up for a free Azure account.
  - Don't want an Azure subscription? You can try Azure Cosmos DB free with no subscription required.
- Python (latest)
  - Don't have Python installed? Try this quickstart in GitHub Codespaces.
- Azure Command-Line Interface (CLI)
Azure Cloud Shell
Azure hosts Azure Cloud Shell, an interactive shell environment that you can use through your browser. You can use either Bash or PowerShell with Cloud Shell to work with Azure services. You can use the Cloud Shell preinstalled commands to run the code in this article, without having to install anything on your local environment.
To start Azure Cloud Shell, use one of these options:
- Select Try It in the upper-right corner of a code or command block. Selecting Try It doesn't automatically copy the code or command to Cloud Shell.
- Go to https://shell.azure.com, or select the Launch Cloud Shell button to open Cloud Shell in your browser.
- Select the Cloud Shell button on the menu bar at the upper right in the Azure portal.
To use Azure Cloud Shell:
Start Cloud Shell.
Select the Copy button on a code block (or command block) to copy the code or command.
Paste the code or command into the Cloud Shell session by selecting Ctrl+Shift+V on Windows and Linux, or by selecting Cmd+Shift+V on macOS.
Select Enter to run the code or command.
Setting up
This section walks you through creating an API for Gremlin account and setting up a Python project to use the library to connect to the account.
Create an API for Gremlin account
Create the API for Gremlin account before you use the Python library. It also helps to have the database and graph in place.
Create shell variables for accountName, resourceGroupName, and location.
# Variable for resource group name
resourceGroupName="msdocs-cosmos-gremlin-quickstart"
location="westus"
# Variable for account name with a randomly generated suffix
let suffix=$RANDOM*$RANDOM
accountName="msdocs-gremlin-$suffix"
If you haven't already, sign in to the Azure CLI using az login.
Use az group create to create a new resource group in your subscription.
az group create \
    --name $resourceGroupName \
    --location $location
Use az cosmosdb create to create a new API for Gremlin account with default settings.
az cosmosdb create \
    --resource-group $resourceGroupName \
    --name $accountName \
    --capabilities "EnableGremlin" \
    --locations regionName=$location \
    --enable-free-tier true
Note
You can have at most one free tier Azure Cosmos DB account per Azure subscription, and you must opt in when creating the account. If this command fails to apply the free tier discount, another account in the subscription already has free tier enabled.
Get the API for Gremlin endpoint NAME for the account using az cosmosdb show.
az cosmosdb show \
    --resource-group $resourceGroupName \
    --name $accountName \
    --query "name"
Find the KEY from the list of keys for the account with az cosmosdb keys list.
az cosmosdb keys list \
    --resource-group $resourceGroupName \
    --name $accountName \
    --type "keys" \
    --query "primaryMasterKey"
Record the NAME and KEY values. You use these credentials later.
Create a database named cosmicworks using az cosmosdb gremlin database create.
az cosmosdb gremlin database create \
    --resource-group $resourceGroupName \
    --account-name $accountName \
    --name "cosmicworks"
Create a graph using az cosmosdb gremlin graph create. Name the graph products, then set the throughput to 400, and finally set the partition key path to /category.
az cosmosdb gremlin graph create \
    --resource-group $resourceGroupName \
    --account-name $accountName \
    --database-name "cosmicworks" \
    --name "products" \
    --partition-key-path "/category" \
    --throughput 400
Create a new Python console application
Create a Python console application in an empty folder using your preferred terminal.
Open your terminal in an empty folder.
Create the app.py file.
touch app.py
Install the PyPI package
Add the gremlinpython PyPI package to the Python project.
Create the requirements.txt file.
touch requirements.txt
Add the gremlinpython package from the Python Package Index to the requirements file.
gremlinpython==3.7.0
Install all the requirements to your project.
pip install -r requirements.txt
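As a quick sanity check after installation, the following minimal sketch reports which version of a package is visible to the current interpreter. The helper name `installed_version` is hypothetical and not part of the quickstart. It relies only on the standard library's `importlib.metadata` (Python 3.8 or later).

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the pinned version if the install step succeeded, otherwise None.
    print(installed_version("gremlinpython"))
```

If this prints None, the package was probably installed into a different interpreter or virtual environment than the one running the script.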
Configure environment variables
To use the NAME and KEY values obtained earlier in this quickstart, persist them to new environment variables on the local machine running the application.
To set the environment variables, use your terminal to persist the values as COSMOS_GREMLIN_ENDPOINT and COSMOS_GREMLIN_KEY respectively.
export COSMOS_GREMLIN_ENDPOINT="<account-name>"
export COSMOS_GREMLIN_KEY="<account-key>"
Validate that the environment variables were set correctly.
printenv COSMOS_GREMLIN_ENDPOINT
printenv COSMOS_GREMLIN_KEY
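The same validation can be done from Python, so that a missing value fails with a clear message rather than a bare KeyError. This is a minimal sketch. The function name `load_gremlin_settings` and its optional `env` parameter are hypothetical and exist only for illustration.

```python
import os


def load_gremlin_settings(env=None):
    """Return (account_name, account_key) read from the environment.

    Raises RuntimeError naming every variable that is missing or empty.
    """
    env = os.environ if env is None else env
    names = ("COSMOS_GREMLIN_ENDPOINT", "COSMOS_GREMLIN_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["COSMOS_GREMLIN_ENDPOINT"], env["COSMOS_GREMLIN_KEY"]
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.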
Code examples
The code in this article connects to a database named cosmicworks and a graph named products. The code then adds vertices and edges to the graph before traversing the added items.
Authenticate the client
Application requests to most Azure services must be authorized. For the API for Gremlin, use the NAME and KEY values obtained earlier in this quickstart.
Open the app.py file.
Import client and serializer from the gremlin_python.driver module.
import os
from gremlin_python.driver import client, serializer
Warning
Depending on your version of Python, you may also need to import asyncio and override the event loop policy:
import asyncio
import sys

if sys.platform == "win32":
    asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
Create ACCOUNT_NAME and ACCOUNT_KEY variables. Store the COSMOS_GREMLIN_ENDPOINT and COSMOS_GREMLIN_KEY environment variables as the values for each respective variable.
ACCOUNT_NAME = os.environ["COSMOS_GREMLIN_ENDPOINT"]
ACCOUNT_KEY = os.environ["COSMOS_GREMLIN_KEY"]
Use Client to connect using the account's credentials and the GraphSON 2.0 serializer.
# Note: this rebinds the name `client` from the imported module
# to the connected Client instance used in the rest of the file.
client = client.Client(
    url=f"wss://{ACCOUNT_NAME}.gremlin.cosmos.azure.com:443/",
    traversal_source="g",
    username="/dbs/cosmicworks/colls/products",
    password=f"{ACCOUNT_KEY}",
    message_serializer=serializer.GraphSONSerializersV2d0(),
)
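The endpoint URL and username above follow a fixed pattern built from the account, database, and graph names. The sketch below factors that pattern into a small helper so the values are not repeated as string literals. The function name `gremlin_connection_settings` is hypothetical. The returned keys mirror the keyword arguments used in the step above.

```python
def gremlin_connection_settings(account_name, database, graph):
    """Build the url, traversal_source, and username arguments for Client."""
    return {
        # WebSocket endpoint derived from the account name
        "url": f"wss://{account_name}.gremlin.cosmos.azure.com:443/",
        "traversal_source": "g",
        # Resource path identifying the database and graph
        "username": f"/dbs/{database}/colls/{graph}",
    }


if __name__ == "__main__":
    settings = gremlin_connection_settings("<account-name>", "cosmicworks", "products")
    print(settings["url"])
    print(settings["username"])
```

Under these assumptions, the dictionary could be unpacked into the constructor together with the password and serializer arguments, for example `client.Client(**settings, password=ACCOUNT_KEY, message_serializer=serializer.GraphSONSerializersV2d0())`.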
Create vertices
Now that the application is connected to the account, use the standard Gremlin syntax to create vertices.
Use submit to run a command server-side on the API for Gremlin account. Create a product vertex with the following properties:
Property | Value |
---|---|
label | product |
id | 68719518371 |
name | Kiama classic surfboard |
price | 285.55 |
category | surfboards |
client.submit(
    message=(
        "g.addV('product')"
        ".property('id', prop_id)"
        ".property('name', prop_name)"
        ".property('price', prop_price)"
        ".property('category', prop_partition_key)"
    ),
    bindings={
        "prop_id": "68719518371",
        "prop_name": "Kiama classic surfboard",
        "prop_price": 285.55,
        "prop_partition_key": "surfboards",
    },
)
Create a second product vertex with these properties:
Property | Value |
---|---|
label | product |
id | 68719518403 |
name | Montau Turtle Surfboard |
price | 600.00 |
category | surfboards |
client.submit(
    message=(
        "g.addV('product')"
        ".property('id', prop_id)"
        ".property('name', prop_name)"
        ".property('price', prop_price)"
        ".property('category', prop_partition_key)"
    ),
    bindings={
        "prop_id": "68719518403",
        "prop_name": "Montau Turtle Surfboard",
        "prop_price": 600.00,
        "prop_partition_key": "surfboards",
    },
)
Create a third product vertex with these properties:
Property | Value |
---|---|
label | product |
id | 68719518409 |
name | Bondi Twin Surfboard |
price | 585.50 |
category | surfboards |
client.submit(
    message=(
        "g.addV('product')"
        ".property('id', prop_id)"
        ".property('name', prop_name)"
        ".property('price', prop_price)"
        ".property('category', prop_partition_key)"
    ),
    bindings={
        "prop_id": "68719518409",
        "prop_name": "Bondi Twin Surfboard",
        "prop_price": 585.50,
        "prop_partition_key": "surfboards",
    },
)
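The three vertex steps above send the same traversal with different bindings. One possible way to avoid the repetition is to keep the query text in a constant and derive the bindings from a list of product records. This is a sketch only. The names `ADD_PRODUCT_QUERY`, `PRODUCTS`, and `product_bindings` are hypothetical, and the actual `submit` call is left as a comment because it needs a live account.

```python
# Same traversal text as in the steps above, shared by every product.
ADD_PRODUCT_QUERY = (
    "g.addV('product')"
    ".property('id', prop_id)"
    ".property('name', prop_name)"
    ".property('price', prop_price)"
    ".property('category', prop_partition_key)"
)

# The three products from this quickstart.
PRODUCTS = [
    {"id": "68719518371", "name": "Kiama classic surfboard", "price": 285.55, "category": "surfboards"},
    {"id": "68719518403", "name": "Montau Turtle Surfboard", "price": 600.00, "category": "surfboards"},
    {"id": "68719518409", "name": "Bondi Twin Surfboard", "price": 585.50, "category": "surfboards"},
]


def product_bindings(product):
    """Map a product record onto the binding names used by ADD_PRODUCT_QUERY."""
    return {
        "prop_id": product["id"],
        "prop_name": product["name"],
        "prop_price": product["price"],
        "prop_partition_key": product["category"],
    }


# With a connected client, the loop would look like this:
# for product in PRODUCTS:
#     client.submit(message=ADD_PRODUCT_QUERY, bindings=product_bindings(product))
```

Keeping values in bindings rather than concatenating them into the query string matches the approach used throughout this quickstart.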
Create edges
Create edges using the Gremlin syntax to define relationships between vertices.
Create an edge named replaces from the Montau Turtle Surfboard product to the Kiama classic surfboard product.
client.submit(
    message=(
        "g.V([prop_partition_key, prop_source_id])"
        ".addE('replaces')"
        ".to(g.V([prop_partition_key, prop_target_id]))"
    ),
    bindings={
        "prop_partition_key": "surfboards",
        "prop_source_id": "68719518403",
        "prop_target_id": "68719518371",
    },
)
Tip
This edge definition uses the g.V(['<partition-key>', '<id>']) syntax. Alternatively, you can use g.V('<id>').has('category', '<partition-key>').
Create another replaces edge from the same product to the Bondi Twin Surfboard.
client.submit(
    message=(
        "g.V([prop_partition_key, prop_source_id])"
        ".addE('replaces')"
        ".to(g.V([prop_partition_key, prop_target_id]))"
    ),
    bindings={
        "prop_partition_key": "surfboards",
        "prop_source_id": "68719518403",
        "prop_target_id": "68719518409",
    },
)
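Both edge steps differ only in the target vertex id. The sketch below shows one way to express that, keeping the traversal text from the steps above in a constant and generating the bindings for each target. The names `ADD_REPLACES_EDGE_QUERY` and `replaces_edge_bindings` are hypothetical, and the `submit` call stays commented out because it requires a connected client.

```python
# Same traversal text as in the edge steps above.
ADD_REPLACES_EDGE_QUERY = (
    "g.V([prop_partition_key, prop_source_id])"
    ".addE('replaces')"
    ".to(g.V([prop_partition_key, prop_target_id]))"
)


def replaces_edge_bindings(partition_key, source_id, target_id):
    """Build the bindings for one 'replaces' edge within a single partition."""
    return {
        "prop_partition_key": partition_key,
        "prop_source_id": source_id,
        "prop_target_id": target_id,
    }


# Montau Turtle Surfboard replaces both of the other products.
EDGE_BINDINGS = [
    replaces_edge_bindings("surfboards", "68719518403", target_id)
    for target_id in ("68719518371", "68719518409")
]

# With a connected client:
# for bindings in EDGE_BINDINGS:
#     client.submit(message=ADD_REPLACES_EDGE_QUERY, bindings=bindings)
```

This helper assumes the source and target vertices share a partition key, which holds for the three surfboards in this quickstart but not for graphs in general.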
Query vertices & edges
Use the Gremlin syntax to traverse the graph and discover relationships between vertices.
Traverse the graph and find all vertices that Montau Turtle Surfboard replaces.
result = client.submit(
    message=(
        "g.V().hasLabel('product')"
        ".has('category', prop_partition_key)"
        ".has('name', prop_name)"
        ".outE('replaces').inV()"
    ),
    bindings={
        "prop_partition_key": "surfboards",
        "prop_name": "Montau Turtle Surfboard",
    },
)
Write to the console the result of this traversal.
print(result)
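Printing the value returned by submit is likely to show the result set object itself rather than the matching vertices. In gremlinpython, the result set exposes an all() method that returns a future, and calling result() on that future yields the items as a list. The sketch below wraps that pattern in a hypothetical helper, `fetch_all`. The `StubResultSet` class is also hypothetical and exists only so the example runs without a live account.

```python
from concurrent.futures import Future


def fetch_all(result_set, timeout=30):
    """Block until the traversal finishes and return its results as a list."""
    return result_set.all().result(timeout=timeout)


class StubResultSet:
    """Hypothetical stand-in for the driver's result set, for offline runs."""

    def __init__(self, items):
        self._items = list(items)

    def all(self):
        # Mimic the driver by returning an already-completed future.
        future = Future()
        future.set_result(self._items)
        return future


if __name__ == "__main__":
    stub = StubResultSet(["Kiama classic surfboard", "Bondi Twin Surfboard"])
    for item in fetch_all(stub):
        print(item)
```

In the application, `fetch_all(result)` would take the place of the stub, and calling `client.close()` afterward releases the connection.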
Run the code
Validate that your application works as expected by running the application. The application should execute with no errors or warnings. The output of the application includes data about the created and queried items.
Open the terminal in the Python project folder.
Use python <filename> to run the application. Observe the output from the application.
python app.py
Clean up resources
When you no longer need the API for Gremlin account, delete the corresponding resource group.
Create a shell variable for resourceGroupName if it doesn't already exist.
# Variable for resource group name
resourceGroupName="msdocs-cosmos-gremlin-quickstart"
Use az group delete to delete the resource group.
az group delete \
    --name $resourceGroupName