Quickstart: Azure Cosmos DB for Apache Gremlin library for Python

APPLIES TO: Gremlin

Azure Cosmos DB for Apache Gremlin is a fully managed graph database service that implements Apache TinkerPop, a graph computing framework that uses the Gremlin query language. The API for Gremlin gives you a low-friction way to get started with Gremlin on a service that can scale out as much as you need with minimal management.

In this quickstart, you use the gremlinpython library to connect to a newly created Azure Cosmos DB for Gremlin account.

Library source code | Package (PyPI)

Prerequisites

Azure Cloud Shell

Azure hosts Azure Cloud Shell, an interactive shell environment that you can use through your browser. You can use either Bash or PowerShell with Cloud Shell to work with Azure services. You can use the Cloud Shell preinstalled commands to run the code in this article, without having to install anything on your local environment.

To start Azure Cloud Shell:

Option | Example/Link
Select Try It in the upper-right corner of a code or command block. Selecting Try It doesn't automatically copy the code or command to Cloud Shell. | [Screenshot: example of Try It for Azure Cloud Shell]
Go to https://shell.azure.com, or select the Launch Cloud Shell button to open Cloud Shell in your browser. | [Button: launch Azure Cloud Shell]
Select the Cloud Shell button on the menu bar at the upper right in the Azure portal. | [Screenshot: Cloud Shell button in the Azure portal]

To use Azure Cloud Shell:

  1. Start Cloud Shell.

  2. Select the Copy button on a code block (or command block) to copy the code or command.

  3. Paste the code or command into the Cloud Shell session by selecting Ctrl+Shift+V on Windows and Linux, or by selecting Cmd+Shift+V on macOS.

  4. Select Enter to run the code or command.

Setting up

This section walks you through creating an API for Gremlin account and setting up a Python project to use the library to connect to the account.

Create an API for Gremlin account

Create the API for Gremlin account before you use the Python library. It also helps to have the database and graph in place.

  1. Create shell variables for accountName, resourceGroupName, and location.

    # Variable for resource group name
    resourceGroupName="msdocs-cosmos-gremlin-quickstart"
    location="westus"
    
    # Variable for account name with a randomly generated suffix
    
    let suffix=$RANDOM*$RANDOM
    accountName="msdocs-gremlin-$suffix"
    
  2. If you haven't already, sign in to the Azure CLI using az login.

  3. Use az group create to create a new resource group in your subscription.

    az group create \
        --name $resourceGroupName \
        --location $location
    
  4. Use az cosmosdb create to create a new API for Gremlin account with default settings.

    az cosmosdb create \
        --resource-group $resourceGroupName \
        --name $accountName \
        --capabilities "EnableGremlin" \
        --locations regionName=$location \
        --enable-free-tier true
    

    Note

    You can have up to one free tier Azure Cosmos DB account per Azure subscription, and you must opt in when creating the account. If this command fails to apply the free tier discount, another account in the subscription already has free tier enabled.

  5. Get the API for Gremlin endpoint NAME for the account using az cosmosdb show.

    az cosmosdb show \
        --resource-group $resourceGroupName \
        --name $accountName \
        --query "name"
    
  6. Find the KEY from the list of keys for the account with az cosmosdb keys list.

    az cosmosdb keys list \
        --resource-group $resourceGroupName \
        --name $accountName \
        --type "keys" \
        --query "primaryMasterKey"
    
  7. Record the NAME and KEY values. You use these credentials later.

  8. Create a database named cosmicworks using az cosmosdb gremlin database create.

    az cosmosdb gremlin database create \
        --resource-group $resourceGroupName \
        --account-name $accountName \
        --name "cosmicworks"
    
  9. Create a graph using az cosmosdb gremlin graph create. Name the graph products, then set the throughput to 400, and finally set the partition key path to /category.

    az cosmosdb gremlin graph create \
        --resource-group $resourceGroupName \
        --account-name $accountName \
        --database-name "cosmicworks" \
        --name "products" \
        --partition-key-path "/category" \
        --throughput 400
    

Create a new Python console application

Create a Python console application in an empty folder using your preferred terminal.

  1. Open your terminal in an empty folder.

  2. Create the app.py file.

    touch app.py
    

Install the PyPI package

Add the gremlinpython PyPI package to the Python project.

  1. Create the requirements.txt file.

    touch requirements.txt
    
  2. Add the gremlinpython package from the Python Package Index to the requirements file.

    gremlinpython==3.7.0
    
  3. Install all the requirements to your project.

    pip install -r requirements.txt
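
After the install step, a quick check can confirm that the package is visible to the interpreter that will run app.py. The following is a minimal sketch using only the standard library; the helper name is_importable is hypothetical and not part of gremlinpython.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # The gremlinpython package installs a top-level module named gremlin_python.
    if is_importable("gremlin_python"):
        print("gremlin_python is available")
    else:
        print("gremlin_python not found; rerun the install step")
```

If the check fails, the package was probably installed into a different Python environment than the one running the script.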
    

Configure environment variables

To use the NAME and KEY values obtained earlier in this quickstart, persist them to new environment variables on the local machine running the application.

  1. To set the environment variables, use your terminal to persist the values as COSMOS_GREMLIN_ENDPOINT and COSMOS_GREMLIN_KEY respectively.

    export COSMOS_GREMLIN_ENDPOINT="<account-name>"
    export COSMOS_GREMLIN_KEY="<account-key>"
    
  2. Validate that the environment variables were set correctly.

    printenv COSMOS_GREMLIN_ENDPOINT
    printenv COSMOS_GREMLIN_KEY
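
The same validation can be done from Python before the application tries to connect. This sketch assumes the two variable names used in this quickstart; the function load_settings is a hypothetical helper, shown only to illustrate failing early with a readable message.

```python
import os


def load_settings(environ=os.environ):
    """Read the two variables this quickstart expects, or raise a clear error."""
    required = ("COSMOS_GREMLIN_ENDPOINT", "COSMOS_GREMLIN_KEY")
    # Treat unset and empty values the same way: both are unusable.
    missing = [name for name in required if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in required}
```

Calling load_settings() at the top of app.py would surface a missing variable immediately, rather than as a KeyError or a failed connection later on.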
    

Code examples

The code in this article connects to a database named cosmicworks and a graph named products. The code then adds vertices and edges to the graph before traversing the added items.

Authenticate the client

Application requests to most Azure services must be authorized. For the API for Gremlin, use the NAME and KEY values obtained earlier in this quickstart.

  1. Open the app.py file.

  2. Import client and serializer from the gremlin_python.driver module.

    import os
    from gremlin_python.driver import client, serializer
    

    Warning

    On Windows, depending on your version of Python, you may also need to import asyncio and override the event loop policy:

    import asyncio
    import sys
    
    if sys.platform == "win32":
        asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
    
  3. Create ACCOUNT_NAME and ACCOUNT_KEY variables. Store the COSMOS_GREMLIN_ENDPOINT and COSMOS_GREMLIN_KEY environment variables as the values for each respective variable.

    ACCOUNT_NAME = os.environ["COSMOS_GREMLIN_ENDPOINT"]
    ACCOUNT_KEY = os.environ["COSMOS_GREMLIN_KEY"]
    
  4. Use Client to connect using the account's credentials and the GraphSON 2.0 serializer.

    client = client.Client(
        url=f"wss://{ACCOUNT_NAME}.gremlin.cosmos.azure.com:443/",
        traversal_source="g",
        username="/dbs/cosmicworks/colls/products",
        password=f"{ACCOUNT_KEY}",
        message_serializer=serializer.GraphSONSerializersV2d0(),
    )
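
The connection values follow a fixed pattern: the URL is derived from the account name, and the username is a resource path made of the database and graph names. The sketch below collects that pattern in one place. The function build_client_settings is a hypothetical helper, and its default database and graph names are simply the ones used in this quickstart.

```python
def build_client_settings(account_name, account_key,
                          database="cosmicworks", graph="products"):
    """Return keyword arguments matching the connection values shown above."""
    return {
        # WebSocket endpoint derived from the account name.
        "url": f"wss://{account_name}.gremlin.cosmos.azure.com:443/",
        "traversal_source": "g",
        # The username identifies the database and graph to connect to.
        "username": f"/dbs/{database}/colls/{graph}",
        "password": account_key,
    }
```

The returned dictionary could then be unpacked into the constructor, for example client.Client(**settings, message_serializer=serializer.GraphSONSerializersV2d0()).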
    

Create vertices

Now that the application is connected to the account, use the standard Gremlin syntax to create vertices.

  1. Use submit to run a command server-side on the API for Gremlin account. Create a product vertex with the following properties:

    Property    Value
    label       product
    id          68719518371
    name        Kiama classic surfboard
    price       285.55
    category    surfboards

    client.submit(
        message=(
            "g.addV('product')"
            ".property('id', prop_id)"
            ".property('name', prop_name)"
            ".property('price', prop_price)"
            ".property('category', prop_partition_key)"
        ),
        bindings={
            "prop_id": "68719518371",
            "prop_name": "Kiama classic surfboard",
            "prop_price": 285.55,
            "prop_partition_key": "surfboards",
        },
    )
    
  2. Create a second product vertex with these properties:

    Property    Value
    label       product
    id          68719518403
    name        Montau Turtle Surfboard
    price       600.00
    category    surfboards

    client.submit(
        message=(
            "g.addV('product')"
            ".property('id', prop_id)"
            ".property('name', prop_name)"
            ".property('price', prop_price)"
            ".property('category', prop_partition_key)"
        ),
        bindings={
            "prop_id": "68719518403",
            "prop_name": "Montau Turtle Surfboard",
            "prop_price": 600.00,
            "prop_partition_key": "surfboards",
        },
    )
    
  3. Create a third product vertex with these properties:

    Property    Value
    label       product
    id          68719518409
    name        Bondi Twin Surfboard
    price       585.50
    category    surfboards

    client.submit(
        message=(
            "g.addV('product')"
            ".property('id', prop_id)"
            ".property('name', prop_name)"
            ".property('price', prop_price)"
            ".property('category', prop_partition_key)"
        ),
        bindings={
            "prop_id": "68719518409",
            "prop_name": "Bondi Twin Surfboard",
            "prop_price": 585.50,
            "prop_partition_key": "surfboards",
        },
    )
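
The three submit calls above differ only in their bindings, so the query text and bindings can be generated from a dictionary of properties. This is a minimal sketch, not part of the gremlinpython API: build_add_vertex is a hypothetical helper, and it names each binding prop_<key> (so the partition key binding becomes prop_category rather than prop_partition_key).

```python
def build_add_vertex(label, properties):
    """Return (query, bindings) for an addV call with one binding per property."""
    # The label and property keys are embedded in the query text,
    # so only pass trusted values for them. Values travel as bindings.
    query = f"g.addV('{label}')"
    bindings = {}
    for key, value in properties.items():
        binding_name = f"prop_{key}"
        query += f".property('{key}', {binding_name})"
        bindings[binding_name] = value
    return query, bindings


query, bindings = build_add_vertex(
    "product",
    {
        "id": "68719518371",
        "name": "Kiama classic surfboard",
        "price": 285.55,
        "category": "surfboards",
    },
)
```

The pair could then be passed along as client.submit(message=query, bindings=bindings). Keeping values in bindings rather than formatting them into the query string avoids quoting problems in names and other free text.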
    

Create edges

Create edges using the Gremlin syntax to define relationships between vertices.

  1. Create an edge from the Montau Turtle Surfboard product named replaces to the Kiama classic surfboard product.

    client.submit(
        message=(
            "g.V([prop_partition_key, prop_source_id])"
            ".addE('replaces')"
            ".to(g.V([prop_partition_key, prop_target_id]))"
        ),
        bindings={
            "prop_partition_key": "surfboards",
            "prop_source_id": "68719518403",
            "prop_target_id": "68719518371",
        },
    )
    

    Tip

    This edge definition uses the g.V(['<partition-key>', '<id>']) syntax. Alternatively, you can use g.V('<id>').has('category', '<partition-key>').

  2. Create another replaces edge from the same product to the Bondi Twin Surfboard.

    client.submit(
        message=(
            "g.V([prop_partition_key, prop_source_id])"
            ".addE('replaces')"
            ".to(g.V([prop_partition_key, prop_target_id]))"
        ),
        bindings={
            "prop_partition_key": "surfboards",
            "prop_source_id": "68719518403",
            "prop_target_id": "68719518409",
        },
    )
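
Both edge queries share the same text and differ only in the target id. The sketch below wraps that pattern in a hypothetical helper, build_add_edge, which assumes the source and target vertices live in the same partition, as they do in this quickstart.

```python
def build_add_edge(label, partition_key, source_id, target_id):
    """Return (query, bindings) for an edge between two vertices in one partition."""
    # The edge label is embedded in the query text; pass only trusted values.
    query = (
        "g.V([prop_partition_key, prop_source_id])"
        f".addE('{label}')"
        ".to(g.V([prop_partition_key, prop_target_id]))"
    )
    bindings = {
        "prop_partition_key": partition_key,
        "prop_source_id": source_id,
        "prop_target_id": target_id,
    }
    return query, bindings


# Montau Turtle Surfboard replaces the Kiama classic surfboard.
edge_query, edge_bindings = build_add_edge(
    "replaces", "surfboards", "68719518403", "68719518371"
)
```

As with the vertex calls, the result could be submitted with client.submit(message=edge_query, bindings=edge_bindings).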
    

Query vertices & edges

Use the Gremlin syntax to traverse the graph and discover relationships between vertices.

  1. Traverse the graph and find all vertices that Montau Turtle Surfboard replaces.

    result = client.submit(
        message=(
            "g.V().hasLabel('product')"
            ".has('category', prop_partition_key)"
            ".has('name', prop_name)"
            ".outE('replaces').inV()"
        ),
        bindings={
            "prop_partition_key": "surfboards",
            "prop_name": "Montau Turtle Surfboard",
        },
    )
    
  2. Write to the console the result of this traversal.

    # submit returns a ResultSet; all().result() waits for and collects every item
    print(result.all().result())
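
The raw traversal output can be hard to read. The sketch below assumes each returned vertex is a dictionary with id, label, and a properties map whose values are lists of entries carrying a value field, which is the shape commonly seen with the GraphSON 2.0 serializer. That shape is an assumption, the sample data is made up for illustration, and flatten_vertex is a hypothetical helper.

```python
def flatten_vertex(vertex):
    """Collapse a vertex dictionary into a flat {name: value} dictionary."""
    flat = {"id": vertex.get("id"), "label": vertex.get("label")}
    for name, entries in vertex.get("properties", {}).items():
        values = [entry["value"] for entry in entries]
        # Single-valued properties become scalars; multi-valued ones stay lists.
        flat[name] = values[0] if len(values) == 1 else values
    return flat


# Hypothetical sample in the assumed shape (not real service output).
sample = {
    "id": "68719518371",
    "label": "product",
    "type": "vertex",
    "properties": {
        "name": [{"id": "p1", "value": "Kiama classic surfboard"}],
        "price": [{"id": "p2", "value": 285.55}],
    },
}
print(flatten_vertex(sample))
```

With a live connection, the list of vertices could come from result.all().result(), and each item could be passed through flatten_vertex before printing.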
    

Run the code

Run the application to validate that it works as expected. It should execute with no errors or warnings, and its output includes data about the created and queried items.

  1. Open the terminal in the Python project folder.

  2. Use python <filename> to run the application. Observe the output from the application.

    python app.py
    

Clean up resources

When you no longer need the API for Gremlin account, delete the corresponding resource group.

  1. Create a shell variable for resourceGroupName if it doesn't already exist.

    # Variable for resource group name
    resourceGroupName="msdocs-cosmos-gremlin-quickstart"
    
  2. Use az group delete to delete the resource group.

    az group delete \
        --name $resourceGroupName
    

Next step