Quickstart: Create, query, and traverse an Azure Cosmos DB graph database using the Gremlin console
APPLIES TO: Gremlin
Azure Cosmos DB is Microsoft's globally distributed multi-model database service. You can quickly create and query document, key/value, and graph databases, all of which benefit from the global distribution and horizontal scale capabilities at the core of Azure Cosmos DB.
This quickstart demonstrates how to create an Azure Cosmos DB Gremlin API account, database, and graph (container) using the Azure portal, and then use the Gremlin Console from Apache TinkerPop to work with Gremlin API data. In this tutorial, you create vertices and edges, update a vertex property, query vertices, traverse the graph, and drop a vertex.
The Gremlin console is Groovy/Java based and runs on Linux, Mac, and Windows. You can download it from the Apache TinkerPop site.
Prerequisites
You need to have an Azure subscription to create an Azure Cosmos DB account for this quickstart.
If you don't have an Azure subscription, create an Azure free account before you begin.
You also need to install the Gremlin Console. The recommended version is v3.4.13. To use the Gremlin Console on Windows, you also need to install the Java Runtime. Java 8 is the minimum requirement, but Java 11 is preferable.
Create a database account
In a new browser window, sign in to the Azure portal.
In the left menu, select Create a resource.
On the New page, select Databases > Azure Cosmos DB.
On the Create Azure Cosmos DB Account page, enter the settings for the new Azure Cosmos DB account.
- Setting: Subscription. Value: Subscription name. Description: Select the Azure subscription that you want to use for this Azure Cosmos DB account.
- Setting: Resource Group. Value: Resource group name. Description: Select a resource group, or select Create new, then enter a unique name for the new resource group.
- Setting: Account Name. Value: Enter a unique name. Description: Enter a unique name to identify your Azure Cosmos DB account. Your account URI will be gremlin.cosmos.azure.com appended to your unique account name. The account name can use only lowercase letters, numbers, and hyphens (-), and must be between 3 and 44 characters long.
- Setting: API. Value: Gremlin (graph). Description: The API determines the type of account to create. Azure Cosmos DB provides five APIs: NoSQL for document databases, Gremlin for graph databases, MongoDB for document databases, Azure Table, and Cassandra. You must create a separate account for each API. Select Gremlin (graph), because in this quickstart you are creating a graph that works with the API for Gremlin. Learn more about the API for Gremlin.
- Setting: Location. Value: The region closest to your users. Description: Select a geographic location to host your Azure Cosmos DB account. Use the location that is closest to your users to give them the fastest access to the data.
- Setting: Capacity mode. Value: Provisioned throughput or Serverless. Description: Select Provisioned throughput to create an account in provisioned throughput mode. Select Serverless to create an account in serverless mode.
- Setting: Apply Azure Cosmos DB free tier discount. Value: Apply or Do not apply. Description: With Azure Cosmos DB free tier, you get the first 1000 RU/s and 25 GB of storage for free in an account. Learn more about free tier.
Note
You can have at most one free tier Azure Cosmos DB account per Azure subscription, and you must opt in when creating the account. If you do not see the option to apply the free tier discount, another account in the subscription has already been enabled with free tier.
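The account name rule above (lowercase letters, numbers, and hyphens only, 3 to 44 characters) can be checked locally before you submit the form. The following is a minimal Python sketch of that check. The helper name is_valid_account_name is hypothetical and is not part of any Azure SDK, and the check does not verify that the name is globally unique.

```python
import re

# Rule quoted from the account settings above: lowercase letters, digits,
# and hyphens only, with a total length between 3 and 44 characters.
_ACCOUNT_NAME = re.compile(r"[a-z0-9-]{3,44}")


def is_valid_account_name(name: str) -> bool:
    """Hypothetical helper: True if `name` meets the character and length rule."""
    return _ACCOUNT_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("my-graph-account", "My_Account", "ab"):
        print(candidate, is_valid_account_name(candidate))
```

The portal performs its own validation, so this kind of check is only useful when account names are generated by a script.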
In the Global Distribution tab, configure the following details. You can leave the default values for the purpose of this quickstart:
- Setting: Geo-Redundancy. Value: Disable. Description: Enable or disable global distribution on your account by pairing your region with a pair region. You can add more regions to your account later.
- Setting: Multi-region Writes. Value: Disable. Description: Multi-region writes capability allows you to take advantage of the provisioned throughput for your databases and containers across the globe.
Note
The following options are not available if you select Serverless as the Capacity mode:
- Apply Free Tier Discount
- Geo-redundancy
- Multi-region Writes
Optionally, you can configure additional details in the following tabs:
- Networking - Configure access from a virtual network.
- Backup Policy - Configure either periodic or continuous backup policy.
- Encryption - Use either service-managed key or a customer-managed key.
- Tags - Tags are name/value pairs that enable you to categorize resources and view consolidated billing by applying the same tag to multiple resources and resource groups.
Select Review + create.
The account creation takes a few minutes. Wait for the portal to display the Congratulations! Your Azure Cosmos DB account was created page.
Add a graph
You can now use the Data Explorer tool in the Azure portal to create a graph database.
Select Data Explorer > New Graph.
The Add Graph area is displayed on the far right. You may need to scroll right to see it.
On the Add Graph page, enter the settings for the new graph.
- Setting: Database ID. Suggested value: sample-database. Description: Enter sample-database as the name for the new database. Database names must be between 1 and 255 characters, and can't contain / \ # ? or a trailing space.
- Setting: Throughput. Suggested value: 400 RU/s. Description: Change the throughput to 400 request units per second (RU/s). If you want to reduce latency, you can scale up the throughput later. If you chose serverless capacity mode, then throughput isn't required.
- Setting: Graph ID. Suggested value: sample-graph. Description: Enter sample-graph as the name for your new collection. Graph names have the same character requirements as database IDs.
- Setting: Partition Key. Suggested value: /pk. Description: All Azure Cosmos DB accounts need a partition key to horizontally scale. Learn how to select an appropriate partition key in the Graph Data Partitioning article.
Once the form is filled out, select OK.
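The naming rule for database and graph IDs (1 to 255 characters, none of / \ # ?, and no trailing space) can be expressed as a short check. This is a minimal Python sketch of the rule as stated above. The function name is_valid_database_id is hypothetical, and the service may enforce additional restrictions that this quickstart does not list.

```python
# Characters that the quickstart says a database or graph ID can't contain.
FORBIDDEN_CHARS = set("/\\#?")


def is_valid_database_id(name: str) -> bool:
    """Hypothetical helper: True if `name` meets the ID rule described above."""
    if not 1 <= len(name) <= 255:
        return False
    if name.endswith(" "):
        return False
    return not any(ch in FORBIDDEN_CHARS for ch in name)


if __name__ == "__main__":
    for candidate in ("sample-database", "sample-graph", "bad/name", "trailing "):
        print(repr(candidate), is_valid_database_id(candidate))
```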
Connect to your app service/Graph
Before starting the Gremlin Console, create or modify the remote-secure.yaml configuration file in the apache-tinkerpop-gremlin-console-3.4.13/conf directory (the directory name reflects the console version you installed).
Fill in your host, port, username, password, connectionPool, and serializer configurations as defined in the following table:
- Setting: hosts. Suggested value: [account-name.gremlin.cosmos.azure.com]. Description: This is the Gremlin URI value on the Overview page of the Azure portal, in square brackets, with the trailing :443/ removed. Note: Be sure to use the Gremlin value, and not the URI that ends with [account-name.documents.azure.com], which would likely result in a "Host did not respond in a timely fashion" exception when attempting to execute Gremlin queries later.
- Setting: port. Suggested value: 443. Description: Set to 443.
- Setting: username. Suggested value: Your username. Description: The resource of the form /dbs/<db>/colls/<coll>, where <db> is your database name and <coll> is your collection name.
- Setting: password. Suggested value: Your primary key. Description: This is your primary key, which you can retrieve from the Keys page of the Azure portal, in the Primary Key box. Use the copy button on the left side of the box to copy the value.
- Setting: connectionPool. Suggested value: {enableSsl: true}. Description: Your connection pool setting for TLS.
- Setting: serializer. Suggested value: { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV2d0, config: { serializeResultToString: true }}. Description: Set to this value and delete any \n line breaks when pasting in the value.
For the hosts value, copy the Gremlin URI value from the Overview page.
For the password value, copy the Primary key from the Keys page.
Your remote-secure.yaml file should look like this:
hosts: [your_database_server.gremlin.cosmos.azure.com]
port: 443
username: /dbs/your_database/colls/your_collection
password: your_primary_key
connectionPool: { enableSsl: true }
serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV2d0, config: { serializeResultToString: true }}
Make sure to wrap the value of the hosts parameter in square brackets [].
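If you prefer to generate the file rather than edit it by hand, the mapping from portal values to remote-secure.yaml can be sketched in Python as follows. This is an illustrative sketch, not an official tool. The function names are hypothetical, and the example assumes the Gremlin URI copied from the portal looks like wss://account-name.gremlin.cosmos.azure.com:443/ (the wss:// scheme is an assumption; the quickstart only says to remove the trailing :443/).

```python
from urllib.parse import urlparse

# Serializer value copied from the quickstart, kept on a single line.
SERIALIZER = (
    "{ className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV2d0, "
    "config: { serializeResultToString: true }}"
)


def host_from_gremlin_uri(uri: str) -> str:
    """Hypothetical helper: strip the scheme, port, and trailing slash from a Gremlin URI."""
    # urlparse only recognizes the host part when a scheme or a leading // is present.
    if "://" not in uri:
        uri = "//" + uri
    host = urlparse(uri).hostname
    if not host:
        raise ValueError(f"could not find a host name in {uri!r}")
    return host


def build_remote_secure_yaml(gremlin_uri: str, database: str,
                             collection: str, primary_key: str) -> str:
    """Hypothetical helper: return the text of remote-secure.yaml for the given values."""
    lines = [
        f"hosts: [{host_from_gremlin_uri(gremlin_uri)}]",  # host wrapped in brackets
        "port: 443",
        f"username: /dbs/{database}/colls/{collection}",
        f"password: {primary_key}",
        "connectionPool: { enableSsl: true }",
        f"serializer: {SERIALIZER}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_remote_secure_yaml(
        "wss://your_database_server.gremlin.cosmos.azure.com:443/",
        "your_database", "your_collection", "your_primary_key"))
```

Writing the returned text to conf/remote-secure.yaml gives the same layout as the file shown above. Treat the primary key as a secret and avoid committing the generated file to source control.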
In your terminal, run bin/gremlin.bat or bin/gremlin.sh to start the Gremlin Console.
In the Gremlin Console, run the following command to connect to your app service:
:remote connect tinkerpop.server conf/remote-secure.yaml
Tip
If you receive the error No appenders could be found for logger, ensure that you updated the serializer value in the remote-secure.yaml file as described earlier in this section. If your configuration is correct, this warning can be safely ignored because it should not affect the use of the console.
Next, run the following command to redirect all console commands to the remote server:
:remote console
Note
If you don't run the :remote console command but would like to redirect all console commands to the remote server, prefix each command with :>. For example, run the command as :> g.V().count(). This prefix is part of the command, and it is important when using the Gremlin Console with Azure Cosmos DB. Omitting the prefix instructs the console to execute the command locally, often against an in-memory graph. Using the :> prefix tells the console to execute a remote command, in this case against Azure Cosmos DB (either the localhost emulator or an Azure instance).
Great! Now that we've finished the setup, let's start running some console commands.
Let's try a simple count() command. Type the following into the console at the prompt:
g.V().count()
Create vertices and edges
Let's begin by adding five person vertices for Thomas, Mary Kay, Robin, Ben, and Jack.
Input (Thomas):
g.addV('person').property('firstName', 'Thomas').property('lastName', 'Andersen').property('age', 44).property('userid', 1).property('pk', 'pk')
Output:
==>[id:796cdccc-2acd-4e58-a324-91d6f6f5ed6d,label:person,type:vertex,properties:[firstName:[[id:f02a749f-b67c-4016-850e-910242d68953,value:Thomas]],lastName:[[id:f5fa3126-8818-4fda-88b0-9bb55145ce5c,value:Andersen]],age:[[id:f6390f9c-e563-433e-acbf-25627628016e,value:44]],userid:[[id:796cdccc-2acd-4e58-a324-91d6f6f5ed6d|userid,value:1]]]]
Input (Mary Kay):
g.addV('person').property('firstName', 'Mary Kay').property('lastName', 'Andersen').property('age', 39).property('userid', 2).property('pk', 'pk')
Output:
==>[id:0ac9be25-a476-4a30-8da8-e79f0119ea5e,label:person,type:vertex,properties:[firstName:[[id:ea0604f8-14ee-4513-a48a-1734a1f28dc0,value:Mary Kay]],lastName:[[id:86d3bba5-fd60-4856-9396-c195ef7d7f4b,value:Andersen]],age:[[id:bc81b78d-30c4-4e03-8f40-50f72eb5f6da,value:39]],userid:[[id:0ac9be25-a476-4a30-8da8-e79f0119ea5e|userid,value:2]]]]
Input (Robin):
g.addV('person').property('firstName', 'Robin').property('lastName', 'Wakefield').property('userid', 3).property('pk', 'pk')
Output:
==>[id:8dc14d6a-8683-4a54-8d74-7eef1fb43a3e,label:person,type:vertex,properties:[firstName:[[id:ec65f078-7a43-4cbe-bc06-e50f2640dc4e,value:Robin]],lastName:[[id:a3937d07-0e88-45d3-a442-26fcdfb042ce,value:Wakefield]],userid:[[id:8dc14d6a-8683-4a54-8d74-7eef1fb43a3e|userid,value:3]]]]
Input (Ben):
g.addV('person').property('firstName', 'Ben').property('lastName', 'Miller').property('userid', 4).property('pk', 'pk')
Output:
==>[id:ee86b670-4d24-4966-9a39-30529284b66f,label:person,type:vertex,properties:[firstName:[[id:a632469b-30fc-4157-840c-b80260871e9a,value:Ben]],lastName:[[id:4a08d307-0719-47c6-84ae-1b0b06630928,value:Miller]],userid:[[id:ee86b670-4d24-4966-9a39-30529284b66f|userid,value:4]]]]
Input (Jack):
g.addV('person').property('firstName', 'Jack').property('lastName', 'Connor').property('userid', 5).property('pk', 'pk')
Output:
==>[id:4c835f2a-ea5b-43bb-9b6b-215488ad8469,label:person,type:vertex,properties:[firstName:[[id:4250824e-4b72-417f-af98-8034aa15559f,value:Jack]],lastName:[[id:44c1d5e1-a831-480a-bf94-5167d133549e,value:Connor]],userid:[[id:4c835f2a-ea5b-43bb-9b6b-215488ad8469|userid,value:5]]]]
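The five addV commands above share one shape: a label followed by a chain of property steps. If you need to load more sample people, the command text can be generated. The following Python sketch only builds the command strings for pasting into the console and does not talk to Azure Cosmos DB. The helper names are hypothetical, and only string and integer property values are handled.

```python
def gremlin_literal(value) -> str:
    """Hypothetical helper: render a Python str or int as a Gremlin (Groovy) literal."""
    if isinstance(value, bool):
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(value, str):
        # Escape backslashes and single quotes for a single-quoted Groovy string.
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"
    if isinstance(value, int):
        return str(value)
    raise TypeError(f"unsupported property type: {type(value).__name__}")


def build_add_vertex(label: str, properties: dict) -> str:
    """Hypothetical helper: build an addV command in the style used in this quickstart."""
    steps = [f"g.addV({gremlin_literal(label)})"]
    for key, value in properties.items():
        steps.append(f".property({gremlin_literal(key)}, {gremlin_literal(value)})")
    return "".join(steps)


if __name__ == "__main__":
    thomas = {"firstName": "Thomas", "lastName": "Andersen",
              "age": 44, "userid": 1, "pk": "pk"}
    print(build_add_vertex("person", thomas))
```

For the Thomas properties, the generated text matches the Input (Thomas) command shown earlier in this section.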
Next, let's add edges for relationships between our people.
Input (Thomas -> Mary Kay):
g.V().hasLabel('person').has('firstName', 'Thomas').addE('knows').to(g.V().hasLabel('person').has('firstName', 'Mary Kay'))
Output:
==>[id:c12bf9fb-96a1-4cb7-a3f8-431e196e702f,label:knows,type:edge,inVLabel:person,outVLabel:person,inV:0d1fa428-780c-49a5-bd3a-a68d96391d5c,outV:1ce821c6-aa3d-4170-a0b7-d14d2a4d18c3]
Input (Thomas -> Robin):
g.V().hasLabel('person').has('firstName', 'Thomas').addE('knows').to(g.V().hasLabel('person').has('firstName', 'Robin'))
Output:
==>[id:58319bdd-1d3e-4f17-a106-0ddf18719d15,label:knows,type:edge,inVLabel:person,outVLabel:person,inV:3e324073-ccfc-4ae1-8675-d450858ca116,outV:1ce821c6-aa3d-4170-a0b7-d14d2a4d18c3]
Input (Robin -> Ben):
g.V().hasLabel('person').has('firstName', 'Robin').addE('knows').to(g.V().hasLabel('person').has('firstName', 'Ben'))
Output:
==>[id:889c4d3c-549e-4d35-bc21-a3d1bfa11e00,label:knows,type:edge,inVLabel:person,outVLabel:person,inV:40fd641d-546e-412a-abcc-58fe53891aab,outV:3e324073-ccfc-4ae1-8675-d450858ca116]
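Each edge command above follows the same pattern: select the source person by first name, add a knows edge, and point it at a second traversal that selects the target person. A small Python sketch can generate these command strings from a list of name pairs. The helper names are hypothetical, the sketch assumes first names contain no quote characters, and it only produces text for the console.

```python
def person_by_first_name(first_name: str) -> str:
    """Hypothetical helper: traversal text that selects a person vertex by first name."""
    return f"g.V().hasLabel('person').has('firstName', '{first_name}')"


def build_knows_edge(source: str, target: str) -> str:
    """Hypothetical helper: addE command text for a 'knows' edge from source to target."""
    return f"{person_by_first_name(source)}.addE('knows').to({person_by_first_name(target)})"


if __name__ == "__main__":
    # The three relationships created in this quickstart.
    for src, dst in [("Thomas", "Mary Kay"), ("Thomas", "Robin"), ("Robin", "Ben")]:
        print(build_knows_edge(src, dst))
```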
Update a vertex
Let's update the Thomas vertex with a new age of 45.
Input:
g.V().hasLabel('person').has('firstName', 'Thomas').property('age', 45)
Output:
==>[id:ae36f938-210e-445a-92df-519f2b64c8ec,label:person,type:vertex,properties:[firstName:[[id:872090b6-6a77-456a-9a55-a59141d4ebc2,value:Thomas]],lastName:[[id:7ee7a39a-a414-4127-89b4-870bc4ef99f3,value:Andersen]],age:[[id:a2a75d5a-ae70-4095-806d-a35abcbfe71d,value:45]]]]
Query your graph
Now, let's run a variety of queries against your graph.
First, let's try a query with a filter to return only people who are older than 40.
Input (filter query):
g.V().hasLabel('person').has('age', gt(40))
Output:
==>[id:ae36f938-210e-445a-92df-519f2b64c8ec,label:person,type:vertex,properties:[firstName:[[id:872090b6-6a77-456a-9a55-a59141d4ebc2,value:Thomas]],lastName:[[id:7ee7a39a-a414-4127-89b4-870bc4ef99f3,value:Andersen]],age:[[id:a2a75d5a-ae70-4095-806d-a35abcbfe71d,value:45]]]]
Next, let's project the first name of the people who are older than 40.
Input (filter + projection query):
g.V().hasLabel('person').has('age', gt(40)).values('firstName')
Output:
==>Thomas
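To see why only Thomas is returned, it can help to model the filter and projection outside the database. The following Python sketch mimics the two steps on an in-memory copy of the sample data (after the age update). It illustrates the semantics only and is not how Azure Cosmos DB evaluates the query. The function names older_than and values are hypothetical.

```python
# In-memory copy of the person vertices created in this quickstart.
# Robin, Ben, and Jack were created without an age property.
people = [
    {"firstName": "Thomas", "lastName": "Andersen", "age": 45},
    {"firstName": "Mary Kay", "lastName": "Andersen", "age": 39},
    {"firstName": "Robin", "lastName": "Wakefield"},
    {"firstName": "Ben", "lastName": "Miller"},
    {"firstName": "Jack", "lastName": "Connor"},
]


def older_than(vertices, min_age):
    """Rough analogue of has('age', gt(min_age)): vertices without an age are dropped."""
    return [v for v in vertices if "age" in v and v["age"] > min_age]


def values(vertices, key):
    """Rough analogue of values(key): project one property from each vertex."""
    return [v[key] for v in vertices if key in v]


if __name__ == "__main__":
    print(values(older_than(people, 40), "firstName"))  # ['Thomas']
```

Mary Kay is excluded because 39 is not greater than 40, and the three people without an age property never pass the filter.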
Traverse your graph
Let's traverse the graph to return all of Thomas's friends.
Input (friends of Thomas):
g.V().hasLabel('person').has('firstName', 'Thomas').outE('knows').inV().hasLabel('person')
Output:
==>[id:f04bc00b-cb56-46c4-a3bb-a5870c42f7ff,label:person,type:vertex,properties:[firstName:[[id:14feedec-b070-444e-b544-62be15c7167c,value:Mary Kay]],lastName:[[id:107ab421-7208-45d4-b969-bbc54481992a,value:Andersen]],age:[[id:4b08d6e4-58f5-45df-8e69-6b790b692e0a,value:39]]]]
==>[id:91605c63-4988-4b60-9a30-5144719ae326,label:person,type:vertex,properties:[firstName:[[id:f760e0e6-652a-481a-92b0-1767d9bf372e,value:Robin]],lastName:[[id:352a4caa-bad6-47e3-a7dc-90ff342cf870,value:Wakefield]]]]
Next, let's get the next layer of vertices. Traverse the graph to return all the friends of Thomas's friends.
Input (friends of friends of Thomas):
g.V().hasLabel('person').has('firstName', 'Thomas').outE('knows').inV().hasLabel('person').outE('knows').inV().hasLabel('person')
Output:
==>[id:a801a0cb-ee85-44ee-a502-271685ef212e,label:person,type:vertex,properties:[firstName:[[id:b9489902-d29a-4673-8c09-c2b3fe7f8b94,value:Ben]],lastName:[[id:e084f933-9a4b-4dbc-8273-f0171265cf1d,value:Miller]]]]
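The two traversals above each follow outgoing knows edges, once for friends and twice for friends of friends. The Python sketch below models that with a plain adjacency dictionary built from the three edges created earlier. It is an in-memory illustration of the traversal logic, not a client for Azure Cosmos DB, and the out_knows helper is hypothetical.

```python
# Outgoing 'knows' edges created in this quickstart, keyed by first name.
knows = {
    "Thomas": ["Mary Kay", "Robin"],
    "Robin": ["Ben"],
}


def out_knows(names):
    """Rough analogue of outE('knows').inV(): follow one hop of outgoing edges."""
    reached = []
    for name in names:
        reached.extend(knows.get(name, []))
    return reached


if __name__ == "__main__":
    friends = out_knows(["Thomas"])
    friends_of_friends = out_knows(friends)
    print(friends)             # ['Mary Kay', 'Robin']
    print(friends_of_friends)  # ['Ben']
```

Mary Kay has no outgoing knows edges, so the second hop only reaches Ben through Robin, which matches the single vertex in the output above.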
Drop a vertex
Let's now delete a vertex from the graph database.
Input (drop Jack vertex):
g.V().hasLabel('person').has('firstName', 'Jack').drop()
Clear your graph
Finally, let's clear the database of all vertices and edges.
Input:
g.E().drop()
g.V().drop()
Congratulations! You've completed this Azure Cosmos DB: Gremlin API tutorial!
Review SLAs in the Azure portal
The Azure portal monitors your Azure Cosmos DB account throughput, storage, availability, latency, and consistency. Charts for metrics associated with an Azure Cosmos DB Service Level Agreement (SLA) show the SLA value compared to actual performance. This suite of metrics makes monitoring your SLAs transparent.
To review metrics and SLAs:
Select Metrics in your Azure Cosmos DB account's navigation menu.
Select a tab such as Latency, and select a timeframe on the right. Compare the Actual and SLA lines on the charts.
Review the metrics on the other tabs.
Clean up resources
When you're done with your app and Azure Cosmos DB account, you can delete the Azure resources you created so you don't incur more charges. To delete the resources:
In the Azure portal Search bar, search for and select Resource groups.
From the list, select the resource group you created for this quickstart.
On the resource group Overview page, select Delete resource group.
In the next window, enter the name of the resource group to delete, and then select Delete.
Next steps
In this quickstart, you've learned how to create an Azure Cosmos DB account, create a graph using the Data Explorer, create vertices and edges, and traverse your graph using the Gremlin console. You can now build more complex queries and implement powerful graph traversal logic using Gremlin.