Cosmos DB Gremlin API repeat query taking more than 15 seconds to query the data

Question

We are using the Cosmos DB Gremlin API as our graph database. We store contact information between two people (vertices), with some metadata about the interaction saved as edge properties. A person can be connected to any number of people, from a few to thousands. We are also interested in the second-level connections of each first-level connection, so a single lookup can easily involve 5,000 to 10,000 edges. When we query this data with a repeat query up to 2 levels deep, it takes around 15 seconds to fetch the data, even when we increase the allocated RUs or select autoscale. Production will carry a heavier load than this, somewhere around 100,000 vertices and 900,000 edges. Any idea how the query can be improved, or is there any guidance on optimizing performance?

Basic query to get this data:
g.V({id of a person}).repeat(inE().has({filter for edge}).outV().dedup()).times(2).emit().tree()

Answer

@Shardul Pradhan

We checked this further with our Product group, and they advised that the inE() traversal step should be avoided and that the data should be modeled so the query can use outE() instead.
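
As a rough illustration of that advice, here is how the query from the question might look once the traversal follows outgoing edges. This is only a sketch, and it assumes the edges have been re-modeled (or additionally written in the reverse direction) so that they point from the starting person to the contact; that assumption is not something stated in the question. The {id of a person} and {filter for edge} placeholders are kept exactly as in the original query. The reason this can help on a partitioned graph is that Cosmos DB stores each edge with its source vertex, so an outgoing step can be served from the source vertex's partition, while an incoming step may have to fan out across partitions.

```groovy
// Sketch only: assumes edges now point from the person to the contact.
// Same two-level repeat as the original, with inE()/outV() swapped for outE()/inV().
g.V({id of a person}).repeat(outE().has({filter for edge}).inV().dedup()).times(2).emit().tree()
```

The comment lines are for readers; submit only the traversal itself. Whether this brings the query under the 15 seconds mentioned above depends on the actual data and partitioning, so it is worth comparing both versions on a representative data set.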

Our Graph Modeling document explains this recommendation in detail.

Please go through that document and get back to us with any questions, and we will look into this further.

Thanks
Navtej S
