This feature is currently in preview. The Supplemental Terms of Use for Microsoft Azure Previews include additional legal terms that apply to Azure features that are in beta, in preview, or otherwise not yet released into general availability.
In this tutorial, learn to programmatically interact with Microsoft Purview using the GraphQL API. For more information about GraphQL in general, see this introduction to GraphQL.
Using GraphQL is similar to using the REST APIs, in that you send a JSON payload to a service endpoint. However, GraphQL lets you retrieve complete information in a single fetch, eliminating the need for multiple API calls.
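As a minimal sketch of the JSON payload described above, the Python snippet below wraps a trimmed version of the multi-GUID query from this tutorial in the conventional GraphQL-over-HTTP request body (a `query` string plus optional `variables`). The helper name `build_payload` is hypothetical. The endpoint URL and authentication are deliberately left out because they depend on your Purview account.

```python
import json

# Trimmed version of the multi-GUID query used in this tutorial.
ENTITIES_QUERY = """
query {
  entities(where: { guid: ["<guid1>", "<guid2>"] }) {
    guid
    typeName
  }
}
"""


def build_payload(query, variables=None):
    """Return the JSON request body for a GraphQL call (hypothetical helper)."""
    body = {"query": query}
    if variables:
        body["variables"] = variables
    return json.dumps(body)


payload = build_payload(ENTITIES_QUERY)
print(payload)
```

The resulting string is what you would send as the body of an HTTP POST to the service endpoint.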
With its introspection feature, the GraphQL API becomes self-descriptive, enabling clients to retrieve schema details such as available queries, types, and query parameters. See more about introspection.
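To illustrate introspection, here is a hedged Python sketch that builds a request body from the standard GraphQL `__schema` introspection query and extracts type names from a response. The `sample` response is hand-written for illustration and is not real Purview output. The helper `type_names` is hypothetical.

```python
import json

# Standard GraphQL introspection query: root query type plus all type names.
INTROSPECTION_QUERY = """
query {
  __schema {
    queryType { name }
    types { name kind }
  }
}
"""


def type_names(response):
    """Extract the type names from a parsed introspection response."""
    return [t["name"] for t in response["data"]["__schema"]["types"]]


# Hand-written sample response, for illustration only (not real service output).
sample = {
    "data": {
        "__schema": {
            "queryType": {"name": "Query"},
            "types": [
                {"name": "Query", "kind": "OBJECT"},
                {"name": "String", "kind": "SCALAR"},
            ],
        }
    }
}

request_body = json.dumps({"query": INTROSPECTION_QUERY})
print(type_names(sample))
```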
query {
    entities(where: { guid: ["<guid1>", "<guid2>"] }) { # Values in the array are combined as a logical-OR.
        guid
        createTime
        updateTime
        typeName
        attributes
        name
        qualifiedName
        description
    }
}
query {
    entities(where: { guid: "<guid>" }) {
        guid
        typeName
        attributes
        assignedTerms {
            confidence
            createdBy
            description
            expression
            steward
            source
            status
            term {
                qualifiedName
                name
                shortDescription
                longDescription
            }
        }
    }
}
- with filtered glossary terms
query {
    entities(where: { guid: "<guid>" }) {
        guid
        typeName
        attributes
        assignedTerms {
            confidence
            createdBy
            description
            expression
            steward
            source
            status
            term {
                qualifiedName
                name
                shortDescription
                longDescription
            }
        }
    }
}
Filtering (preview)
Exact matching on GUID and qualified name, as used in the examples in the 'basic queries' section, has guaranteed performance. Other filtering patterns have some limitations:
Filtering on non-indexed fields: Fields other than GUID and qualified name are currently not indexed (see the examples in the "Simple filter" section). Filtering on non-indexed fields without also specifying a GUID or qualified name results in a table scan and might cause performance issues on large datasets.
Nested filtering: Like filtering on non-indexed fields, nested filtering can cause table scans, which might cause performance issues on large datasets. An example is finding an entity by a linked classification, term, or related entity.
Despite these limitations, this call pattern is superior to client-side filtering and is currently used by our internal client.
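One way to act on the limitations above is to check a filter on the client before sending it. The Python sketch below is illustrative only: the helper `uses_indexed_field` is hypothetical, and the key names `guid`, `qualifiedName`, and `name` are assumptions about how the `where` argument spells these fields.

```python
# Fields this tutorial describes as indexed. The key spellings are assumptions.
INDEXED_FIELDS = {"guid", "qualifiedName"}


def uses_indexed_field(where):
    """Return True if the top-level filter constrains at least one indexed field."""
    return any(key in INDEXED_FIELDS for key in where)


# A filter on guid is expected to avoid a table scan.
print(uses_indexed_field({"guid": "<guid>"}))
# A filter on name alone (hypothetical shape) would likely trigger a table scan.
print(uses_indexed_field({"name": "<name>"}))
```

A check like this only inspects top-level keys. It does not detect nested filters, which can also cause table scans.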
Queries are constrained by the cost of query execution. The maximum allowable cost is set at 100 units.
The execution fetch cost is incurred each time the query retrieves related entities, assigned terms, or classifications for a given entity or term.
Example
Consider a query that returns three entities, each with two related entities. The cost is calculated as follows:
One unit for the root query
Three units in total, one for each level-1 entity whose related entities are fetched
The total cost for this query is therefore 1 (root query) + 3 (level-1 entities) = 4.
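The arithmetic in this example can be sketched in Python. The model below is an assumption generalized from the single worked example (one unit for the root query plus one unit per root entity whose related nodes are fetched). It covers only one level of nesting and is not the service's actual cost algorithm. The function name `estimate_cost` is hypothetical.

```python
MAX_COST = 100  # maximum allowable cost stated in this section


def estimate_cost(root_entity_count):
    """Estimate the cost of a query that fetches one nested relation per root entity.

    Assumed model: 1 unit for the root query plus 1 unit per root entity
    whose related nodes are fetched.
    """
    if root_entity_count < 0:
        raise ValueError("root_entity_count must be non-negative")
    return 1 + root_entity_count


cost = estimate_cost(3)
print(cost, cost <= MAX_COST)  # 4 True
```

Under this assumed model, the number of related entities per root entity does not change the cost. Only the number of root entities does.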
Filtering performance
Filtering is currently in preview and has some limitations. See Filtering (preview) for more information.
A GraphQL query begins with a root query that retrieves the top-level nodes, and then recursively fetches the related nodes.
The performance of a nested query is primarily determined by the root query because the related nodes are fetched from a known starting point, similar to a foreign key in SQL.
To optimize performance, it’s crucial to avoid wildcard root queries that could trigger a table scan on the top-level nodes.
For instance, a root query that filters only on the name field could cause performance issues on large datasets because the name field isn't indexed.