Azure Data Manager for Energy indexing and search workflows

04/28/2024

All data and associated metadata ingested into the platform are indexed to enable search. The metadata is accessible to ensure awareness even when the data isn't available.

Indexer service

The Indexer service provides a mechanism for indexing documents that contain structured and unstructured data.

Note

This service isn't public. It's meant to be called only internally by other core platform services.

Indexing workflow

The following diagram illustrates the indexing workflow:

When a customer loads data into the platform, the associated metadata is ingested using the Storage service. The Storage service provides a set of APIs to manage the entire metadata lifecycle, including ingestion (persistence), modification, deletion, versioning, retrieval, and data schema management. Each storage metadata record created by the Storage service contains a kind parameter that refers to an underlying schema. This schema determines the attributes that the Indexer service indexes.
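As an illustration of the kind parameter, the following Python sketch splits a kind string into its parts. It assumes the OSDU convention of four colon-separated segments (authority, source, entity type, version). The sample record, its field values, and the helper name parse_kind are hypothetical and aren't taken from this article.

```python
from typing import NamedTuple


class Kind(NamedTuple):
    authority: str
    source: str
    entity_type: str
    version: str


def parse_kind(kind: str) -> Kind:
    """Split a kind of the assumed form authority:source:entity-type:version."""
    parts = kind.split(":")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"unexpected kind format: {kind!r}")
    return Kind(*parts)


# Hypothetical metadata record; only the "kind" field matters here.
record = {
    "id": "opendes:master-data--Well:example-001",
    "kind": "osdu:wks:master-data--Well:1.0.0",
    "data": {"FacilityName": "Example Well"},
}

kind = parse_kind(record["kind"])
print(kind.entity_type, kind.version)
```

The parsed kind identifies which schema applies to the record, and therefore which attributes are candidates for indexing.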

When the Storage service creates a metadata record, it raises a recordChangedMessages event that is collected in Azure Service Bus (a message queue). The Indexer queue service pulls the message from Azure Service Bus, performs basic validation, and sends it to the Indexer service. If sending the message to the Indexer service fails, the Indexer queue service retries up to a configurable maximum retry count. If all retry attempts fail, a negative acknowledgment is sent to Azure Service Bus, which then archives the message.
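The retry behavior described above can be sketched roughly as follows. This is a simplified model, not the Indexer queue service's actual code: the function name deliver_with_retry, the send_to_indexer callback, the use of ConnectionError to signal a failed send, and the default retry count are all assumptions made for illustration.

```python
from typing import Callable


def deliver_with_retry(
    message: dict,
    send_to_indexer: Callable[[dict], None],
    max_retries: int = 3,  # assumed default; the real limit is configurable
) -> str:
    """Attempt one delivery plus up to max_retries retries.

    Returns "ack" on success, or "nack" when every attempt failed so the
    queue can archive the message.
    """
    for _ in range(1 + max_retries):
        try:
            send_to_indexer(message)
            return "ack"
        except ConnectionError:
            continue  # try again until the retry budget is used up
    return "nack"


# Hypothetical sender that fails twice and then succeeds.
calls = {"count": 0}


def flaky_send(message: dict) -> None:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("indexer unavailable")


result = deliver_with_retry({"id": "rec-1", "op": "create"}, flaky_send)
print(result, calls["count"])
```

A production implementation would typically also wait between attempts. That detail is omitted here because the article doesn't describe it.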

When the Indexer service receives the recordChangedMessages event, it fetches the required schemas from the schema cache or through the Schema service APIs. The Indexer service then creates a new index within Elasticsearch (if one isn't already present) and sends a bulk query to create or update the records as needed. If Elasticsearch responds with a failure of type service unavailable or request timed out, the Indexer service creates recordChangedMessages for the failed record IDs and puts the message in Azure Service Bus. The Indexer queue service pulls these messages again, and they follow the same flow as before.
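To make the failure handling concrete, the sketch below picks out the record IDs to requeue from an Elasticsearch bulk response. It relies on the standard bulk response shape (an items list in which each entry carries a per-document status). Mapping "service unavailable" and "request timed out" to HTTP statuses 503 and 408, the helper names, and the fields of the rebuilt message are assumptions, not details confirmed by this article.

```python
RETRYABLE_STATUSES = {408, 503}  # assumed: request timed out, service unavailable


def retryable_record_ids(bulk_response: dict) -> list[str]:
    """Return IDs of documents whose bulk action failed with a retryable status."""
    failed = []
    for item in bulk_response.get("items", []):
        # Each item has a single key naming the action, such as "index" or "update".
        for result in item.values():
            if result.get("status") in RETRYABLE_STATUSES:
                failed.append(result["_id"])
    return failed


def build_retry_messages(record_ids: list[str], kind: str) -> list[dict]:
    """Build hypothetical recordChangedMessages entries for the failed records."""
    return [{"id": rid, "kind": kind, "op": "update"} for rid in record_ids]


# Hypothetical bulk response: one success, one retryable failure, one permanent failure.
sample_response = {
    "errors": True,
    "items": [
        {"index": {"_id": "rec-1", "status": 201}},
        {"index": {"_id": "rec-2", "status": 503}},
        {"index": {"_id": "rec-3", "status": 400}},
    ],
}

to_retry = retryable_record_ids(sample_response)
print(build_retry_messages(to_retry, "osdu:wks:master-data--Well:1.0.0"))
```

In this model, only the record that failed with a retryable status is requeued. The record rejected with status 400 isn't, because resending it unchanged would fail the same way.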

For more information, see the Indexer service OSDU® documentation.

Search workflow

The Search service provides a mechanism for discovering indexed metadata documents. The Search API supports full-text search on string fields, range queries on date, numeric, or string fields, and geospatial searches.
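The query types mentioned above might look like the following request bodies. This sketch assumes the OSDU Search API accepts a JSON body with kind, query, and limit fields and a Lucene-style query string. The kind value and the field names data.FacilityName and data.TotalDepth are hypothetical examples. The code only builds the payloads and doesn't call any service.

```python
import json


def build_search_request(kind: str, query: str, limit: int = 10) -> dict:
    """Assemble a search request body (assumed field names: kind, query, limit)."""
    if limit < 1:
        raise ValueError("limit must be positive")
    return {"kind": kind, "query": query, "limit": limit}


WELL_KIND = "osdu:wks:master-data--Well:*"  # hypothetical kind with a wildcard version

# Full-text search on a string field.
full_text = build_search_request(WELL_KIND, 'data.FacilityName:"Example Well"')

# Range query on a numeric field.
depth_range = build_search_request(WELL_KIND, "data.TotalDepth:[1000 TO 2000]", limit=50)

print(json.dumps(full_text, indent=2))
print(json.dumps(depth_range, indent=2))
```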

When metadata records are loaded onto the platform using the Storage service, you can configure permissions for viewers and owners of the metadata records under the acl field. The viewers and owners are assigned through groups defined in the Entitlement service. A search returns a matching metadata record only to users who belong to one of the groups listed in that record's acl field.
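The access rule can be modeled with a simple set intersection. The sketch below is an illustration only, not how the Search service enforces permissions: the helper can_view, the group names, and the sample records are hypothetical. It assumes the acl field holds viewers and owners lists of group identifiers.

```python
def can_view(record: dict, user_groups: set[str]) -> bool:
    """Return True if the user belongs to any group in the record's acl."""
    acl = record.get("acl", {})
    allowed = set(acl.get("viewers", [])) | set(acl.get("owners", []))
    return bool(allowed & user_groups)


# Hypothetical records and group names.
records = [
    {
        "id": "rec-1",
        "acl": {
            "viewers": ["data.wells.viewers@example.com"],
            "owners": ["data.wells.owners@example.com"],
        },
    },
    {
        "id": "rec-2",
        "acl": {
            "viewers": ["data.seismic.viewers@example.com"],
            "owners": ["data.seismic.owners@example.com"],
        },
    },
]

user_groups = {"data.wells.viewers@example.com"}
visible = [r["id"] for r in records if can_view(r, user_groups)]
print(visible)
```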

For a detailed tutorial on the Search service, refer to the Search service OSDU® documentation.

Reindex workflow

The Reindex API allows users to reindex a kind without reingesting the records through the Storage API. For detailed information, refer to the Reindex OSDU® documentation.
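A reindex request could be assembled along these lines. The endpoint path, the kind and cursor body fields, and the helper name are assumptions based on the community OSDU Indexer API and should be checked against the Reindex documentation before use. The code builds the request description only and sends nothing.

```python
from typing import Optional


def build_reindex_request(kind: str, cursor: Optional[str] = None) -> dict:
    """Describe a reindex call for one kind (assumed path and body fields)."""
    body = {"kind": kind}
    if cursor is not None:
        body["cursor"] = cursor  # assumed: used to continue a previous reindex run
    return {
        "method": "POST",
        "path": "/api/indexer/v2/reindex",  # assumed endpoint path
        "body": body,
    }


request = build_reindex_request("osdu:wks:master-data--Well:1.0.0")
print(request)
```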

OSDU® is a trademark of The Open Group.

Next steps

Domain data management service concepts
