How to register an External Data Source (EDS) with Azure Data Manager for Energy

This article explains how to register an External Data Source (EDS) with Azure Data Manager for Energy. EDS lets you fetch and ingest data (metadata) from external data sources. It also lets you retrieve bulk data on demand.

Prerequisites

  • Download the API collection and environment files and import them into an API test client (such as Postman). Modify the environment values as needed for your data source.
  • Refer to Section 2.2 of osdu-eds-data-supplier-enablement-guide for details on data source registration.
  • Review the Connected Source Registry Entry (CSRE) and Connected Source Data Job (CSDJ) sections in EDS_Documentation-1.0.docx to understand the parameters used in data source registration.
  • To run EDS, the user must be a member of the service.eds.user entitlements group. Additionally, to access the Secret service, the user must be a member of the following entitlements groups: service.secret.viewer, service.secret.editor, service.secret.admin.

EDS Fetch and Ingest workflow

Execute the APIs in the following collections to register your external data source so that the EDS Fetch and Ingest workflow runs on a schedule:

  1. 001: Pre-req: Validate Schema Registration
  2. 002: Pre-req: Validate Reference Data
  3. 003: Secret Service
  4. 004: Pre-req: Add Source Registry

After the data source is registered successfully, data is fetched from the external source on a schedule and ingested into your Azure Data Manager for Energy instance.

You can use the Search service to search for your ingested data.

Troubleshooting

Run the following Kusto queries in your Log Analytics workspace to identify issues with data source registration.

OEPAirFlowTask
| where DagName == "eds_ingest"
| where LogLevel == "ERROR" // ERROR/DEBUG/INFO/WARNING

OEPAirFlowTask
| where DagName == "eds_scheduler"
| where LogLevel == "ERROR" // ERROR/DEBUG/INFO/WARNING
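
If it helps to see both DAGs at once, the two queries above can be combined into a single query. The following is only a sketch: it assumes the OEPAirFlowTask table exposes the standard Log Analytics TimeGenerated column, and the 24-hour window and 1-hour bins are arbitrary choices to adjust for your environment.

```kusto
// Sketch: count ERROR entries from both EDS DAGs over the last 24 hours.
// Assumption: TimeGenerated is available on OEPAirFlowTask; the time window
// and bin size are illustrative, not values taken from the EDS documentation.
OEPAirFlowTask
| where TimeGenerated > ago(24h)
| where DagName in ("eds_ingest", "eds_scheduler")
| where LogLevel == "ERROR"
| summarize ErrorCount = count() by DagName, bin(TimeGenerated, 1h)
| order by TimeGenerated desc
```

Grouping by DagName shows whether failures occur while the job is being scheduled (eds_scheduler) or during ingestion itself (eds_ingest).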

Retrieve bulk data on demand

Use the getRetrievalInstructions API in the 005: Dataset Service collection to retrieve bulk data from external data sources on demand.

References