Explore the ESG data estate capability

The ESG data estate capability, which is part of the sustainability data solutions in Microsoft Fabric, includes a standard schema and the ability to transform data from different sources into that schema. The standardized schema provides data models for environmental, social, and governance records. The system uses the standardized data to compute quantitative metrics that meet disclosure reporting requirements. The solution includes customizable notebooks, so you can adapt how the data is transformed and how the metrics are computed.

The ESG data estate consists of the following lakehouses.

Lakehouse | Description
ConfigAndDemoData | Stores certain transformation libraries, reference, and demo data.
Ingested raw data | Stores raw data from external data sources.
Processed ESG data | Stores harmonized data that conforms to a standardized ESG data model.
Computed ESG metrics | Stores computed ESG metrics and aggregated analytical datasets.

ConfigAndDemoData

The ConfigAndDemoData lakehouse holds demo data as well as configuration information for the other lakehouses and for analysis definitions. It also includes libraries and reference data for data processing and transformation workflows. This lakehouse is useful when an organization is setting up or testing sustainability solutions, because the prebuilt data and resources simplify the setup process.

Ingested raw data

The Ingested raw data lakehouse stores the source data as it is, with no transformations performed. Keeping all raw data in a single place makes the data easier to manage and access. Because the lakehouse preserves the ingested data in its original form, it's also useful for auditing and compliance. With this flexible foundation, an organization can extract and transform data for analysis.

Processed ESG data

The Processed ESG data lakehouse stores data that has been transformed and harmonized to conform to standardized ESG data models. Regardless of the source, the data in this lakehouse follows a consistent schema and is ready for further analysis. The quality of the data in this lakehouse is crucial for accurate reporting and analysis. The system uses the data to compute analytical datasets and metrics that organizations use for sustainability reporting and compliance. This processed data is also ready for other downstream uses, such as auditing and advanced analytics applications.
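
To make the idea of harmonization concrete, the following is a minimal sketch in plain Python, not the solution's actual notebooks. It assumes two hypothetical source formats (a utility export that reports liters and an ERP extract that reports cubic meters) and a hypothetical target record, WaterUsageRecord. The real ESG data model defines its own tables and columns.

```python
from dataclasses import dataclass


# Hypothetical target schema, for illustration only.
@dataclass(frozen=True)
class WaterUsageRecord:
    facility_id: str
    period: str        # month in YYYY-MM form
    volume_m3: float   # cubic meters


LITERS_PER_M3 = 1000.0


def from_utility_export(row: dict) -> WaterUsageRecord:
    """Map a hypothetical utility export that reports liters."""
    return WaterUsageRecord(
        facility_id=row["site"].strip().upper(),
        period=row["month"],
        volume_m3=float(row["liters"]) / LITERS_PER_M3,
    )


def from_erp_extract(row: dict) -> WaterUsageRecord:
    """Map a hypothetical ERP extract that already reports cubic meters."""
    return WaterUsageRecord(
        facility_id=row["FacilityCode"].strip().upper(),
        period=row["Period"],
        volume_m3=float(row["VolumeM3"]),
    )


utility_rows = [{"site": " plant-a ", "month": "2024-01", "liters": "12500"}]
erp_rows = [{"FacilityCode": "PLANT-B", "Period": "2024-01", "VolumeM3": "8.25"}]

# Both sources end up in the same shape, with the same units.
harmonized = [from_utility_export(r) for r in utility_rows] + [
    from_erp_extract(r) for r in erp_rows
]
for record in harmonized:
    print(record)
```

Using one small mapping function per source keeps source-specific quirks (field names, units, formatting) out of the downstream metric logic, which only ever sees the common record shape.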

Computed ESG metrics

The Computed ESG metrics lakehouse has several important purposes. As centralized metrics storage, it holds the analytical datasets and metrics that the system computes from the data in the Processed ESG data lakehouse. The system analyzes the data in this lakehouse for insights into an organization's environmental, social, and governance impact. These insights support an organization's strategic decision making and sustainability initiatives.

Data transformation after ingestion

As data moves between the lakehouses, it goes through the following common tasks:

  • Data extraction

  • Data transformation

  • Data computation

Several actions and tools help in this transformation process.

Process | Actions | Tools
Data extraction | Extract raw data from the Ingested raw data lakehouse. | Microsoft Dataverse, Microsoft Fabric
Data transformation | Clean, harmonize, and enrich data. | Microsoft Dataverse, Microsoft Fabric
Data computation | Aggregate, calculate, and standardize ESG metrics. | Microsoft Sustainability Manager
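
The three processes can be pictured as a small pipeline. The following sketch uses plain Python with in-memory rows rather than lakehouse tables, and the field names (category, quantity_kg) and function names are assumptions made for illustration. It isn't how the solution's notebooks are implemented.

```python
from collections import defaultdict

# Hypothetical raw rows as they might arrive from a source system (all strings).
RAW_ROWS = [
    {"category": " Waste ", "quantity_kg": "120.5"},
    {"category": "waste", "quantity_kg": "79.5"},
    {"category": "Recycled", "quantity_kg": "40"},
    {"category": "waste", "quantity_kg": "n/a"},  # value that can't be parsed
]


def extract(rows):
    """Extraction: read raw rows as copies so the originals stay untouched."""
    return [dict(row) for row in rows]


def transform(rows):
    """Transformation: normalize category names, parse numbers, skip bad rows."""
    cleaned = []
    for row in rows:
        try:
            quantity = float(row["quantity_kg"])
        except ValueError:
            continue  # a real pipeline would log or quarantine this row
        cleaned.append(
            {"category": row["category"].strip().lower(), "quantity_kg": quantity}
        )
    return cleaned


def compute(rows):
    """Computation: aggregate quantities per category into a simple metric."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["category"]] += row["quantity_kg"]
    return dict(totals)


metrics = compute(transform(extract(RAW_ROWS)))
print(metrics)
```

Keeping each stage as its own function mirrors the separation between the lakehouses: the raw rows are never modified, the cleaned rows share one shape, and the metric step only aggregates.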

Explore the ESG data model

You can build a comprehensive environmental, social, and governance (ESG) data estate by using the extensible analytical data model. The ESG data estate provides a data model to effectively manage and process your data for reporting and advanced analytics. The model includes more than 400 tables in the following areas:

  • Carbon

  • Water

  • Waste

  • Biodiversity

  • Social

  • Governance

Screenshot of the ESG data model and the areas that it covers.

This data model helps you track your organization's key goals and metrics that pertain to different sustainability regulatory reporting directives and standards. You can use the tables based on your sustainability use cases and transform your source ESG data from disparate source systems into the tables. As a result, you can establish a standardized ESG data estate.

ESG data model resources

The ESG data model resources are:

  • Data dictionary Excel file - Includes data definitions and examples for all tables and columns that help you select the appropriate table for your sustainability use cases.

  • Data model PDF files - Contain a subset of tables from each business area. With these files, you can determine how to combine tables to meet regulatory reporting requirements for carbon emissions, water, waste, social, and governance areas.

  • Synapse database template - Designed for the data model on Microsoft Azure Synapse, this template helps make it easier for you to explore and visualize tables and relationships in the model.

ESG data model entities help you manage and process data for advanced analytics and reporting. For more information on the data model and to locate the resources, see ESG data model and ESG data model entities.

Demo data

Each of the following capabilities has demo data:

  • ESG data estate

  • Social and governance metrics and reports

  • Environmental metrics and analytics

  • Microsoft Azure emissions insights

In this module, you explore ESG data estate demo data and social and governance metrics and reports demo data.

ESG data estate demo data

The system deploys demo data across sustainability areas (such as water, waste, emissions, and social) to your workspace in the Demo data folder in the ConfigAndDemoData lakehouse. The demo data can help you quickly explore the functionality. The demo data is in the standard ESG data model schema so that you can compute analytical datasets and metrics and explore the Power BI report with these steps:

  1. Run the LoadDemoDataInProcessedESGDataTables notebook to copy the demo data from the Demo data folder in the ConfigAndDemoData lakehouse to tables in the ProcessedESGData lakehouse.

  2. After the notebook runs successfully, you can view the loaded tables in the ProcessedESGData lakehouse under Tables.

Currently, the ESG data estate supports only full refresh snapshots. Before you load the demo data, make sure that the ProcessedESGData lakehouse tables that this notebook loads don't contain your own data. Any existing data in those tables is overwritten.
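
To illustrate what a full refresh means in practice, here's a minimal sketch that models lakehouse tables as a Python dictionary. The table names, row contents, and both helper functions are hypothetical. The sketch only demonstrates the overwrite behavior and a simple check that you could perform before loading.

```python
def tables_with_existing_data(lakehouse: dict, tables_to_load: list) -> list:
    """Return the target tables that already contain rows."""
    return [name for name in tables_to_load if lakehouse.get(name)]


def full_refresh_load(lakehouse: dict, snapshot: dict) -> None:
    """Full refresh: each loaded table is replaced, not appended to."""
    for name, rows in snapshot.items():
        lakehouse[name] = list(rows)


# A dictionary stands in for the processed lakehouse; names are hypothetical.
processed = {"Facility": [{"id": "MY-SITE-1"}], "WaterUsage": []}
demo_snapshot = {
    "Facility": [{"id": "DEMO-1"}, {"id": "DEMO-2"}],
    "WaterUsage": [{"id": "W-1"}],
}

# Check first: which target tables would lose existing rows?
at_risk = tables_with_existing_data(processed, list(demo_snapshot))
print("Tables whose data would be overwritten:", at_risk)

# After the load, the earlier Facility row is gone.
full_refresh_load(processed, demo_snapshot)
print(processed["Facility"])
```

In this sketch, the row that was in the Facility table before the load doesn't survive, which is why it's worth checking the target tables first.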