Explore environment data and insights
The first step in ingesting data into Sustainability data solutions in Microsoft Fabric is to get the data from Microsoft Sustainability Manager into the IngestedRawData lakehouse. Sustainability Manager uses Microsoft Dataverse to store its data. You can use any of the following options to make the Dataverse data available for ingestion in the IngestedRawData lakehouse.
Fabric shortcut to Dataverse – Use the Fabric shortcut capability to choose Dataverse as the source and create a Fabric shortcut in the IngestedRawData lakehouse. This approach accesses Dataverse by using System Administrator access. You can set up this approach in the Fabric portal. When you use this approach, the system reads the data directly and doesn't stage it for loading into lakehouse tables. For more information, see Ingest data with a Dataverse shortcut.
Microsoft Azure Synapse Link – When you use the Dataverse capability to link to Azure Synapse, the system continuously writes the data to Azure Data Lake Storage Gen2. To set up the connection, you need Dataverse System Administrator access and access to the storage account. You create the connection in the data management section of the Microsoft Power Apps maker portal. As with the first approach, you need a Fabric shortcut, in this case to Azure Data Lake Storage, so that the ingestion notebook can read the data from Data Lake Storage when it runs. This approach uses access keys for storage and managed identities for Dataverse access, and it isn't tied to an individual user account. You can complete setup for this approach in the Fabric portal. For more information, see Ingest data with Azure Synapse Link.
Fabric Link – Use the Dataverse capability to link to a Fabric workspace and to have the system continuously write Dataverse data to a new Fabric lakehouse. To set up the connection, you need Dataverse System Administrator access and access to the Fabric workspace. This approach is similar to the Azure Synapse Link approach; however, you don't need an extra Azure storage account, and all data stays in the Fabric workspace. You can set up this approach in the Microsoft Power Platform portal. For more information, see Ingest data with Fabric Link.
Transform the data
Regardless of which preceding approach you choose, the next step is to run the TransformMSMDatatoProcessedESGData_DTPL data pipeline, which transforms the ingested data and loads it into the ProcessedESGData lakehouse.
To accomplish this task, the data pipeline implements the following three pipeline activities:
Run the LoadMSMDataToLakehouseTables_INTB notebook. Use this notebook to load the integrated Microsoft Sustainability Manager data (present in CSV format) to lakehouse tables. The system deactivates this step when you use the Fabric shortcut to Dataverse approach, but the step is required for the other two approaches.
Run the TransformMSMDataToProcessedESGData_INTB notebook. This step is the first of two transformation stages. It transforms the Microsoft Sustainability Manager carbon emissions, water, and waste measurement data into a set of intermediary tables, called raw data import tables. Using these intermediary tables lowers the complexity of the transformation process.
Run the TransformRawImportESGDataToProcessedESGData_INTB notebook. This step is the second stage, and it transforms the data from raw data import tables to the ESG data model schema.
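The three activities and the rule that deactivates the first one can be sketched in a few lines of Python. This is an illustrative model only, not the pipeline's actual definition. The notebook names come from the steps above, but the IngestionApproach enum, the Activity class, and both functions are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass
from enum import Enum


class IngestionApproach(Enum):
    """The three ways to make Dataverse data available (hypothetical enum)."""
    DATAVERSE_SHORTCUT = "Fabric shortcut to Dataverse"
    AZURE_SYNAPSE_LINK = "Azure Synapse Link"
    FABRIC_LINK = "Fabric Link"


@dataclass(frozen=True)
class Activity:
    """One pipeline activity in this sketch (hypothetical class)."""
    notebook: str
    purpose: str
    active: bool = True


def plan_activities(approach: IngestionApproach) -> list[Activity]:
    """Return the three activities in order, marking the load step inactive
    when the Fabric shortcut to Dataverse approach is used."""
    # The shortcut approach reads Dataverse data directly, so there are no
    # staged CSV files to load into lakehouse tables.
    needs_csv_load = approach is not IngestionApproach.DATAVERSE_SHORTCUT
    return [
        Activity(
            "LoadMSMDataToLakehouseTables_INTB",
            "Load CSV data to lakehouse tables",
            active=needs_csv_load,
        ),
        Activity(
            "TransformMSMDataToProcessedESGData_INTB",
            "Stage 1: transform to raw data import tables",
        ),
        Activity(
            "TransformRawImportESGDataToProcessedESGData_INTB",
            "Stage 2: transform to the ESG data model schema",
        ),
    ]


def notebooks_to_run(approach: IngestionApproach) -> list[str]:
    """Names of the notebooks that actually run for the given approach."""
    return [a.notebook for a in plan_activities(approach) if a.active]


if __name__ == "__main__":
    for approach in IngestionApproach:
        print(approach.value, "->", notebooks_to_run(approach))
```

In this model, calling notebooks_to_run with IngestionApproach.DATAVERSE_SHORTCUT returns only the two transformation notebooks, while the other two approaches return all three in order.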
After the transformation completes, you can unify and harmonize your source environmental data with the reference data and tables that are present in the ESG data model. Now, you can use the transformed data to compute metrics, generate reports, and perform analytics.