Deploy ESG data estate and ingest ESG data
In this exercise, you get started with data ingestion. You deploy the Environmental, Social, and Governance (ESG) data estate, load demo data, and then query the data in the lakehouse.
Deploy ESG data estate
To fully deploy the ESG data estate, you need to follow these general steps:
Create a Microsoft Fabric workspace.
Deploy sustainability data solutions in Fabric.
Deploy ESG data estate.
Create workspace
To create a workspace, follow these steps:
Go to Microsoft Power BI.
Select Workspaces.
Select + New workspace.
Enter FL C. Sustainability for Name, replacing FL with your initials. Select Apply.
Stay on this page.
Deploy sustainability solutions
To deploy sustainability solutions, follow these steps:
Make sure that you select the workspace that you created and then select + New item.
Search for sustainability and select Sustainability solutions.
Enter FL_Sustainability_solutions for Name, replacing FL with your initials. Select Create.
After the deployment completes, go to the sustainability solution and then stay on this page.
Deploy ESG data estate
To deploy the ESG data estate, follow these steps:
Make sure that you have the sustainability solutions selected. Locate and select ESG data estate.
Review the items that deploy with the solution.
Select Deploy to workspace.
Read the terms and conditions and then select Accept & deploy.
Wait for the solution deployment to complete.
Load demo data
In this section, you load the demo data for emissions, water, waste, social, and governance sustainability areas in the ESG data model schema. This notebook loads the demo data as tables in the ProcessedESGData lakehouse for populating the dimensional fact tables and metric tables and for exploring the prebuilt CSRDMetricsReport.
Go to Power BI and select the FL C. Sustainability workspace that you created.
Locate and open the SDS_ESGDE_FL_Sustainability_solutions_LoadDemoDataInProcessedESGDataTables_INTB notebook.
Read the notebook.
Select Run all and then wait for the run to complete.
Open the processed ESG data lakehouse
In this task, you review the created demo data tables.
Select the FL C. Sustainability workspace that you created.
Locate and open the SDS_ESGDE_FL_Sustainability_solutions_ProcessedESGData_LH lakehouse.
Expand Tables and review the created tables. If the tables don't display, select the ellipsis (...) button of Tables and then select Refresh.
From the upper right, switch from Lakehouse to SQL analytics endpoint.
Select New SQL query.
Paste the following query.
This query returns all tables and the count of their rows.
DECLARE @SQL NVARCHAR(MAX) = N''; SELECT @SQL += 'SELECT ''' + t.name + ''' AS TableName, COUNT(*) AS TotalRows FROM ' + QUOTENAME(t.name) + ' HAVING COUNT(*) > 0 UNION ALL ' FROM sys.tables t; -- REMOVE the LAST 'UNION ALL' AND ADD an ORDER BY clause SET @SQL = LEFT(@SQL, LEN(@SQL) - 10) + ' ORDER BY TotalRows DESC;'; -- EXECUTE the DYNAMIC SQL EXEC sp_executesql @SQL;
Select the Run button. The query returns the count of rows for each table.
Select New SQL query again.
Paste the following query, replacing all FL instances with your initials.
This query calculates the total greenhouse gas emissions for each facility, broken down by scope category and greenhouse gas type. It aggregates data from the PartyScopeCategoryGreenhouseGas, GreenhouseGasEmissionFactor, and PartyGreenhouseGas tables.
WITH EmissionFactors AS ( SELECT ghf.GreenhouseGasEmissionFactorId, ghf.GreenhouseGasId, ghf.Co2EEmissions AS EmissionFactorCo2, ghf.PerGreenhouseGasEmissionsUnitOfMeasureId FROM [SDS_ESGDE_FL_Sustainability_solutions_ProcessedESGData_LH].[dbo].[GreenhouseGasEmissionFactor] AS ghf ), ScopeCategoryEmissions AS ( SELECT ps.PartyId, ps.GreenhouseGasId, ps.GreenhouseGasEmissionsScopeId, ps.Co2EEmissions AS ScopeCategoryCo2, ps.GreenhouseGasEmissionFactorId, ps.GreenhouseGasEmissionsUnitOfMeasureId FROM [SDS_ESGDE_FL_Sustainability_solutions_ProcessedESGData_LH].[dbo].[PartyScopeCategoryGreenhouseGas] AS ps ), GreenhouseGasEmissions AS ( SELECT pgh.PartyId, pgh.GreenhouseGasId, pgh.GreenhouseGasEmissionFactorId, pgh.Co2EEmissions AS TotalCo2Emissions, pgh.GreenhouseGasEmissionsUnitOfMeasureId FROM [SDS_ESGDE_FL_Sustainability_solutions_ProcessedESGData_LH].[dbo].[PartyGreenhouseGas] AS pgh ) SELECT COALESCE(sce.PartyId, gge.PartyId) AS FacilityId, COALESCE(sce.GreenhouseGasId, gge.GreenhouseGasId, ef.GreenhouseGasId) AS GreenhouseGasId, sce.GreenhouseGasEmissionsScopeId AS EmissionScopeCategory, SUM(COALESCE(gge.TotalCo2Emissions, 0) * COALESCE(ef.EmissionFactorCo2, 1)) AS TotalEmissionsCo2Equivalent, sce.GreenhouseGasEmissionsUnitOfMeasureId FROM ScopeCategoryEmissions AS sce LEFT JOIN EmissionFactors AS ef ON sce.GreenhouseGasEmissionFactorId = ef.GreenhouseGasEmissionFactorId LEFT JOIN GreenhouseGasEmissions AS gge ON sce.PartyId = gge.PartyId AND sce.GreenhouseGasId = gge.GreenhouseGasId GROUP BY COALESCE(sce.PartyId, gge.PartyId), COALESCE(sce.GreenhouseGasId, gge.GreenhouseGasId, ef.GreenhouseGasId), sce.GreenhouseGasEmissionsScopeId, sce.GreenhouseGasEmissionsUnitOfMeasureId ORDER BY TotalEmissionsCo2Equivalent DESC;
Select Run.
The query results display as shown in the following image.