Deploy ESG data estate and ingest ESG data

Completed

In this exercise, you get started with data ingestion. You deploy the Environmental, Social, and Governance (ESG) data estate, load demo data, and then query the data in the lakehouse.

Deploy ESG data estate

To fully deploy the ESG data estate, you need to follow these general steps:

  1. Create a Microsoft Fabric workspace.

  2. Deploy sustainability data solutions in Fabric.

  3. Deploy ESG data estate.

Create workspace

To create a workspace, follow these steps:

  1. Go to Microsoft Power BI.

  2. Select Workspaces.

  3. Select + New workspace.

  4. Enter FL C. Sustainability for Name, replacing FL with your initials. Select Apply.

    Screenshot of the new workspace with the Name field filled in.

  5. Stay on this page.

Deploy sustainability solutions

To deploy sustainability solutions, follow these steps:

  1. Make sure that you select the workspace that you created and then select + New item.

  2. Search for sustainability and select Sustainability solutions.

    Screenshot with the Sustainability solutions option selected.

  3. Enter FL_Sustainability_solutions for Name, replacing FL with your initials. Select Create.

    Screenshot of the sustainability solution with the name specified.

  4. After the deployment completes, go to the sustainability solution and then stay on this page.

Deploy ESG data estate

To deploy the ESG data estate, follow these steps:

  1. Make sure that you have the sustainability solutions selected. Locate and select ESG data estate.

    Screenshot of the E S G data estate solution selected in the Solution capabilities section.

  2. Review the items that deploy with the solution.

    Screenshot of the capability and the solutions that the deployment includes.

  3. Select Deploy to workspace.

  4. Read the terms and conditions and then select Accept & deploy.

  5. Wait for the solution deployment to complete.

    Screenshot of the capability and the items that deploy with it.

Load demo data

In this section, you load the demo data for emissions, water, waste, social, and governance sustainability areas in the ESG data model schema. This notebook loads the demo data as tables in the ProcessedESGData lakehouse for populating the dimensional fact tables and metric tables and for exploring the prebuilt CSRDMetricsReport.

  1. Go to Power BI and select the FL C. Sustainability workspace that you created.

  2. Locate and open the SDS_ESGDE_FL_Sustainability_solutions_LoadDemoDataInProcessedESGDataTables_INTB notebook.

    Screenshot of the notebook highlighted.

  3. Read the notebook.

  4. Select Run all and then wait for the run to complete.

Open the processed ESG data lakehouse

In this task, you review the created demo data tables.

  1. Select the FL C. Sustainability workspace that you created.

  2. Locate and open the SDS_ESGDE_FL_Sustainability_solutions_ProcessedESGData_LH lakehouse.

    Screenshot of the lakehouse selected.

  3. Expand Tables and review the created tables. If the tables don't display, select the ellipsis (...) button of Tables and then select Refresh.

    Screenshot of the ellipsis button selected next to Tables and the Refresh option highlighted.

  4. From the upper right, switch from Lakehouse to SQL analytics endpoint.

    Screenshot of the sequel analytics endpoint selected.

  5. Select New SQL query.

    Screenshot of the new sequel query option selected.

  6. Paste the following query.

    This query returns all tables and the count of their rows.

    
      DECLARE @SQL NVARCHAR(MAX) = N'';
    
      SELECT @SQL += 'SELECT ''' + t.name + ''' AS TableName, COUNT(*) AS TotalRows FROM ' + QUOTENAME(t.name) + ' HAVING COUNT(*) > 0 UNION ALL '
      FROM sys.tables t;
    
      -- REMOVE the LAST 'UNION ALL' AND ADD an ORDER BY clause
      SET @SQL = LEFT(@SQL, LEN(@SQL) - 10) + ' ORDER BY TotalRows DESC;';
    
      -- EXECUTE the DYNAMIC SQL
      EXEC sp_executesql @SQL;
    
    
  7. Select the Run button. The query returns the count of rows for each table.

    Screenshot of the query results.

  8. Select New SQL query again.

  9. Paste the following query, replacing all FL instances with your initials.

    This query calculates the total greenhouse gas emissions for each facility, broken down by scope category and greenhouse gas type. It aggregates data from the PartyScopeCategoryGreenhouseGas, GreenhouseGasEmissionFactor, and PartyGreenhouseGas tables.

    
    WITH  EmissionFactors AS (
      SELECT 
    ​    ghf.GreenhouseGasEmissionFactorId,
    ​    ghf.GreenhouseGasId,
    ​    ghf.Co2EEmissions AS EmissionFactorCo2,
    ​    ghf.PerGreenhouseGasEmissionsUnitOfMeasureId
      FROM 
    ​    [SDS_ESGDE_FL_Sustainability_solutions_ProcessedESGData_LH].[dbo].[GreenhouseGasEmissionFactor] AS ghf
    ),
    
    ScopeCategoryEmissions AS (
      SELECT 
    ​    ps.PartyId,
    ​    ps.GreenhouseGasId,
    ​    ps.GreenhouseGasEmissionsScopeId,
    ​    ps.Co2EEmissions AS ScopeCategoryCo2,
    ​    ps.GreenhouseGasEmissionFactorId,
    ​    ps.GreenhouseGasEmissionsUnitOfMeasureId
      FROM 
    
    ​    [SDS_ESGDE_FL_Sustainability_solutions_ProcessedESGData_LH].[dbo].[PartyScopeCategoryGreenhouseGas] AS ps
    ),
    
    GreenhouseGasEmissions AS (
      SELECT 
    ​    pgh.PartyId,
    ​    pgh.GreenhouseGasId,
    ​    pgh.GreenhouseGasEmissionFactorId,
    ​    pgh.Co2EEmissions AS TotalCo2Emissions,
    ​    pgh.GreenhouseGasEmissionsUnitOfMeasureId
      FROM 
    ​    [SDS_ESGDE_FL_Sustainability_solutions_ProcessedESGData_LH].[dbo].[PartyGreenhouseGas] AS pgh
    )
    
    SELECT 
      COALESCE(sce.PartyId, gge.PartyId) AS FacilityId,
      COALESCE(sce.GreenhouseGasId, gge.GreenhouseGasId, ef.GreenhouseGasId) AS GreenhouseGasId,
      sce.GreenhouseGasEmissionsScopeId AS EmissionScopeCategory,
      SUM(COALESCE(gge.TotalCo2Emissions, 0) * COALESCE(ef.EmissionFactorCo2, 1)) AS TotalEmissionsCo2Equivalent,
      sce.GreenhouseGasEmissionsUnitOfMeasureId
    FROM 
      ScopeCategoryEmissions AS sce
    LEFT JOIN 
      EmissionFactors AS ef ON sce.GreenhouseGasEmissionFactorId = ef.GreenhouseGasEmissionFactorId
    LEFT JOIN 
      GreenhouseGasEmissions AS gge ON sce.PartyId = gge.PartyId AND sce.GreenhouseGasId = gge.GreenhouseGasId
    GROUP BY 
      COALESCE(sce.PartyId, gge.PartyId), 
      COALESCE(sce.GreenhouseGasId, gge.GreenhouseGasId, ef.GreenhouseGasId), 
      sce.GreenhouseGasEmissionsScopeId, 
      sce.GreenhouseGasEmissionsUnitOfMeasureId
    ORDER BY 
      TotalEmissionsCo2Equivalent DESC;
    
    

    Screenshot of the query that you paste to calculate the total greenhouse gas emissions for each facility.

  10. Select Run.

  11. The query results display as shown in the following image.

    Screenshot of the Results page of the query.