Example: Add a Dynamics 365 Sales optionset to a report

Important

Some or all of this functionality is available as part of a preview release. The content and the functionality are subject to change.

This example shows how to add a Dynamics 365 Sales optionset as a reusable dimension in Nonprofit data solutions and then use it in downstream reporting.

Two scenarios are covered:

  1. A local optionset such as opportunityratingcode
  2. A global optionset such as need

Data flow:

Local OptionSet:  Dynamics 365 Sales -> Bronze OptionsetMetadata       -> Silver Dimension -> Gold Dimension -> Report
Global OptionSet: Dynamics 365 Sales -> Bronze GlobalOptionsetMetadata -> Silver Dimension -> Gold Dimension -> Report

Understand optionset types

Dynamics 365 Sales supports two kinds of optionsets.

Type Description is_global Bronze table
Local optionset Defined on one entity False OptionsetMetadata
Global optionset Shared across entities True GlobalOptionsetMetadata

The framework reads from the correct bronze metadata table based on the is_global flag.

Example mappings used in this article

Example Type Entity Optionset name Silver table Gold table
Rating Local opportunity opportunityratingcode Rating DimRating
Need Global opportunity need Need DimNeed

Add the silver schema

Open the Fundraising_SL_CreateSchema notebook and add schema methods for the new optionset dimensions.

Add the Rating schema

@staticmethod
def get_rating_schema() -> StructType:
    return StructType([
        StructField("RatingId", StringType(), nullable=False, metadata={ "primaryKey": True, "pkType": "guid" }),
        StructField("CreatedDate", TimestampType()),
        StructField("ModifiedDate", TimestampType()),
        StructField("Name", StringType()),
        StructField("SourceId", StringType(), nullable=False),
        StructField("SourceSystemId", StringType())
    ])

Add the Need schema

@staticmethod
def get_need_schema() -> StructType:
    return StructType([
        StructField("NeedId", StringType(), nullable=False, metadata={ "primaryKey": True, "pkType": "guid" }),
        StructField("CreatedDate", TimestampType()),
        StructField("ModifiedDate", TimestampType()),
        StructField("Name", StringType()),
        StructField("SourceId", StringType(), nullable=False),
        StructField("SourceSystemId", StringType())
    ])

Register both tables in get_entities().

def get_entities(self) -> list[tuple[StructType, str, bool]]:
    return [
        # ... existing tables ...
        (NonprofitSilverModel.get_rating_schema(), "Rating", True),
        (NonprofitSilverModel.get_need_schema(), "Need", True),
    ]

Note

Set the third parameter to True so the table supports Change Data Feed (CDF) processing in the gold layer.

Add the gold schema

Open the Fundraising_GD_CreateSchema notebook and add dimension schemas for the gold layer.

Gold dimension tables for optionsets are simple: surrogate key, business key, and display name.

Add the DimRating schema

@staticmethod
def get_dimrating_schema() -> StructType:
    return StructType([
        StructField("RatingName", StringType(), nullable=False, metadata={ "primaryKey": True }),
        StructField("RatingKey", LongType(), nullable=False),
        StructField("RatingId", StringType(), nullable=False)
    ])

Add the DimNeed schema

@staticmethod
def get_dimneed_schema() -> StructType:
    return StructType([
        StructField("NeedName", StringType(), nullable=False, metadata={ "primaryKey": True }),
        StructField("NeedKey", LongType(), nullable=False),
        StructField("NeedId", StringType(), nullable=False)
    ])

Register both gold tables in get_entities().

def get_entities(self) -> list[tuple[StructType, str, bool]]:
    return [
        # ... existing tables ...
        (NonprofitGoldModel.get_dimrating_schema(), "DimRating", True),
        (NonprofitGoldModel.get_dimneed_schema(), "DimNeed", True),
    ]

This pattern follows the existing optionset dimensions in the solution, such as DimCampaignType: a compact gold structure with a surrogate key, business key, and display name.

Sync the optionset values into silver

Open the Fundraising_D365_Transform notebook.

Option A: Add a local optionset to optionsets_to_sync

Use this option for local optionsets only.

The optionsets_to_sync list uses the default is_global=False, so this shorthand works only for local optionsets.

optionsets_to_sync = [
    # ... existing optionsets ...
    ("opportunity", "opportunityratingcode", "Rating", "RatingId"),
]

Option B: Manually sync the optionset

Use this pattern when you need explicit control or when you're working with a global optionset.

Local optionset example

Add a markdown cell such as:

### Sync: Rating (Local OptionSet)
optionset_sync.sync_optionset(
    entity_name="opportunity",
    optionset_name="opportunityratingcode",
    target_table="Rating",
    target_primary_key="RatingId",
    is_global=False
)

Global optionset example

Add a markdown cell such as:

### Sync: Need (Global OptionSet)
optionset_sync.sync_optionset(
    entity_name="",
    optionset_name="need",
    target_table="Need",
    target_primary_key="NeedId",
    is_global=True
)

Important

Global optionsets require is_global=True. Otherwise, the framework reads the wrong bronze metadata table.

Use the is_global parameter to match the correct bronze metadata source.

Value Bronze table Filter column Use case
False OptionsetMetadata OptionSetName Local optionsets bound to a specific entity
True GlobalOptionsetMetadata GlobalOptionSetName Global optionsets shared across entities

For global optionsets, entity_name can be an empty string because the definition isn't entity-specific.

Use the optionset as a foreign key

If you want the optionset to also appear on a silver transactional table, update the transformation and foreign key mapping.

For example, to use Rating and Need on Opportunity:

def transform_opportunity(df: DataFrame) -> DataFrame:
    return df.select(
        col("Id"),
        # ... existing columns ...
        col("opportunityratingcode_SilverRecordId").alias("RatingId"),
        col("need_SilverRecordId").alias("NeedId"),
    )
data_sync.sync_table(
    source_table="opportunity",
    source_primary_key="Id",
    source_columns=[
        "Id", "campaignid", "customerid", "customerid_entitytype",
        "actualclosedate", "createdon", "modifiedon",
        "estimatedvalue_base", "name", "statecode",
        "opportunityratingcode", "need"
    ],
    fk_mappings={
        "campaignid": "Campaign",
        "opportunityratingcode": "Rating",
        "need": "Need"
    }
)

Also update the silver Opportunity schema.

StructField("RatingId", StringType()),
StructField("NeedId", StringType()),

Enrich the optionset tables into gold

Open the Fundraising_SL_GD_Enrichment notebook and add gold enrichment sections for the new dimensions.

Add DimRating

def EnrichDimRating(df: DataFrame) -> DataFrame:
    return (
        df
        .select("RatingId", col("Name").alias("RatingName"))
        .dropna(subset=["RatingId"])
        .withColumn("RatingKey", xxhash64(col("RatingId")).cast("bigint"))
        .select("RatingKey", "RatingId", "RatingName")
    )

ratingTable = CdfTable(
    source_table_name="Rating",
    source_primary_key="RatingId",
    target_table_name="DimRating",
    columns=["RatingId", "Name", "CreatedDate", "ModifiedDate", "SourceId", "SourceSystemId"],
    merge_sql_template=f"""
    MERGE INTO {gold_lakehouse_name}.DimRating AS target
    USING latestSnapshot_Rating AS source
    ON target.RatingId = source.RatingId
    WHEN MATCHED THEN UPDATE SET
        RatingName = source.RatingName
    WHEN NOT MATCHED THEN INSERT (
        RatingKey, RatingId, RatingName
    ) VALUES (
        source.RatingKey, source.RatingId, source.RatingName
    )
    """,
    source_lakehouse=silver_lakehouse_name,
    target_lakehouse=gold_lakehouse_name,
    enrich_func=EnrichDimRating
)

ProcessCdfTable(ratingTable)

Add DimNeed

def EnrichDimNeed(df: DataFrame) -> DataFrame:
    return (
        df
        .select("NeedId", col("Name").alias("NeedName"))
        .dropna(subset=["NeedId"])
        .withColumn("NeedKey", xxhash64(col("NeedId")).cast("bigint"))
        .select("NeedKey", "NeedId", "NeedName")
    )

needTable = CdfTable(
    source_table_name="Need",
    source_primary_key="NeedId",
    target_table_name="DimNeed",
    columns=["NeedId", "Name", "CreatedDate", "ModifiedDate", "SourceId", "SourceSystemId"],
    merge_sql_template=f"""
    MERGE INTO {gold_lakehouse_name}.DimNeed AS target
    USING latestSnapshot_Need AS source
    ON target.NeedId = source.NeedId
    WHEN MATCHED THEN UPDATE SET
        NeedName = source.NeedName
    WHEN NOT MATCHED THEN INSERT (
        NeedKey, NeedId, NeedName
    ) VALUES (
        source.NeedKey, source.NeedId, source.NeedName
    )
    """,
    source_lakehouse=silver_lakehouse_name,
    target_lakehouse=gold_lakehouse_name,
    enrich_func=EnrichDimNeed
)

ProcessCdfTable(needTable)

Note

Unlike fact tables, optionset dimensions usually don't need hard_delete. Optionset values are typically stable and aren't deleted from the source system.

Update a fact table if you use the new foreign keys

If you add RatingId or NeedId to a fact table, update the gold fact schema, enrichment logic, and merge template so the new surrogate keys flow into the fact table.

For example, in EnrichFactOpportunity():

dim_rating = get_gold_table("DimRating").select("RatingId", "RatingKey").alias("dr")
dim_need = get_gold_table("DimNeed").select("NeedId", "NeedKey").alias("dn")

new_df = (
    df.join(dim_campaign, F.col("o.CampaignId") == F.col("dc.CampaignId"), "left")
      .join(dim_rating, F.col("o.RatingId") == F.col("dr.RatingId"), "left")
      .join(dim_need, F.col("o.NeedId") == F.col("dn.NeedId"), "left")
      .select(
          # ... existing columns ...
          F.col("dr.RatingKey"),
          F.col("dn.NeedKey"),
      )
)

If you extend a fact table with these keys, update all three places:

  • the fact table schema in Fundraising_GD_CreateSchema
  • the gold enrichment select list
  • the fact table merge statement in Fundraising_SL_GD_Enrichment

In an existing deployment, add the new columns to the already-created gold table before rerunning the pipeline.

spark.sql(f"ALTER TABLE {gold_lakehouse_name}.FactOpportunity ADD COLUMN RatingKey BIGINT")
spark.sql(f"ALTER TABLE {gold_lakehouse_name}.FactOpportunity ADD COLUMN NeedKey BIGINT")

Run and verify

  1. Run the Fundraising_Orchestration pipeline.
  2. Verify that the new silver tables contain data.
  3. Verify that the new gold dimensions contain data.
  4. If necessary, add the new dimensions to the semantic model.

For example:

SELECT * FROM silver_lakehouse.Rating;
SELECT * FROM gold_lakehouse.DimRating;

For semantic model guidance, see Example: Add a Dynamics 365 Sales table to a report.

Troubleshoot common issues

The optionset table is empty

Check the bronze metadata table that matches your is_global setting.

For a local optionset (is_global=False):

SELECT * FROM bronze_lakehouse.OptionsetMetadata
WHERE EntityName = 'opportunity'
  AND OptionSetName = 'opportunityratingcode'

For a global optionset (is_global=True):

SELECT * FROM bronze_lakehouse.GlobalOptionsetMetadata
WHERE EntityName = 'opportunity'
    AND GlobalOptionSetName = 'need'

Also verify that the is_global flag matches the source metadata table:

  • If data is in GlobalOptionsetMetadata -> is_global=True
  • If data is in OptionsetMetadata -> is_global=False

The foreign key resolves to NULL

Verify these items:

  • The silver dimension table contains rows before you run the entity transformation.
  • The fk_mappings entry matches the bronze column name exactly, which in Dataverse Link output is typically lowercase.
  • The SourceSystemIdMapping table contains rows for the new dimension.

You see duplicate optionset values

If the source contains multiple localized labels, filter to the label language you need, for example LocalizedLabelLanguageCode = 1033 for English. The out-of-box framework typically uses the correct metadata pattern, but the source should still be validated.

You need to customize the optionset values

Use a custom transform to add or change values during the sync.

def transform_rating(df: DataFrame) -> DataFrame:
    unknown = df.sparkSession.createDataFrame(
        [("Unknown", "-1")],
        ["Name", "SourceSystemId"]
    )
    return df.unionByName(unknown, allowMissingColumns=True)

optionset_sync.sync_optionset(
    entity_name="opportunity",
    optionset_name="opportunityratingcode",
    target_table="Rating",
    target_primary_key="RatingId",
    is_global=False,
    transform_func=transform_rating
)

See also