Notitie
Voor toegang tot deze pagina is autorisatie vereist. U kunt proberen u aan te melden of mappen te wijzigen.
Voor toegang tot deze pagina is autorisatie vereist. U kunt proberen om mappen te wijzigen.
Important
Some or all of this functionality is available as part of a preview release. The content and the functionality are subject to change.
This example shows how to add a Dynamics 365 Sales optionset as a reusable dimension in Nonprofit data solutions and then use it in downstream reporting.
Two scenarios are covered:
- A local optionset such as
opportunityratingcode - A global optionset such as
need
Data flow:
Local OptionSet: Dynamics 365 Sales -> Bronze OptionsetMetadata -> Silver Dimension -> Gold Dimension -> Report
Global OptionSet: Dynamics 365 Sales -> Bronze GlobalOptionsetMetadata -> Silver Dimension -> Gold Dimension -> Report
Understand optionset types
Dynamics 365 Sales supports two kinds of optionsets.
| Type | Description | is_global |
Bronze table |
|---|---|---|---|
| Local optionset | Defined on one entity | False |
OptionsetMetadata |
| Global optionset | Shared across entities | True |
GlobalOptionsetMetadata |
The framework reads from the correct bronze metadata table based on the is_global flag.
Example mappings used in this article
| Example | Type | Entity | Optionset name | Silver table | Gold table |
|---|---|---|---|---|---|
| Rating | Local | opportunity |
opportunityratingcode |
Rating |
DimRating |
| Need | Global | opportunity |
need |
Need |
DimNeed |
Add the silver schema
Open the Fundraising_SL_CreateSchema notebook and add schema methods for the new optionset dimensions.
Add the Rating schema
@staticmethod
def get_rating_schema() -> StructType:
return StructType([
StructField("RatingId", StringType(), nullable=False, metadata={ "primaryKey": True, "pkType": "guid" }),
StructField("CreatedDate", TimestampType()),
StructField("ModifiedDate", TimestampType()),
StructField("Name", StringType()),
StructField("SourceId", StringType(), nullable=False),
StructField("SourceSystemId", StringType())
])
Add the Need schema
@staticmethod
def get_need_schema() -> StructType:
return StructType([
StructField("NeedId", StringType(), nullable=False, metadata={ "primaryKey": True, "pkType": "guid" }),
StructField("CreatedDate", TimestampType()),
StructField("ModifiedDate", TimestampType()),
StructField("Name", StringType()),
StructField("SourceId", StringType(), nullable=False),
StructField("SourceSystemId", StringType())
])
Register both tables in get_entities().
def get_entities(self) -> list[tuple[StructType, str, bool]]:
return [
# ... existing tables ...
(NonprofitSilverModel.get_rating_schema(), "Rating", True),
(NonprofitSilverModel.get_need_schema(), "Need", True),
]
Note
Set the third parameter to True so the table supports Change Data Feed (CDF) processing in the gold layer.
Add the gold schema
Open the Fundraising_GD_CreateSchema notebook and add dimension schemas for the gold layer.
Gold dimension tables for optionsets are simple: surrogate key, business key, and display name.
Add the DimRating schema
@staticmethod
def get_dimrating_schema() -> StructType:
return StructType([
StructField("RatingName", StringType(), nullable=False, metadata={ "primaryKey": True }),
StructField("RatingKey", LongType(), nullable=False),
StructField("RatingId", StringType(), nullable=False)
])
Add the DimNeed schema
@staticmethod
def get_dimneed_schema() -> StructType:
return StructType([
StructField("NeedName", StringType(), nullable=False, metadata={ "primaryKey": True }),
StructField("NeedKey", LongType(), nullable=False),
StructField("NeedId", StringType(), nullable=False)
])
Register both gold tables in get_entities().
def get_entities(self) -> list[tuple[StructType, str, bool]]:
return [
# ... existing tables ...
(NonprofitGoldModel.get_dimrating_schema(), "DimRating", True),
(NonprofitGoldModel.get_dimneed_schema(), "DimNeed", True),
]
This pattern follows the existing optionset dimensions in the solution, such as DimCampaignType: a compact gold structure with a surrogate key, business key, and display name.
Sync the optionset values into silver
Open the Fundraising_D365_Transform notebook.
Option A: Add a local optionset to optionsets_to_sync
Use this option for local optionsets only.
The optionsets_to_sync list uses the default is_global=False, so this shorthand works only for local optionsets.
optionsets_to_sync = [
# ... existing optionsets ...
("opportunity", "opportunityratingcode", "Rating", "RatingId"),
]
Option B: Manually sync the optionset
Use this pattern when you need explicit control or when you're working with a global optionset.
Local optionset example
Add a markdown cell such as:
### Sync: Rating (Local OptionSet)
optionset_sync.sync_optionset(
entity_name="opportunity",
optionset_name="opportunityratingcode",
target_table="Rating",
target_primary_key="RatingId",
is_global=False
)
Global optionset example
Add a markdown cell such as:
### Sync: Need (Global OptionSet)
optionset_sync.sync_optionset(
entity_name="",
optionset_name="need",
target_table="Need",
target_primary_key="NeedId",
is_global=True
)
Important
Global optionsets require is_global=True. Otherwise, the framework reads the wrong bronze metadata table.
Use the is_global parameter to match the correct bronze metadata source.
| Value | Bronze table | Filter column | Use case |
|---|---|---|---|
False |
OptionsetMetadata |
OptionSetName |
Local optionsets bound to a specific entity |
True |
GlobalOptionsetMetadata |
GlobalOptionSetName |
Global optionsets shared across entities |
For global optionsets, entity_name can be an empty string because the definition isn't entity-specific.
Use the optionset as a foreign key
If you want the optionset to also appear on a silver transactional table, update the transformation and foreign key mapping.
For example, to use Rating and Need on Opportunity:
def transform_opportunity(df: DataFrame) -> DataFrame:
return df.select(
col("Id"),
# ... existing columns ...
col("opportunityratingcode_SilverRecordId").alias("RatingId"),
col("need_SilverRecordId").alias("NeedId"),
)
data_sync.sync_table(
source_table="opportunity",
source_primary_key="Id",
source_columns=[
"Id", "campaignid", "customerid", "customerid_entitytype",
"actualclosedate", "createdon", "modifiedon",
"estimatedvalue_base", "name", "statecode",
"opportunityratingcode", "need"
],
fk_mappings={
"campaignid": "Campaign",
"opportunityratingcode": "Rating",
"need": "Need"
}
)
Also update the silver Opportunity schema.
StructField("RatingId", StringType()),
StructField("NeedId", StringType()),
Enrich the optionset tables into gold
Open the Fundraising_SL_GD_Enrichment notebook and add gold enrichment sections for the new dimensions.
Add DimRating
def EnrichDimRating(df: DataFrame) -> DataFrame:
return (
df
.select("RatingId", col("Name").alias("RatingName"))
.dropna(subset=["RatingId"])
.withColumn("RatingKey", xxhash64(col("RatingId")).cast("bigint"))
.select("RatingKey", "RatingId", "RatingName")
)
ratingTable = CdfTable(
source_table_name="Rating",
source_primary_key="RatingId",
target_table_name="DimRating",
columns=["RatingId", "Name", "CreatedDate", "ModifiedDate", "SourceId", "SourceSystemId"],
merge_sql_template=f"""
MERGE INTO {gold_lakehouse_name}.DimRating AS target
USING latestSnapshot_Rating AS source
ON target.RatingId = source.RatingId
WHEN MATCHED THEN UPDATE SET
RatingName = source.RatingName
WHEN NOT MATCHED THEN INSERT (
RatingKey, RatingId, RatingName
) VALUES (
source.RatingKey, source.RatingId, source.RatingName
)
""",
source_lakehouse=silver_lakehouse_name,
target_lakehouse=gold_lakehouse_name,
enrich_func=EnrichDimRating
)
ProcessCdfTable(ratingTable)
Add DimNeed
def EnrichDimNeed(df: DataFrame) -> DataFrame:
return (
df
.select("NeedId", col("Name").alias("NeedName"))
.dropna(subset=["NeedId"])
.withColumn("NeedKey", xxhash64(col("NeedId")).cast("bigint"))
.select("NeedKey", "NeedId", "NeedName")
)
needTable = CdfTable(
source_table_name="Need",
source_primary_key="NeedId",
target_table_name="DimNeed",
columns=["NeedId", "Name", "CreatedDate", "ModifiedDate", "SourceId", "SourceSystemId"],
merge_sql_template=f"""
MERGE INTO {gold_lakehouse_name}.DimNeed AS target
USING latestSnapshot_Need AS source
ON target.NeedId = source.NeedId
WHEN MATCHED THEN UPDATE SET
NeedName = source.NeedName
WHEN NOT MATCHED THEN INSERT (
NeedKey, NeedId, NeedName
) VALUES (
source.NeedKey, source.NeedId, source.NeedName
)
""",
source_lakehouse=silver_lakehouse_name,
target_lakehouse=gold_lakehouse_name,
enrich_func=EnrichDimNeed
)
ProcessCdfTable(needTable)
Note
Unlike fact tables, optionset dimensions usually don't need hard_delete. Optionset values are typically stable and aren't deleted from the source system.
Update a fact table if you use the new foreign keys
If you add RatingId or NeedId to a fact table, update the gold fact schema, enrichment logic, and merge template so the new surrogate keys flow into the fact table.
For example, in EnrichFactOpportunity():
dim_rating = get_gold_table("DimRating").select("RatingId", "RatingKey").alias("dr")
dim_need = get_gold_table("DimNeed").select("NeedId", "NeedKey").alias("dn")
new_df = (
df.join(dim_campaign, F.col("o.CampaignId") == F.col("dc.CampaignId"), "left")
.join(dim_rating, F.col("o.RatingId") == F.col("dr.RatingId"), "left")
.join(dim_need, F.col("o.NeedId") == F.col("dn.NeedId"), "left")
.select(
# ... existing columns ...
F.col("dr.RatingKey"),
F.col("dn.NeedKey"),
)
)
If you extend a fact table with these keys, update all three places:
- the fact table schema in
Fundraising_GD_CreateSchema - the gold enrichment select list
- the fact table merge statement in
Fundraising_SL_GD_Enrichment
In an existing deployment, add the new columns to the already-created gold table before rerunning the pipeline.
spark.sql(f"ALTER TABLE {gold_lakehouse_name}.FactOpportunity ADD COLUMN RatingKey BIGINT")
spark.sql(f"ALTER TABLE {gold_lakehouse_name}.FactOpportunity ADD COLUMN NeedKey BIGINT")
Run and verify
- Run the
Fundraising_Orchestrationpipeline. - Verify that the new silver tables contain data.
- Verify that the new gold dimensions contain data.
- If necessary, add the new dimensions to the semantic model.
For example:
SELECT * FROM silver_lakehouse.Rating;
SELECT * FROM gold_lakehouse.DimRating;
For semantic model guidance, see Example: Add a Dynamics 365 Sales table to a report.
Troubleshoot common issues
The optionset table is empty
Check the bronze metadata table that matches your is_global setting.
For a local optionset (is_global=False):
SELECT * FROM bronze_lakehouse.OptionsetMetadata
WHERE EntityName = 'opportunity'
AND OptionSetName = 'opportunityratingcode'
For a global optionset (is_global=True):
SELECT * FROM bronze_lakehouse.GlobalOptionsetMetadata
WHERE EntityName = 'opportunity'
AND GlobalOptionSetName = 'need'
Also verify that the is_global flag matches the source metadata table:
- If data is in
GlobalOptionsetMetadata->is_global=True - If data is in
OptionsetMetadata->is_global=False
The foreign key resolves to NULL
Verify these items:
- The silver dimension table contains rows before you run the entity transformation.
- The
fk_mappingsentry matches the bronze column name exactly, which in Dataverse Link output is typically lowercase. - The
SourceSystemIdMappingtable contains rows for the new dimension.
You see duplicate optionset values
If the source contains multiple localized labels, filter to the label language you need, for example LocalizedLabelLanguageCode = 1033 for English. The out-of-box framework typically uses the correct metadata pattern, but the source should still be validated.
You need to customize the optionset values
Use a custom transform to add or change values during the sync.
def transform_rating(df: DataFrame) -> DataFrame:
unknown = df.sparkSession.createDataFrame(
[("Unknown", "-1")],
["Name", "SourceSystemId"]
)
return df.unionByName(unknown, allowMissingColumns=True)
optionset_sync.sync_optionset(
entity_name="opportunity",
optionset_name="opportunityratingcode",
target_table="Rating",
target_primary_key="RatingId",
is_global=False,
transform_func=transform_rating
)