What is semantic link (preview)?
Semantic link is a feature that allows you to establish a connection between semantic models and Synapse Data Science in Microsoft Fabric. Use of semantic link is only supported in Microsoft Fabric.
This feature is in preview.
The primary goals of semantic link are to facilitate data connectivity, enable the propagation of semantic information, and integrate with established tools used by data scientists, such as notebooks. Semantic link helps you preserve domain knowledge about data semantics in a standardized way, which can speed up data analysis and reduce errors.
Overview of semantic link
The data flow starts with semantic models that contain data and semantic information. Semantic link bridges the gap between Power BI and the Data Science experience.
With semantic link, you can use semantic models from Power BI in the Data Science experience to perform tasks such as in-depth statistical analysis and predictive modeling with machine learning techniques. The output of your data science work can be stored in OneLake using Apache Spark and ingested into Power BI using Direct Lake.
Power BI connectivity
Semantic models serve as the single tabular object model, providing a reliable source for semantic definitions, such as Power BI measures. To connect to semantic models:
- Semantic link offers data connectivity to the Python pandas ecosystem via the SemPy Python library, making it easy for data scientists to work with the data.
- Semantic link provides access to semantic models through the Spark native connector for data scientists who are more familiar with the Apache Spark ecosystem. This implementation supports various languages, including PySpark, Spark SQL, R, and Scala.
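As a minimal sketch of the two connection paths above, the following Python helpers assume they run in a Fabric notebook where SemPy is available. The helper names, the dataset name Sales Dataset, the table name Customer, and the pbi catalog name are illustrative assumptions. The catalog name also assumes the Spark session is already configured to expose semantic models under that name.

```python
def read_model_table(dataset: str, table: str):
    """Load one table of a semantic model through SemPy (pandas path)."""
    # Imported lazily: SemPy is only available inside Microsoft Fabric.
    import sempy.fabric as fabric

    # Returns a FabricDataFrame, which behaves like a pandas DataFrame.
    return fabric.read_table(dataset, table)


def spark_table_identifier(dataset: str, table: str, catalog: str = "pbi") -> str:
    """Build a three-part, backtick-quoted name for use in Spark SQL (Spark path).

    The default catalog name "pbi" is an assumption about how the
    Spark session was configured.
    """
    def quote(name: str) -> str:
        # Backticks let Spark SQL accept names that contain spaces.
        return "`" + name.replace("`", "``") + "`"

    return f"{catalog}.{quote(dataset)}.{quote(table)}"


# Hypothetical usage in a Fabric notebook:
# customers = read_model_table("Sales Dataset", "Customer")
# spark.sql(f"SELECT * FROM {spark_table_identifier('Sales Dataset', 'Customer')}")
```

Importing SemPy inside the function keeps the module importable outside Fabric, where the library isn't supported.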
Applications of semantic information
Semantic information in data includes Power BI data categories such as address and postal code, relationships between tables, and hierarchical information. These data categories comprise metadata that semantic link propagates into the Data Science environment to enable new experiences and maintain data lineage. Some example applications of semantic link are:
- Intelligent suggestions of built-in semantic functions.
- Augmentation of data with Power BI measures through the add_measure method.
- Tools for data quality validation based on the relationships between tables and functional dependencies within tables.
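To make the idea of validation through functional dependencies concrete, here's a small plain-Python sketch. It isn't SemPy's implementation. The dependency_violations helper and the sample rows are hypothetical and only illustrate the check itself: a column A functionally determines a column B when every value of A maps to exactly one value of B.

```python
from collections import defaultdict


def dependency_violations(rows, determinant, dependent):
    """Return determinant values that map to more than one dependent value."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[determinant]].add(row[dependent])
    # A functional dependency holds only if each key maps to a single value.
    return {key: sorted(values) for key, values in seen.items() if len(values) > 1}


# Hypothetical sample data: postal code should determine city.
rows = [
    {"postal_code": "98052", "city": "Redmond"},
    {"postal_code": "98052", "city": "Redmond"},
    {"postal_code": "98101", "city": "Seattle"},
    {"postal_code": "98101", "city": "Seatle"},  # typo breaks the dependency
]

violations = dependency_violations(rows, "postal_code", "city")
print(violations)
```

In this sample, the misspelled city name is reported because one postal code maps to two different city values. That's the kind of data quality issue a dependency check is meant to surface.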
Semantic link lets data scientists and business analysts work from the same business logic. Because Power BI measures can be used directly in the Data Science environment, data scientists don't need to reimplement the logic embedded in those measures, and both groups can rely on consistent definitions when they collaborate.
FabricDataFrame data structure
FabricDataFrame is the core data structure of semantic link. It subclasses the pandas DataFrame and adds metadata, such as semantic information and lineage. FabricDataFrame is the primary data structure that semantic link uses to propagate semantic information from semantic models into the Data Science environment.
FabricDataFrame supports all pandas operations. In addition, it exposes semantic functions and the add_measure method, which let you use Power BI measures in your data science work.
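The following sketch shows how a FabricDataFrame might be built and augmented with a measure. It assumes a Fabric notebook with SemPy available and a semantic model named Sales Dataset that defines a Total Revenue measure. The function name, the dataset and measure names, and the column values are all hypothetical.

```python
def add_revenue_measure(dataset: str = "Sales Dataset", measure: str = "Total Revenue"):
    """Build a small FabricDataFrame and join a Power BI measure onto it."""
    # Imported lazily: SemPy is only available inside Microsoft Fabric.
    from sempy.fabric import FabricDataFrame

    # FabricDataFrame subclasses the pandas DataFrame, so it can be
    # constructed the same way. Columns written as Table[Column] are
    # assumed to match columns of the semantic model.
    frame = FabricDataFrame(
        {
            "Sales Agent": ["Agent 1", "Agent 1", "Agent 2"],
            "Customer[Country/Region]": ["US", "GB", "US"],
            "Industry[Industry]": ["Services", "CPG", "Manufacturing"],
        }
    )

    # add_measure evaluates the measure in the semantic model and
    # returns the data with the measure values joined on.
    return frame.add_measure(measure, dataset=dataset)


# Hypothetical usage in a Fabric notebook:
# enriched = add_revenue_measure()
# enriched.head()
```

Because the result is still a DataFrame, you can continue with ordinary pandas operations such as filtering or grouping.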
Related content
- Deepen your expertise in SemPy through the SemPy reference documentation
- Tutorial: Clean data with functional dependencies (preview)
- Learn more about semantic link and Power BI connectivity (preview)
- How to validate data with semantic link (preview)
- Explore and validate relationships in semantic models (preview)