Редагувати

Поділитися через


What is data warehousing in Microsoft Fabric?

Applies to: ✅ SQL analytics endpoint and Warehouse in Microsoft Fabric

Microsoft Fabric is a next-generation data warehousing solution within Microsoft Fabric.

The lake-centric warehouse is built on an enterprise grade distributed processing engine that enables industry leading performance at scale while minimizing the need for configuration and management. Living in the data lake and designed to natively support open data formats, Fabric data warehouse enables seamless collaboration between data engineers and business users without compromising security or governance.

The easy-to-use SaaS experience is also tightly integrated with Power BI for easy analysis and reporting, converging the world of data lakes and warehouses and greatly simplifying an organizations investment in their analytics estate. 

Data warehouse customers benefit from:

  • Data stored in Delta-parquet format enables ACID transactions and interoperability with other Fabric workloads means you don't need multiple copies of data.
  • Cross database queries can use multiple data sources for fast insights with zero data duplication.
  • Easily ingest, load and transform data at scale through Pipelines, Dataflows, cross database query or the COPY INTO command.
  • Autonomous workload management with industry-leading distributed query processing engine means no knobs to turn to achieve best in class performance.
  • Scale near instantaneously to meet business demands. Storage and compute are separated.
  • Reduced time to insights with an easily consumable, always connected semantic model that is integrated with Power BI in Direct Lake mode. Reports always have the most recent data for analysis and reporting.
  • Built for any skill level, from the citizen developer to DBA or data engineer.

Data warehousing items

Fabric Warehouse is not a traditional enterprise data warehouse, it's a lake warehouse that supports two distinct warehousing items: the Fabric data warehouse and the SQL analytics endpoint. Both are purpose-built to meet customers' business needs while providing best in class performance, minimizing costs, and reduced administrative overhead.

Synapse Data Warehouse

In a Microsoft Fabric workspace, a Synapse Data Warehouse or Warehouse is labeled as 'Warehouse' in the Type column. When you need the full power and transactional capabilities (DDL and DML query support) of a data warehouse, this is the fast and simple solution for you.

Screenshot showing the Warehouse type in workspace.

The warehouse can be populated by any one of the supported data ingestion methods such as COPY INTOPipelinesDataflows, or cross database ingestion options such as CREATE TABLE AS SELECT (CTAS)INSERT..SELECT, or SELECT INTO.

To get started with the Warehouse, see:

SQL analytics endpoint of the Lakehouse

In a Microsoft Fabric workspace, each Lakehouse has an autogenerated "SQL analytics endpoint" which can be used to transition from the "Lake" view of the Lakehouse (which supports data engineering and Apache Spark) to the "SQL" view of the same Lakehouse to create views, functions, stored procedures, and apply SQL security.

Screenshot showing the SQL analytics endpoint type in workspace.

With the SQL analytics endpoint of the Lakehouse, T-SQL commands can define and query data objects but not manipulate or modify the data. You can perform the following actions in the SQL analytics endpoint:

  • Query the tables that reference data in your Delta Lake folders in the lake.
  • Create views, inline TVFs, and procedures to encapsulate your semantics and business logic in T-SQL.
  • Manage permissions on the objects.

To get started with the SQL analytics endpoint, see:

Warehouse or lakehouse

When deciding between using a warehouse or a lakehouse, it's important to consider the specific needs and context of your data management and analytics requirements. Equally important, this is not a one way decision!

You always have the opportunity to add one or the other at a later point should your business needs change and regardless of where you start, both the warehouse and the lakehouse use the same powerful SQL engine for all T-SQL queries.

Here are some general guidelines to help you make the decision:

  • Choose a data warehouse when you need an enterprise-scale solution with open standard format, no knobs performance, and minimal setup.  Best suited for semi-structured and structured data formats, the data warehouse is suitable for both beginner and experienced data professionals, offering simple and intuitive experiences.

  • Choose a lakehouse when you need a large repository of highly unstructured data from heterogeneous sources, leveraging low-cost object storage and want to use SPARK as your primary development tool. Acting as a 'lightweight' data warehouse, you always have the option to use the SQL endpoint and T-SQL tools to deliver reporting and data intelligence scenarios in your lakehouse.

For more detailed decision guidance, see Microsoft Fabric decision guide: Choose between Warehouse and Lakehouse.