Редагувати

Поділитися через


Tables in data warehousing in Microsoft Fabric

Applies to: ✅ Warehouse in Microsoft Fabric

This article details key concepts for designing tables in Microsoft Fabric.

In tables, data is logically organized in a row-and-column format. Each row represents a unique record, and each column represents a field in the record.

  • In Warehouse, tables are database objects that contain all the transactional data.

Determine table category

A star schema organizes data into fact tables and dimension tables. Some tables are used for integration or staging data before moving to a fact or dimension table. As you design a table, decide whether the table data belongs in a fact, dimension, or integration table. This decision informs the appropriate table structure.

  • Fact tables contain quantitative data that are commonly generated in a transactional system, and then loaded into the data warehouse. For example, a retail business generates sales transactions every day, and then loads the data into a data warehouse fact table for analysis.

  • Dimension tables contain attribute data that might change but usually changes infrequently. For example, a customer's name and address are stored in a dimension table and updated only when the customer's profile changes. To minimize the size of a large fact table, the customer's name and address don't need to be in every row of a fact table. Instead, the fact table and the dimension table can share a customer ID. A query can join the two tables to associate a customer's profile and transactions.

  • Integration tables provide a place for integrating or staging data. For example, you can load data to a staging table, perform transformations on the data in staging, and then insert the data into a production table.

A table stores data in OneLake as part of the Warehouse. The table and the data persist whether or not a session is open.

Tables in the Warehouse

To show the organization of the tables, you could use fact, dim, or int as prefixes to the table names. The following table shows some of the schema and table names for WideWorldImportersDW sample data warehouse.

WideWorldImportersDW Source Table Name Table Type Data Warehouse Table Name
City Dimension wwi.DimCity
Order Fact wwi.FactOrder
  • Table names are case sensitive.
  • Table names can't contain / or \ or end with a ..

Create a table

For Warehouse, you can create a table as a new empty table. You can also create and populate a table with the results of a select statement. The following are the T-SQL commands for creating a table.

T-SQL Statement Description
CREATE TABLE Creates an empty table by defining all the table columns and options.
CREATE TABLE AS SELECT Populates a new table with the results of a select statement. The table columns and data types are based on the select statement results. To import data, this statement can select from an external table.

This example creates a table with two columns:

CREATE TABLE MyTable (col1 int, col2 int );  

Schema names

Warehouse supports the creation of custom schemas. Like in SQL Server, schemas are a good way to group together objects that are used in a similar fashion. The following code creates a user-defined schema called wwi.

  • Schema names are case sensitive.
  • Schema names can't contain / or \ or end with a ..
CREATE SCHEMA wwi;

Data types

Microsoft Fabric supports the most commonly used T-SQL data types.

Collation

Latin1_General_100_BIN2_UTF8 is the default collation for both tables and metadata.

You can create a warehouse with the case-insensitive (CI) collation Latin1_General_100_CI_AS_KS_WS_SC_UTF8. For more information, see How to: Create a warehouse with case insensitive (CI) collation.

Supported collations in the API are:

  • Latin1_General_100_BIN2_UTF8 (default)
  • Latin1_General_100_CI_AS_KS_WS_SC_UTF8

Once the collation is set during database creation, all subsequent objects (tables, columns, etc.) will inherit this default collation.

Statistics

The query optimizer uses column-level statistics when it creates the plan for executing a query. To improve query performance, it's important to have statistics on individual columns, especially columns used in query joins. Warehouse supports automatic creation of statistics.

Statistical updating doesn't happen automatically. Update statistics after a significant number of rows are added or changed. For instance, update statistics after a load. For more information, see Statistics.

Primary key, foreign key, and unique key

For Warehouse, PRIMARY KEY and UNIQUE constraint are only supported when NONCLUSTERED and NOT ENFORCED are both used.

FOREIGN KEY is only supported when NOT ENFORCED is used.

Align source data with the data warehouse

Warehouse tables are populated by loading data from another data source. To achieve a successful load, the number and data types of the columns in the source data must align with the table definition in the data warehouse.

If data is coming from multiple data stores, you can port the data into the data warehouse and store it in an integration table. Once data is in the integration table, you can use the power of data warehouse to implement transformation operations. Once the data is prepared, you can insert it into production tables.

Limitations

Warehouse supports many, but not all, of the table features offered by other databases.

The following list shows some of the table features that aren't currently supported.

  • 1024 maximum columns per table
  • Computed columns
  • Indexed views
  • Partitioned tables
  • Sequence
  • Sparse columns
  • Surrogate keys on number sequences with Identity columns
  • Synonyms
  • Temporary tables
  • Triggers
  • Unique indexes
  • User-defined types
  • External tables

Important

There are limitations with adding table constraints or columns when using Source Control with Warehouse.