Data Types (Data Mining)
Applies to: SQL Server 2019 and earlier Analysis Services Azure Analysis Services Fabric/Power BI Premium
Important
Data mining was deprecated in SQL Server 2017 Analysis Services and is now discontinued in SQL Server 2022 Analysis Services. Documentation is not updated for deprecated and discontinued features. To learn more, see Analysis Services backward compatibility.
When you create a mining model or a mining structure in SQL Server Analysis Services, you must define the data types for each of the columns in the mining structure. The data type tells the analysis engine whether the data in the data source is numerical or text, and how the data should be processed. For example, if your source data contains numerical data, you can specify whether the numbers should be treated as integers or as values with decimal places.
SQL Server Analysis Services supports the following data types for mining structure columns:
Data Type | Supported Content Types |
---|---|
Text | Cyclical, Discrete, Discretized, Key Sequence, Ordered, Sequence |
Long | Continuous, Cyclical, Discrete, Discretized, Key, Key Sequence, Key Time, Ordered, Sequence, Time, Classified |
Boolean | Cyclical, Discrete, Ordered |
Double | Continuous, Cyclical, Discrete, Discretized, Key, Key Sequence, Key Time, Ordered, Sequence, Time, Classified |
Date | Continuous, Cyclical, Discrete, Discretized, Key, Key Sequence, Key Time, Ordered |
Note
The Time and Sequence content types are only supported by third-party algorithms. The Cyclical and Ordered content types are supported, but most algorithms treat them as discrete values and do not perform special processing.
The table also shows the content types supported for each data type.
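The pairing of data types and content types in the table can be expressed as a simple lookup. The following Python sketch is only an illustration and is not part of Analysis Services: the dictionary and the `is_supported` helper are hypothetical names, and the sketch assumes that Time and Classified in the Long and Double rows are two separate content types.

```python
# Supported content types per data type, transcribed from the table above.
# Assumption: "Time" and "Classified" are treated as two separate content types.
NUMERIC_CONTENT_TYPES = {
    "Continuous", "Cyclical", "Discrete", "Discretized", "Key",
    "Key Sequence", "Key Time", "Ordered", "Sequence", "Time", "Classified",
}

SUPPORTED_CONTENT_TYPES = {
    "Text": {"Cyclical", "Discrete", "Discretized", "Key Sequence",
             "Ordered", "Sequence"},
    "Long": set(NUMERIC_CONTENT_TYPES),
    "Boolean": {"Cyclical", "Discrete", "Ordered"},
    "Double": set(NUMERIC_CONTENT_TYPES),
    "Date": {"Continuous", "Cyclical", "Discrete", "Discretized", "Key",
             "Key Sequence", "Key Time", "Ordered"},
}


def is_supported(data_type: str, content_type: str) -> bool:
    """Return True if the table lists content_type for data_type."""
    return content_type in SUPPORTED_CONTENT_TYPES.get(data_type, set())


print(is_supported("Long", "Continuous"))     # True
print(is_supported("Boolean", "Continuous"))  # False
```

A check like this can catch an unsupported combination, such as a Boolean column marked Continuous, before a column definition is written.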
The content type is specific to data mining and lets you customize the way that data is processed or calculated in the mining model. For example, even if your column contains numbers, you might need to model them as discrete values. If the column contains numbers, you can also specify that they be binned, or discretized, or specify that the model handle them as continuous values. Thus, the content type can have a significant effect on the model. For a list of all the content types, see Content Types (Data Mining).
Note
In other machine learning systems, you might encounter the terms nominal data, factors or categories, ordinal data, or sequence data. In general, these correspond to content types. In SQL Server, the data type specifies only the value type for storage, not its usage in the model.
Specifying a Data Type
If you create the mining model directly by using Data Mining Extensions (DMX), you can define the data type for each column as you define the model, and Analysis Services will create the corresponding mining structure with the specified data types at the same time. If you create the mining model or mining structure by using a wizard, Analysis Services will suggest a data type, or you can choose a data type from a list.
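To give a sense of what defining data types inline in DMX looks like, the following Python sketch assembles a CREATE MINING MODEL statement as text. It is a minimal illustration only: the `build_create_model` helper, the model name, and the column names are hypothetical, and the statement is printed rather than sent to a server.

```python
def build_create_model(model_name, columns, algorithm):
    """Assemble a DMX CREATE MINING MODEL statement as text.

    columns is a list of (name, data_type, content_type, is_predictable)
    tuples; each becomes one column definition in the statement.
    """
    lines = []
    for name, data_type, content_type, is_predictable in columns:
        parts = [f"[{name}]", data_type.upper(), content_type.upper()]
        if is_predictable:
            parts.append("PREDICT")
        lines.append("    " + " ".join(parts))
    body = ",\n".join(lines)
    return f"CREATE MINING MODEL [{model_name}]\n(\n{body}\n)\nUSING {algorithm}"


# Hypothetical model and column names, chosen for illustration only.
statement = build_create_model(
    "Bike Buyer Sketch",
    [
        ("Customer Key", "Long", "Key", False),
        ("Age", "Long", "Continuous", False),
        ("Gender", "Text", "Discrete", False),
        ("Bike Buyer", "Long", "Discrete", True),
    ],
    "Microsoft_Decision_Trees",
)
print(statement)
```

Each column definition pairs a name with a data type (LONG, TEXT) and a content type (KEY, CONTINUOUS, DISCRETE), which is the information Analysis Services uses to create the corresponding mining structure.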
Changing a Data Type
If you change the data type of a column, you must always reprocess the mining structure and any mining models that are based on that structure. In some cases, a column whose data type has changed can no longer be used in a particular model. In that case, Analysis Services either raises an error when you reprocess the model, or processes the model but leaves out that column.
See Also
Content Types (Data Mining)
Content Types (DMX)
Data Mining Algorithms (Analysis Services - Data Mining)
Mining Structures (Analysis Services - Data Mining)
Data Types (DMX)
Mining Model Columns
Mining Structure Columns