Attributes Table
This table describes the properties of each attribute used in the model: the distribution type of the attribute, whether the attribute is used to build the model, and whether it is to be predicted by the model. Entries in the attributes table are required only when the default behavior needs to be overridden. By default, each attribute is both used to make predictions and is itself predicted, and the distribution is Autodetect. A blank attributes table is created if one is not specified.
Note
- There is no table named Attributes. Each model configuration specifies the name of its attributes table as an entry in the PredictorDataTables table.
Column Name | Type | Description | Required? |
PropID | DBTYPE_UI4 | Unique key – Property ID. | Yes. |
ParentID | DBTYPE_UI4 | ID of parent (0 = root). | Yes. |
Name | DBTYPE_WSTR | Name of attribute, hierarchy element, or Pivot Column. | Yes. |
TableName | DBTYPE_WSTR | Not used. | No. |
ColumnName | DBTYPE_WSTR | Not used. | No. |
Distribution | DBTYPE_UI2 | Distribution type. Legend for the valid values: B - Model As Binary; NB - Not Model As Binary. | No. NULL implies Autodetect. Must be NULL if row describes a hierarchy element. |
Predict | DBTYPE_BOOL | Indicates whether this is an output property. | No. NULL implies True. Must be NULL if row describes a hierarchy element. |
UseToPredict | DBTYPE_BOOL | Indicates whether this is an input property. | No. NULL implies True. Must be NULL if row describes a hierarchy element. |
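The NULL defaults in the table can be illustrated with a minimal Python sketch. The `AttributeRow` class and the `effective_settings` function are hypothetical names introduced only for this illustration; they are not part of the product, and `None` stands in for a database NULL. The sketch covers attribute rows only, since hierarchy element rows must leave these columns NULL.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AttributeRow:
    """Hypothetical in-memory view of one row of an attributes table."""
    prop_id: int
    parent_id: int                         # 0 = root
    name: str
    distribution: Optional[int] = None     # None (NULL) implies Autodetect
    predict: Optional[bool] = None         # None (NULL) implies True
    use_to_predict: Optional[bool] = None  # None (NULL) implies True


def effective_settings(row: AttributeRow) -> dict:
    """Apply the documented NULL defaults to an attribute row."""
    return {
        "distribution": "Autodetect" if row.distribution is None else row.distribution,
        "predict": True if row.predict is None else row.predict,
        "use_to_predict": True if row.use_to_predict is None else row.use_to_predict,
    }


# A row that overrides nothing behaves like the default: used for
# prediction, predicted, and with an autodetected distribution.
print(effective_settings(AttributeRow(prop_id=1, parent_id=0, name="Income")))

# A row that overrides only Predict keeps the other defaults.
print(effective_settings(AttributeRow(prop_id=2, parent_id=0, name="Age", predict=False)))
```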
Distribution Attribute
Discrete vs. Continuous. Discrete means that only certain data values are legal and there is no specific relationship between sequential values. For example, the two-letter state abbreviations are discrete. All non-numeric attributes are treated as discrete. Numerical attributes may or may not be discrete.
Normal vs. LogNormal. Certain continuous data, such as Income, is better represented by a Normal (Gaussian) distribution, while other data, such as the number of products purchased, better fits a LogNormal distribution, which is skewed towards 0. An attribute has a LogNormal distribution when the logarithm of the attribute has a Normal distribution.
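The relationship between the two distributions can be checked numerically. The following is a small sketch using only the Python standard library; the parameters `MU` and `SIGMA`, the seed, and the sample size are arbitrary choices for illustration and do not come from the product.

```python
import math
import random
import statistics

# Fixed seed so the illustration is repeatable; the parameters are arbitrary.
rng = random.Random(0)
MU, SIGMA = 1.0, 0.5

# Hypothetical attribute values drawn from a LogNormal distribution.
values = [rng.lognormvariate(MU, SIGMA) for _ in range(10_000)]

# Taking the logarithm should give data that is approximately Normal(MU, SIGMA).
logs = [math.log(v) for v in values]

# The raw values are skewed: the mean sits above the median.
print("raw mean, median:", statistics.mean(values), statistics.median(values))

# The logged values have a mean close to MU and a standard deviation close to SIGMA.
print("log mean, stdev:", statistics.mean(logs), statistics.stdev(logs))
```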
Autodetect. An algorithm can be used to detect the above properties automatically. For discrete attributes, the algorithm also eliminates attributes that have only one distinct value, or too many distinct values (over 500) to be useful. Note that "missing" counts as a distinct value.
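The elimination rule for discrete attributes can be sketched as follows. This is only an illustration of the rule as stated here, not the product's detection algorithm; the function name `is_useful_discrete` is hypothetical, and `None` is assumed to represent a missing value.

```python
MAX_DISTINCT = 500  # threshold stated in the text ("over 500" is too many)


def is_useful_discrete(values, max_distinct=MAX_DISTINCT):
    """Return True if a discrete attribute would be kept under the stated rule.

    None stands for a missing value and counts as its own distinct value,
    because it is simply one more member of the set.
    """
    distinct = len(set(values))
    return 1 < distinct <= max_distinct


# Only one distinct value: eliminated.
print(is_useful_discrete(["WA", "WA", "WA"]))

# One real value plus "missing" makes two distinct values: kept.
print(is_useful_discrete(["WA", None, "WA"]))

# 501 distinct values: eliminated as too many to be useful.
print(is_useful_discrete(list(range(501))))
```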
Model As Binary. For a property that has the Model As Binary attribute, the model is only concerned with whether the property exists, not with the value of the property. For example, for an attribute that corresponds to the number of products purchased, it may be more useful to model whether the product was purchased at all than to model the quantity purchased.
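The purchase example can be sketched in Python as a simple transformation. The function `as_binary` is a hypothetical name, and the sketch assumes that a missing value (`None`) or a quantity of 0 both mean the product was not purchased; how the product itself decides that a property "exists" is not specified here.

```python
def as_binary(quantity):
    """Reduce a purchase quantity to purchased / not purchased.

    Assumption for this sketch: None (missing) and 0 both mean
    that the product was not purchased.
    """
    return quantity is not None and quantity > 0


# Hypothetical quantities purchased by five customers.
quantities = [3, 0, None, 1, 12]

# Only presence is kept; 3 and 12 become indistinguishable.
purchased = [as_binary(q) for q in quantities]
print(purchased)
```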