summary.mlModel: Summary of a Microsoft R Machine Learning model.

Article
05/04/2023

Summary of a Microsoft R Machine Learning model.

Usage

## S3 method for class 'mlModel':
summary(object, top = 20, ...)

Arguments

`object`

A model object returned from a MicrosoftML analysis.

`top`

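Specifies how many top coefficients to show in the summary for linear models such as rxLogisticRegression and rxFastLinear. The bias appears first, followed by the other weights sorted by absolute value in descending order. If set to NULL, all non-zero coefficients are shown. Otherwise, only the first `top` coefficients are shown.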
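The ordering described for `top` can be sketched in base R. This is only an illustration under stated assumptions: the weight values and the helper name `topCoefficients` are hypothetical, the bias is assumed to be stored under the name `(Bias)`, and counting the bias toward `top` is a guess. It is not MicrosoftML's implementation.

```r
# Hypothetical named weight vector; "(Bias)" is an assumed name for the bias term.
weights <- c("(Bias)" = -1.2, age = 0.03, parity = -0.85, induced = 1.4, education = 0)

# Illustrative helper (not part of MicrosoftML): bias first, then the other
# non-zero weights sorted by decreasing absolute value, truncated to `top`.
# Assumption: the bias counts toward `top`.
topCoefficients <- function(w, top = 20, biasName = "(Bias)") {
  bias <- w[names(w) == biasName]
  rest <- w[names(w) != biasName & w != 0]
  rest <- rest[order(abs(rest), decreasing = TRUE)]
  ordered <- c(bias, rest)
  if (is.null(top)) ordered else head(ordered, top)
}

topCoefficients(weights, top = 3)     # bias plus the two largest weights by absolute value
topCoefficients(weights, top = NULL)  # all non-zero coefficients
```

With a fitted model, the corresponding call would be `summary(object, top = 3)` or `summary(object, top = NULL)`.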

`...`

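Additional arguments to be passed to the summary method.

Details

Provides summary information about the original function call, the data set used to train the model, and statistics for the coefficients in the model.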

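Value

The summary method for MicrosoftML analysis objects returns a list that includes the original function call and the underlying parameters used. The coef method returns a named vector of weights, processing information from the model object.

For rxLogisticRegression, the following statistics may also be present in the summary when showTrainingStats is set to TRUE.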

`training.size`

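The size of the data set used to train the model, in terms of row count.

`deviance`

The model deviance is given by -2 * ln(L), where L is the likelihood of obtaining the observations with all features incorporated in the model.

`null.deviance`

The null deviance is given by -2 * ln(L0), where L0 is the likelihood of obtaining the observations with no effect from the features. The null model includes the bias if there is one in the model.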

`aic`

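The AIC (Akaike Information Criterion) is defined as 2 * k + deviance, where k is the number of coefficients of the model. The bias counts as one of the coefficients. The AIC is a measure of the relative quality of the model. It deals with the trade-off between the goodness of fit of the model (measured by deviance) and the complexity of the model (measured by the number of coefficients).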
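The definitions of `deviance`, `null.deviance`, and `aic` above can be checked by hand. The sketch below uses base R's `glm` as a stand-in for an unregularized logistic regression on a 0/1 response. It is not the MicrosoftML computation, and the choice of formula is only for illustration.

```r
# Base R stand-in: an unregularized logistic regression on the built-in infert data.
fit <- glm(case ~ age + parity + spontaneous + induced, family = binomial, data = infert)

# Log-likelihood of the fitted model and of the bias-only (null) model.
logL  <- as.numeric(logLik(fit))
logL0 <- as.numeric(logLik(update(fit, . ~ 1)))

modelDeviance <- -2 * logL    # -2 * ln(L)
nullDeviance  <- -2 * logL0   # -2 * ln(L0)

k   <- length(coef(fit))      # number of coefficients, bias (intercept) included
aic <- 2 * k + modelDeviance  # 2 * k + deviance

c(deviance = modelDeviance, null.deviance = nullDeviance, aic = aic)
```

For this binary-response fit, the three values agree with `deviance(fit)`, `fit$null.deviance`, and `AIC(fit)`.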

`coefficients.stats`

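A data frame containing the statistics for each coefficient in the model. For each coefficient, the following statistics are shown. The bias appears in the first row, and the remaining coefficients are listed in ascending order of p-value.

Estimate: The estimated coefficient value of the model.
Std Error: The square root of the large-sample variance of the coefficient estimate.
z-Score: The ratio of a coefficient's estimate to its standard error. It is used to test the significance of the coefficient against the null hypothesis, which states that the coefficient is zero. Under the null hypothesis, and if no regularization is applied, the estimate of the coefficient follows a normal distribution with mean 0 and a standard deviation equal to the standard error computed above.
Pr(>|z|): The p-value for the two-sided test of the z-score. A significance indicator is appended to the p-value based on the significance level. If F(x) is the CDF of the standard normal distribution N(0, 1), then Pr(>|z|) = 2 - 2 * F(|z|).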
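The z-score and p-value formulas above translate directly into base R, where `pnorm` is the standard normal CDF. The following is a minimal sketch, assuming hypothetical estimates and standard errors and an assumed row name `(Bias)`. The column names `StdError`, `zScore`, and `pValue` are illustrative and are not the names MicrosoftML prints.

```r
# Hypothetical estimates and standard errors; "(Bias)" is an assumed row name.
stats <- data.frame(
  Estimate  = c(-1.20, 0.03, -0.85, 1.40),
  StdError  = c(0.40, 0.02, 0.20, 0.25),
  row.names = c("(Bias)", "age", "parity", "induced")
)

# z-score: ratio of the estimate to its standard error.
stats$zScore <- stats$Estimate / stats$StdError

# Two-sided p-value: 2 - 2 * F(|z|), with F the standard normal CDF (pnorm).
stats$pValue <- 2 - 2 * pnorm(abs(stats$zScore))

# Keep the bias in the first row; order the remaining rows by ascending p-value.
isBias <- rownames(stats) == "(Bias)"
rest <- stats[!isBias, ]
stats <- rbind(stats[isBias, ], rest[order(rest$pValue), ])
stats
```

For very large |z|, `2 * pnorm(-abs(z))` is the numerically safer way to compute the same quantity.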

Author(s)

Microsoft Corporation Microsoft Technical Support

See also

rxFastTrees, rxFastForest, rxFastLinear, rxOneClassSvm, rxNeuralNet, rxLogisticRegression.

Examples


 # Load MicrosoftML, which provides rxLogisticRegression
 library(MicrosoftML)

 # Estimate a logistic regression model
 logitModel <- rxLogisticRegression(isCase ~ age + parity + education + spontaneous + induced,
                   transforms = list(isCase = case == 1),
                   data = infert)
 # Print a summary of the model
 summary(logitModel)

 # Score to a data frame
 scoreDF <- rxPredict(logitModel, data = infert, 
     extraVarsToWrite = "isCase")

 # Compute and plot the Receiver Operating Characteristic (ROC) curve and AUC
 roc1 <- rxRoc(actualVarName = "isCase", predVarNames = "Probability", data = scoreDF) 
 plot(roc1)
 rxAuc(roc1)

 #######################################################################################
 # Multi-class logistic regression  
 testObs <- rnorm(nrow(iris)) > 0
 testIris <- iris[testObs,]
 trainIris <- iris[!testObs,]
 multiLogit <- rxLogisticRegression(
     formula = Species~Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
     type = "multiClass", data = trainIris)

 # Score the model
 scoreMultiDF <- rxPredict(multiLogit, data = testIris, 
     extraVarsToWrite = "Species")    
 # Print the first rows of the data frame with scores
 head(scoreMultiDF)
 # Look at confusion matrix
 table(scoreMultiDF$Species, scoreMultiDF$PredictedLabel)

 # Look at the observations with incorrect predictions
 badPrediction <- scoreMultiDF$Species != scoreMultiDF$PredictedLabel
 scoreMultiDF[badPrediction,]