MimicWrapper Class

A wrapper explainer that reduces the number of function calls necessary to use the explain model package.

Initialize the MimicWrapper with the black box model to explain and an uninitialized surrogate model, also known as the student model.

Inheritance: azureml._logging.chained_identity.ChainedIdentity

MimicWrapper

Constructor

MimicWrapper(workspace, model, explainable_model, explainer_kwargs=None, init_dataset=None, run=None, features=None, classes=None, model_task=ModelTask.Unknown, explain_subset=None, transformations=None, feature_maps=None, allow_all_transformations=None)

Parameters

Name	Description
workspace Required	Workspace The workspace object where the Models and Datasets are defined.
model Required	str or model that implements sklearn.predict() or sklearn.predict_proba() or pipeline function that accepts a 2d ndarray The model ID of a model registered to MMS or a regular machine learning model or pipeline to explain. If a model is specified, it must implement sklearn.predict() or sklearn.predict_proba(). If a pipeline is specified, it must include a function that accepts a 2d ndarray.
explainable_model Required	BaseExplainableModel The uninitialized surrogate model used to explain the black box model. Also known as the student model.
explainer_kwargs	dict Any keyword arguments that go with the chosen explainer not otherwise covered here. They will be passed in as kwargs when the underlying explainer is initialized. default value: None
init_dataset	str or array or DataFrame or csr_matrix The dataset ID or regular dataset used for initializing the explainer (e.g., x_train). default value: None
run	Run The run this explanation should be associated with. default value: None
features	list[str] A list of feature names. default value: None
classes	list[str] Class names as a list of strings. The order of the class names should match that of the model output. Only required if explaining classifier. default value: None
model_task	str Optional parameter to specify whether the model is a classification or regression model. In most cases, the type of the model can be inferred based on the shape of the output, where a classifier has a predict_proba method and outputs a 2 dimensional array, while a regressor has a predict method and outputs a 1 dimensional array. default value: ModelTask.Unknown
explain_subset	list[int] A list of feature indices. If specified, Azure only selects a subset of the features in the evaluation dataset for explanation, which speeds up the explanation process when the number of features is large and you already know the set of interesting features. The subset can be the top-k features from the model summary. This parameter is not supported when transformations are set. default value: None
transformations	ColumnTransformer or list[tuple] A sklearn.compose.ColumnTransformer or a list of tuples describing the column name and transformer. When transformations are provided, explanations are of the features before the transformation. The format for a list of transformations is the same as the one here: https://github.com/scikit-learn-contrib/sklearn-pandas. If you are using a transformation that is not in the list of sklearn.preprocessing transformations that are supported by the interpret-community package, then this parameter cannot take a list of more than one column as input for the transformation. You can use the following sklearn.preprocessing transformations with a list of columns since these are already one to many or one to one: Binarizer, KBinsDiscretizer, KernelCenterer, LabelEncoder, MaxAbsScaler, MinMaxScaler, Normalizer, OneHotEncoder, OrdinalEncoder, PowerTransformer, QuantileTransformer, RobustScaler, StandardScaler. Examples of transformations that work: `[(["col1", "col2"], sklearn_one_hot_encoder), (["col3"], None)]` (where col3 passes through as is) and `[(["col1"], my_own_transformer), (["col2"], my_own_transformer)]`. An example of a transformation that would raise an error since it cannot be interpreted as one to many: `[(["col1", "col2"], my_own_transformer)]`. The last example would not work since the interpret-community package can't determine whether my_own_transformer gives a many to many or one to many mapping when taking a sequence of columns. Only one of 'transformations' or 'feature_maps' should be specified to generate raw explanations. Specifying both will result in a configuration exception. default value: None
feature_maps	list[array] or list[csr_matrix] A list of feature maps from raw to generated features. This parameter can be a list of numpy arrays or sparse matrices where each array entry (raw_index, generated_index) is the weight for each raw, generated feature pair. The other entries are set to zero. For a sequence of transformations [t1, t2, ..., tn] generating generated features from raw features, the list of feature maps corresponds to the raw to generated maps in the same order as t1, t2, etc. If the overall raw to generated feature map from t1 to tn is available, then just that feature map in a single element list can be passed. Only one of 'transformations' or 'feature_maps' should be specified to generate raw explanations. Specifying both will result in a configuration exception. default value: None
allow_all_transformations	bool Whether to allow many to many and many to one transformations. default value: None
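To make the `feature_maps` description concrete, here is a minimal sketch of a single raw-to-generated feature map for a hypothetical dataset with two raw features (`age` and `color`), where `color` is one-hot encoded into three generated columns. The feature names, the `build_one_hot_feature_map` and `aggregate_to_raw` helpers, and the weighted-sum aggregation are illustrative assumptions, not part of the azureml API. Nested lists are used so the sketch runs without dependencies; in practice the map would be a numpy array or sparse matrix of the same shape.

```python
def build_one_hot_feature_map(generated_counts):
    """Build a raw-to-generated map: rows are raw features, columns are generated features.

    generated_counts[i] is the number of generated columns produced from raw feature i.
    Entry (raw_index, generated_index) is 1.0 when the generated column was derived
    from that raw feature; all other entries are 0.0.
    """
    n_generated = sum(generated_counts)
    feature_map = []
    start = 0
    for count in generated_counts:
        row = [0.0] * n_generated
        for j in range(start, start + count):
            row[j] = 1.0
        feature_map.append(row)
        start += count
    return feature_map


def aggregate_to_raw(feature_map, generated_importances):
    """Illustrative only: sum generated-feature importances back onto raw features."""
    return [
        sum(w * imp for w, imp in zip(row, generated_importances))
        for row in feature_map
    ]


# Hypothetical layout: "age" stays one column, "color" expands to three one-hot columns.
raw_to_generated = build_one_hot_feature_map([1, 3])
generated_importances = [0.5, 0.25, 0.125, 0.125]
raw_importances = aggregate_to_raw(raw_to_generated, generated_importances)
print(raw_to_generated)
print(raw_importances)
```

A map like this, converted to an array, would be passed as a single-element list (`feature_maps=[...]`), with `transformations` left unset.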

Remarks

The MimicWrapper can be used for explaining machine learning models, and is particularly effective in conjunction with AutoML. For example, using the automl_setup_model_explanations function in the azureml.train.automl.runtime.automl_explain_utilities module, you can use the MimicWrapper to compute and visualize feature importance. For more information, see Interpretability: model explanations in automated machine learning.

In the following example, the MimicWrapper is used in a classification problem.


   from azureml.interpret.mimic_wrapper import MimicWrapper

   # ws, automl_run, and automl_explainer_setup_obj (the result of
   # automl_setup_model_explanations) are assumed to be defined earlier.
   explainer = MimicWrapper(ws, automl_explainer_setup_obj.automl_estimator,
                            explainable_model=automl_explainer_setup_obj.surrogate_model,
                            init_dataset=automl_explainer_setup_obj.X_transform, run=automl_run,
                            features=automl_explainer_setup_obj.engineered_feature_names,
                            feature_maps=[automl_explainer_setup_obj.feature_map],
                            classes=automl_explainer_setup_obj.classes,
                            explainer_kwargs=automl_explainer_setup_obj.surrogate_model_params)

For more information about this example, see this notebook.
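Once the wrapper is constructed, a typical next step is a call to `explain`. The sketch below wraps that call in a small helper so the argument choices are explicit. The helper name `explain_engineered_and_raw` and its parameter names are hypothetical, and the helper assumes the wrapper was constructed with `feature_maps` so that `get_raw=True` has an effect. Passing `upload=False` skips the upload to Run History.

```python
def explain_engineered_and_raw(explainer, eval_dataset, raw_feature_names):
    """Request global explanations for engineered and raw features.

    `explainer` is expected to be a MimicWrapper; any object exposing the same
    `explain` signature works, which keeps this helper easy to test.
    """
    # Explanation for the data exactly as it is passed in (engineered features).
    engineered = explainer.explain(
        ['global'],
        eval_dataset=eval_dataset,
        upload=False,
    )
    # Explanation mapped back to raw features; only meaningful when
    # feature_maps was supplied to the MimicWrapper constructor.
    raw = explainer.explain(
        ['global'],
        eval_dataset=eval_dataset,
        upload=False,
        get_raw=True,
        raw_feature_names=raw_feature_names,
    )
    return engineered, raw
```

Both calls return an explanation object, so the two results can be compared side by side.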

Methods

explain

Explain a model's behavior and optionally upload that explanation for storage and visualization.

explain(explanation_types, eval_dataset=None, top_k=None, upload=True, upload_datasets=False, tag='', get_raw=False, raw_feature_names=None, experiment_name='explain_model', raw_eval_dataset=None, true_ys=None)

Parameters

Name	Description
explanation_types Required	list[str] A list of strings representing types of explanations desired. Currently, 'global' and 'local' are supported. Both may be passed in at once; only one explanation will be returned.
eval_dataset	str or array or DataFrame or csr_matrix The dataset ID or regular dataset used to generate the explanation. default value: None
top_k	int Limit to the amount of data returned and stored in Run History to top k features, when possible. default value: None
upload	bool If True, the explanation is automatically uploaded to Run History for storage and visualization. If a run was not passed in at initialization, one is created. default value: True
upload_datasets	bool If set to True and no dataset IDs are passed in, the evaluation dataset will be uploaded to Azure storage. This will improve the visualization available in the web view. default value: False
tag	str A string to attach to the explanation to distinguish it from others after upload. default value: ''
get_raw	bool If True and the parameter `feature_maps` was passed in during initialization, the explanation returned will be for the raw features. If False or not specified, the explanation will be for the data exactly as it is passed in. default value: False
raw_feature_names	list[str] The list of raw feature names, replacing engineered feature names from the constructor. default value: None
experiment_name	str The desired name to give an explanation if `upload` is True but no run was passed in during initialization. default value: explain_model
raw_eval_dataset	str or array or DataFrame or csr_matrix Raw eval data to be uploaded for raw explanations. default value: None
true_ys	list or DataFrame or ndarray The true labels for the evaluation examples. default value: None
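To illustrate what `top_k` limits, the sketch below keeps the k features with the largest absolute importance. This illustrates the idea only: `top_k_features` is a hypothetical helper, the importance values are made up, and ranking by absolute value is an assumption rather than a statement about how the library ranks features.

```python
def top_k_features(importances, k=None):
    """Return (name, importance) pairs for the k most important features.

    importances: dict mapping feature name to importance value.
    k: maximum number of features to keep; None keeps all of them.
    """
    ranked = sorted(importances.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked if k is None else ranked[:k]


# Hypothetical importance values for four features.
example = {'age': 0.40, 'income': -0.55, 'tenure': 0.05, 'region': 0.10}
print(top_k_features(example, k=2))
```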

Returns

Type	Description
Explanation	An explanation object.

Attributes

explainer

Get the explainer that is being used internally by the wrapper.

Returns

Type	Description
MimicExplainer	The explainer that is being used internally by the wrapper.
