Indicator Class
Create a new column indicating if the input has missing values.
- Inheritance
nimbusml.internal.core.preprocessing.missing_values._indicator.Indicator -> Indicator
nimbusml.base_transform.BaseTransform -> Indicator
sklearn.base.TransformerMixin -> Indicator
Constructor
Indicator(columns=None, **params)
Parameters
- columns
A dictionary of key-value pairs, where each key is an output column name and each value is the corresponding input column name.
Multiple key-value pairs are allowed.
Input column type:
Output column type:
If the output column names are the same as the input column names, simply specify columns as a list of strings.
The << operator can be used to set this value (see Column Operator).
For example:
Indicator(columns={'out1':'input1', 'out2':'input2'})
Indicator() << {'out1':'input1', 'out2':'input2'}
For more details see Columns.
- params
Additional arguments sent to compute engine.
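The two accepted forms of columns can be pictured with a small plain-Python sketch. The helper normalize_columns below is hypothetical and is not part of nimbusml; it only illustrates, under the description above, how a list of names reads as a mapping in which each output name equals its input name.

```python
def normalize_columns(columns):
    """Hypothetical helper: return an {output_name: input_name} dict.

    Not a nimbusml function; it only mirrors the documented meaning
    of the two forms of the `columns` argument.
    """
    if isinstance(columns, dict):
        # Dictionary form: keys are output names, values are input names.
        return dict(columns)
    # List form: each output column has the same name as its input column.
    return {name: name for name in columns}


# Dictionary form, as in Indicator(columns={'out1': 'input1', 'out2': 'input2'})
print(normalize_columns({'out1': 'input1', 'out2': 'input2'}))

# List form, for output names identical to the input names
print(normalize_columns(['input1', 'input2']))
```

In the list form the output name matches the input name, which is why the dictionary form is the one to use when the original column should be kept alongside its indicator.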
Examples
###############################################################################
# Indicator
import numpy as np
import pandas as pd
from nimbusml import FileDataStream
from nimbusml.preprocessing.missing_values import Indicator
with_nans = pd.DataFrame(
    data=dict(
        Sepal_Length=[2.5, np.nan, 2.1, 1.0],
        Sepal_Width=[.75, .9, .8, .76],
        Petal_Length=[np.nan, 2.5, 2.6, 2.4],
        Petal_Width=[.8, .7, .9, 0.7],
        Species=["setosa", "viginica", "", "versicolor"]))
# write NaNs to a file to show how this transform works
tmpfile = 'tmpfile_with_nans.csv'
with_nans.to_csv(tmpfile, index=False)
data = FileDataStream.read_csv(tmpfile, sep=',', numeric_dtype=np.float32)
# transform usage
xf = Indicator(columns={'PL': 'Petal_Length', 'SL': 'Sepal_Length'})
# fit and transform
features = xf.fit_transform(data)
# print features
print(features.head())
#       PL  Petal_Length  Petal_Width     SL  ...  Sepal_Width     Species
# 0   True           NaN          0.8  False  ...         0.75      setosa
# 1  False           2.5          0.7   True  ...         0.90    viginica
# 2  False           2.6          0.9  False  ...         0.80        None
# 3  False           2.4          0.7  False  ...         0.76  versicolor
Remarks
Indicator creates a new column of indicator values for each specified input column. Each value is "True" if the corresponding row of the input column has a missing value and "False" otherwise.
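The behavior described in these remarks can be approximated without the compute engine. The following is a minimal plain-Python sketch, not the nimbusml implementation: the functions is_missing and add_indicators are hypothetical names, and the sketch assumes that only None and floating-point NaN count as missing.

```python
import math


def is_missing(value):
    # Assumption for this sketch: only None and float NaN count as missing.
    return value is None or (isinstance(value, float) and math.isnan(value))


def add_indicators(table, columns):
    """Return a copy of `table` (dict of column name -> list of values)
    with one boolean indicator column added per {output: input} pair."""
    result = {name: list(values) for name, values in table.items()}
    for out_name, in_name in columns.items():
        result[out_name] = [is_missing(v) for v in table[in_name]]
    return result


nan = float('nan')
table = {
    'Sepal_Length': [2.5, nan, 2.1, 1.0],
    'Petal_Length': [nan, 2.5, 2.6, 2.4],
}
features = add_indicators(table, {'PL': 'Petal_Length', 'SL': 'Sepal_Length'})
print(features['PL'])  # [True, False, False, False]
print(features['SL'])  # [False, True, False, False]
```

The input columns are left unchanged and the indicator columns are added next to them, which matches the PL and SL columns in the example output on this page.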
Methods
get_params | Get the parameters for this operator.
get_params
Get the parameters for this operator.
get_params(deep=False)
Parameters
- deep