ForecastingPipelineWrapperBase Class

Base class for forecast model wrappers.

Inheritance
ForecastingPipelineWrapperBase

Constructor

ForecastingPipelineWrapperBase(ts_transformer: TimeSeriesTransformer | None = None, y_transformer: Pipeline | None = None, metadata: Dict[str, Any] | None = None)

Parameters

ts_transformer
default value: None
y_transformer
default value: None
metadata
default value: None

Methods

align_output_to_input

Align the transformed output data frame to the input data frame.

Note: transformed is modified by reference; no copy is created. X_input is the input data frame and transformed is the data frame after transformation. Returns the transformed data frame with its original index, but sorted as in X_input.

fit

Fit the model with input X and y.

forecast

Do the forecast on the data frame X_pred.

forecast_quantiles

Get the prediction and quantiles from the fitted pipeline.

is_grain_dropped

Return true if the grain is going to be dropped.

preaggregate_data_set

Aggregate the prediction data set.

Note: This method does not guarantee that the data set will be aggregated. Aggregation happens only if the data set contains duplicated time stamps or out-of-grid dates. df is the data set to be aggregated, y holds the target values, and is_training_set indicates whether the data represent a training set. Returns the aggregated data set, or the intact data set if no aggregation is required.

preprocess_pred_X_y

Preprocess prediction X and y.

rolling_evaluation

Produce forecasts on a rolling origin over the given test set.

Each iteration makes a forecast for the next 'max_horizon' periods with respect to the current origin, then advances the origin by the horizon time duration. The prediction context for each forecast is set so that the forecaster uses the actual target values prior to the current origin time for constructing lag features.

This function returns a concatenated DataFrame of rolling forecasts joined with the actuals from the test set.

This method is deprecated and will be removed in a future release. Please use rolling_forecast() instead.

rolling_forecast

Produce forecasts on a rolling origin over a test set.

Each iteration makes a forecast of maximum horizon periods ahead using information up to the current origin, then advances the origin by 'step' time periods. The prediction context for each forecast is set so that the forecaster uses the actual target values prior to the current origin time for constructing lookback features.

This function returns a DataFrame of rolling forecasts joined with the actuals from the test set. The columns in the returned data frame are as follows:

  • Timeseries ID columns (Optional). When supplied by the user, the given column names will be used.

  • Forecast origin column giving the origin time for each row.

    Column name: stored as the object member variable forecast_origin_column_name.

  • Time column. The column name given by the user will be used.

  • Forecast values column. Column name: stored as the object member forecast_column_name.

  • Actual values column. Column name: stored as the object member actual_column_name.

short_grain_handling

Return true if short or absent grains handling is enabled for the model.

static_preaggregate_data_set

Aggregate the prediction data set.

Note: This method does not guarantee that the data set will be aggregated. Aggregation happens only if the data set contains duplicated time stamps or out-of-grid dates. ts_transformer is the time series transformer used for training, time_column_name is the name of the time column, grain_column_names is the list of grain column names, df is the data set to be aggregated, y holds the target values, and is_training_set indicates whether the data represent a training set. Returns the aggregated data set, or the intact data set if no aggregation is required.

align_output_to_input

Align the transformed output data frame to the input data frame.

Note: transformed is modified by reference; no copy is created.

align_output_to_input(X_input: DataFrame, transformed: DataFrame) -> DataFrame

Parameters

X_input
Required

The input data frame.

transformed
Required

The data frame after transformation.

Returns

The transformed data frame with its original index, but sorted as in X_input.
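
The alignment described above can be illustrated with plain pandas. The following is a minimal sketch, not the library's implementation: the helper name align_like_input, the key_columns argument, and the column names are hypothetical, and it assumes that the transformed frame carries the identifying columns as index levels and that each key appears once in X_input. Unlike the method documented here, the sketch returns a reordered frame instead of modifying transformed by reference.

```python
import pandas as pd


def align_like_input(X_input, transformed, key_columns):
    """Reorder `transformed` so its rows follow the row order of `X_input`."""
    # Remember each input row's position, keyed by the identifying columns.
    order = X_input[key_columns].copy()
    order["_pos"] = range(len(order))
    # Look up that position for every transformed row through its index levels.
    keys = transformed.index.to_frame(index=False)[key_columns]
    pos = keys.merge(order, on=key_columns, how="left")["_pos"].to_numpy()
    # Sorting by position reorders the rows; each row keeps its index labels.
    return transformed.iloc[pos.argsort(kind="stable")]


# Hypothetical input: rows arrive in arbitrary order.
X_input = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-02", "2021-01-01", "2021-01-01", "2021-01-02"]),
    "store": ["b", "a", "b", "a"],
})

# Hypothetical transformed output: sorted by store, then by date.
index = pd.MultiIndex.from_arrays(
    [
        pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-01", "2021-01-02"]),
        ["a", "a", "b", "b"],
    ],
    names=["date", "store"],
)
transformed = pd.DataFrame({"feat": [10, 11, 20, 21]}, index=index)

aligned = align_like_input(X_input, transformed, ["date", "store"])
print(aligned)
```

The stable sort on the looked-up positions is one simple way to restore the input order while leaving the index of the transformed frame untouched.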

fit

Fit the model with input X and y.

fit(X: DataFrame, y: ndarray) -> ForecastingPipelineWrapperBase

Parameters

X
Required

Input X data.

y
Required

Input y data.

forecast

Do the forecast on the data frame X_pred.

forecast(X_pred: DataFrame | None = None, y_pred: ndarray | DataFrame | None = None, forecast_destination: Timestamp | None = None, ignore_data_errors: bool = False) -> Tuple[ndarray, DataFrame]

Parameters

X_pred
default value: None

The prediction data frame, combining X_past and X_future in a time-contiguous manner. Empty values in X_pred will be imputed.

y_pred
default value: None

The target values, combining known values for y_past and missing values for y_future. If None, predictions are made for every row of X_pred.

forecast_destination
pandas.Timestamp
default value: None

A time-stamp value. Forecasts are made all the way to the forecast_destination time, for all grains. Dictionary input { grain -> timestamp } is not accepted. If forecast_destination is not given, it is imputed as the last time occurring in X_pred for every grain.

ignore_data_errors
bool
default value: False

Ignore errors in user data.

Returns

Y_pred, with the subframe corresponding to Y_future filled in with the respective forecasts. Any missing values in Y_past will be filled by the imputer.

Return type

Tuple[ndarray, DataFrame]
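
The shape of the inputs described above can be sketched with pandas and NumPy alone. The column names, values, and variable names below are hypothetical. The sketch only prepares X_pred and y_pred and reproduces the stated default rule for forecast_destination; it does not construct or call a fitted wrapper.

```python
import numpy as np
import pandas as pd

# Hypothetical prediction frame: three past rows followed by two future rows,
# contiguous in time. The missing price is the kind of empty value that the
# text above says will be imputed.
times = pd.date_range("2021-01-01", periods=5, freq="D")
X_pred = pd.DataFrame({
    "date": times,
    "store": "a",
    "price": [1.0, 1.1, np.nan, 1.2, 1.3],
})

# Known targets for the past rows (y_past); NaN marks the rows to forecast (y_future).
y_pred = np.array([10.0, 12.0, 11.0, np.nan, np.nan])

# Rows whose target is missing are the ones a forecast would fill in.
future_mask = np.isnan(y_pred)

# Without an explicit forecast_destination, the last time occurring in X_pred
# is used for every grain; this reproduces that rule with a group-wise maximum.
default_destination = X_pred.groupby("store")["date"].max()

# With a fitted wrapper (assumed to exist; not built here), the call would
# follow the signature above and return a tuple of an array and a data frame:
# y_hat, frame = fitted_model.forecast(X_pred, y_pred)
```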

forecast_quantiles

Get the prediction and quantiles from the fitted pipeline.

forecast_quantiles(X_pred: DataFrame | None = None, y_pred: ndarray | DataFrame | None = None, quantiles: float | List[float] | None = None, forecast_destination: Timestamp | None = None, ignore_data_errors: bool = False) -> DataFrame

Parameters

X_pred
default value: None

The prediction data frame, combining X_past and X_future in a time-contiguous manner. Empty values in X_pred will be imputed.

y_pred
default value: None

The target values, combining known values for y_past and missing values for y_future. If None, predictions are made for every row of X_pred.

quantiles
float or list of floats
default value: None

The quantile or list of quantiles at which to forecast.

forecast_destination
pandas.Timestamp
default value: None

A time-stamp value. Forecasts are made all the way to the forecast_destination time, for all grains. Dictionary input { grain -> timestamp } is not accepted. If forecast_destination is not given, it is imputed as the last time occurring in X_pred for every grain.

ignore_data_errors
bool
default value: False

Ignore errors in user data.

Returns

A dataframe containing the columns and predictions made at requested quantiles.

is_grain_dropped

Return true if the grain is going to be dropped.

is_grain_dropped(grain: Tuple[str] | str | List[str]) -> bool

Parameters

grain
Required

The grain to test if it will be dropped.

Returns

True if the grain will be dropped.

preaggregate_data_set

Aggregate the prediction data set.

Note: This method does not guarantee that the data set will be aggregated. Aggregation happens only if the data set contains duplicated time stamps or out-of-grid dates.

preaggregate_data_set(df: DataFrame, y: ndarray | None = None, is_training_set: bool = False) -> Tuple[DataFrame, ndarray | None]

Parameters

df
Required

The data set to be aggregated.

y
default value: None

The target values.

is_training_set
default value: False

If true, the data represent a training set.

Returns

The aggregated data set, or the intact data set if no aggregation is required.
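
The two conditions that trigger aggregation, duplicated time stamps and out-of-grid dates, can be illustrated with plain pandas. This is a sketch under stated assumptions rather than the library's logic: the helper names, the column names, the daily grid anchored at midnight, and the choice of sum as the aggregation function are all hypothetical.

```python
import pandas as pd


def needs_aggregation(df, time_col, grain_col, freq):
    """Return True if any series has duplicated time stamps or off-grid dates."""
    # Duplicate (grain, time) pairs mean several rows compete for one period.
    has_duplicates = df.duplicated(subset=[grain_col, time_col]).any()
    # A time stamp is on the grid if flooring it to the frequency leaves it unchanged.
    off_grid = (df[time_col].dt.floor(freq) != df[time_col]).any()
    return bool(has_duplicates or off_grid)


def aggregate_to_grid(df, time_col, grain_col, value_col, freq):
    """Snap time stamps to the grid and sum the values that share a period."""
    snapped = df.assign(**{time_col: df[time_col].dt.floor(freq)})
    return snapped.groupby([grain_col, time_col], as_index=False)[value_col].sum()


# Hypothetical data: one duplicated day and one time stamp off the daily grid.
raw = pd.DataFrame({
    "date": pd.to_datetime([
        "2021-01-01 00:00", "2021-01-01 00:00",
        "2021-01-02 06:00", "2021-01-03 00:00",
    ]),
    "store": "a",
    "sales": [1.0, 2.0, 3.0, 4.0],
})

if needs_aggregation(raw, "date", "store", "D"):
    aggregated = aggregate_to_grid(raw, "date", "store", "sales", "D")
else:
    aggregated = raw  # returned intact when no aggregation is required
print(aggregated)
```

As in the note above, a data set that already sits on the grid without duplicates passes through unchanged.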

preprocess_pred_X_y

Preprocess prediction X and y.

preprocess_pred_X_y(X_pred: DataFrame | None = None, y_pred: ndarray | DataFrame | None = None, forecast_destination: Timestamp | None = None) -> Tuple[DataFrame, DataFrame | ndarray, Dict[str, Any]]

Parameters

X_pred
default value: None
y_pred
default value: None
forecast_destination
default value: None

rolling_evaluation

Produce forecasts on a rolling origin over the given test set.

Each iteration makes a forecast for the next 'max_horizon' periods with respect to the current origin, then advances the origin by the horizon time duration. The prediction context for each forecast is set so that the forecaster uses the actual target values prior to the current origin time for constructing lag features.

This function returns a concatenated DataFrame of rolling forecasts joined with the actuals from the test set.

This method is deprecated and will be removed in a future release. Please use rolling_forecast() instead.

rolling_evaluation(X_pred: DataFrame, y_pred: DataFrame | ndarray, ignore_data_errors: bool = False) -> Tuple[ndarray, DataFrame]

Parameters

X_pred
Required

The prediction data frame, combining X_past and X_future in a time-contiguous manner. Empty values in X_pred will be imputed.

y_pred
Required

The target values corresponding to X_pred.

ignore_data_errors
default value: False

Ignore errors in user data.

Returns

Y_pred, with the subframe corresponding to Y_future filled in with the respective forecasts. Any missing values in Y_past will be filled by the imputer.

Return type

Tuple[ndarray, DataFrame]

rolling_forecast

Produce forecasts on a rolling origin over a test set.

Each iteration makes a forecast of maximum horizon periods ahead using information up to the current origin, then advances the origin by 'step' time periods. The prediction context for each forecast is set so that the forecaster uses the actual target values prior to the current origin time for constructing lookback features.

This function returns a DataFrame of rolling forecasts joined with the actuals from the test set. The columns in the returned data frame are as follows:

  • Timeseries ID columns (Optional). When supplied by the user, the given column names will be used.

  • Forecast origin column giving the origin time for each row.

    Column name: stored as the object member variable forecast_origin_column_name.

  • Time column. The column name given by the user will be used.

  • Forecast values column. Column name: stored as the object member forecast_column_name.

  • Actual values column. Column name: stored as the object member actual_column_name.

rolling_forecast(X_pred: DataFrame, y_pred: ndarray, step: int = 1, ignore_data_errors: bool = False) -> DataFrame

Parameters

X_pred
pandas.DataFrame
Required

Prediction data frame.

y_pred
numpy.ndarray
Required

Target values corresponding to rows in X_pred.

step
int
default value: 1

Number of periods to advance the forecasting window in each iteration.

ignore_data_errors
bool
default value: False

Ignore errors in user data.

Returns

Data frame of rolling forecasts.

Return type

pandas.DataFrame
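
The rolling-origin procedure described above can be sketched for a single series with a naive last-value forecaster standing in for the fitted model. Everything in this example is an assumption made for illustration: the function name, the n_train argument, the output column names, and the convention that the origin is the last time stamp with a known actual. It shows the mechanics of advancing the origin by step, not the library's implementation.

```python
import pandas as pd


def rolling_naive_forecast(times, actuals, n_train, max_horizon, step=1):
    """Roll the forecast origin through the test part of one series."""
    rows = []
    origin = n_train  # position of the first time stamp to forecast
    while origin < len(times):
        # Only actuals strictly before the current origin are visible.
        last_known = actuals[origin - 1]
        for h in range(max_horizon):
            if origin + h >= len(times):
                break  # the horizon runs past the end of the test set
            rows.append({
                "origin": times[origin - 1],
                "time": times[origin + h],
                "forecast": last_known,  # naive forecast: repeat the last actual
                "actual": actuals[origin + h],
            })
        origin += step  # advance the forecasting window
    return pd.DataFrame(rows)


# Hypothetical series: three training periods followed by three test periods.
times = pd.date_range("2021-01-01", periods=6, freq="D")
actuals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

result = rolling_naive_forecast(times, actuals, n_train=3, max_horizon=2, step=1)
print(result)
```

Setting step equal to max_horizon in this sketch advances the origin by the full horizon in each iteration, which is the behavior the text attributes to the deprecated rolling_evaluation.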

short_grain_handling

Return true if short or absent grains handling is enabled for the model.

short_grain_handling() -> bool

static_preaggregate_data_set

Aggregate the prediction data set.

Note: This method does not guarantee that the data set will be aggregated. Aggregation happens only if the data set contains duplicated time stamps or out-of-grid dates.

static static_preaggregate_data_set(ts_transformer: TimeSeriesTransformer, time_column_name: str, grain_column_names: List[str], df: DataFrame, y: ndarray | None = None, is_training_set: bool = False) -> Tuple[DataFrame, ndarray | None]

Parameters

ts_transformer
Required

The time series transformer used for training.

time_column_name
Required

Name of the time column.

grain_column_names
Required

List of grain column names.

df
Required

The data set to be aggregated.

y
default value: None

The target values.

is_training_set
default value: False

If true, the data represent a training set.

Returns

The aggregated data set, or the intact data set if no aggregation is required.

Attributes

actual_column_name

forecast_column_name

forecast_origin_column_name

grain_column_list

max_horizon

Return the maximum horizon used in the model.

origin_col_name

Return the origin column name.

target_lags

Return target lags if any.

target_rolling_window_size

Return the size of rolling window.

time_column_name

Return the name of the time column.

user_target_column_name

y_max_dict

Return the dictionary with maximal target values by time series ID.

y_min_dict

Return the dictionary with minimal target values by time series ID.

FATAL_NO_TARGET_IMPUTER

FATAL_NO_TARGET_IMPUTER = 'No target imputers were found in TimeSeriesTransformer.'

FATAL_NO_TS_TRANSFORM

FATAL_NO_TS_TRANSFORM = 'The time series transform is absent. Please try training model again.'