ForecastingParameters Class

Reference

Manage parameters used by forecasting tasks.

Inheritance: builtins.object

ForecastingParameters

Constructor

ForecastingParameters(time_column_name: str | None = None, forecast_horizon: str | int = 1, time_series_id_column_names: str | List[str] | None = None, group_column_names: str | List[str] | None = None, target_lags: List[int] | int | str | None = None, feature_lags: str | None = None, target_rolling_window_size: str | int | None = None, holiday_country: str | None = None, seasonality: str | int | None = 'auto', country_or_region_for_holidays: str | None = None, use_stl: str | None = None, short_series_handling: bool = True, short_series_handling_configuration: str | None = 'auto', freq: str | None = None, target_aggregation_function: str | None = None, cv_step_size: str | int | None = 'auto', features_unknown_at_forecast_time: str | List[str] | None = None, validate_parameters: bool = True, _enable_future_regressors: bool = False, **kwargs: Any)

Parameters

Name	Description
time_column_name	str The name of the time column. This parameter is required when forecasting to specify the datetime column in the input data used for building the time series and inferring its frequency. Default value: None
forecast_horizon	int or str The desired maximum forecast horizon in units of time-series frequency. Units are based on the time interval of your training data (for example, monthly or weekly) over which the forecaster should predict. When the task type is forecasting, this parameter is required. For more information on setting forecasting parameters, see Auto-train a time-series forecast model. Default value: 1
time_series_id_column_names	str or list(str) The names of columns used to group a time series. It can be used to create multiple series. If time_series_id_column_names is not defined, or the identifier columns specified do not identify all the series in the dataset, the time series identifiers will be automatically created for your dataset. Default value: None
group_column_names	str or list(str) Default value: None
target_lags	int, str or list(int) The number of past periods to lag from the target column. By default the lags are turned off. When forecasting, this parameter represents the number of rows to lag the target values based on the frequency of the data. This is represented as a list or single integer. Lag should be used when the relationship between the independent variables and the dependent variable does not match up or correlate by default. For example, when trying to forecast demand for a product, the demand in any month may depend on the price of specific commodities 3 months prior. In this example, you may want to lag the target (demand) negatively by 3 months so that the model is training on the correct relationship. For more information, see Auto-train a time-series forecast model.
Note on auto detection of target lags and rolling window size (see also the target_rolling_window_size description). The following algorithm is used to detect the optimal target lag and rolling window size:
1. Estimate the maximum lag order for the look-back feature selection. This is the number of periods until the next date frequency granularity: if the frequency is daily, it is a week (7); if it is weekly, it is a month (4). That value multiplied by two is the largest possible value of lags/rolling windows. In these examples, the maximum lag order is 14 and 8 respectively.
2. Create a de-seasonalized series by adding trend and residual components. This will be used in the next step.
3. Estimate the PACF (Partial Auto Correlation Function) on the data from step 2 and search for points where the auto correlation is significant, i.e. its absolute value is more than 1.96/square_root(maximal lag value), which corresponds to a significance of 95%.
4. If all points are significant, this is considered strong seasonality and look-back features are not created.
5. Scan the PACF values from the beginning; the value before the first insignificant auto correlation designates the lag. If the first significant element (the value correlated with itself) is followed by an insignificant one, the lag is 0 and look-back features are not used.
Default value: None
feature_lags	str or None Flag for generating lags for the numeric features with 'auto' or None. Default value: None
target_rolling_window_size	int, str or None The number of past periods used to create a rolling window average of the target column. When forecasting, this parameter represents n historical periods to use to generate forecasted values, <= training set size. If omitted, n is the full training set size. Specify this parameter when you only want to consider a certain amount of history when training the model. If set to 'auto', the rolling window will be estimated as the last value where the PACF is more than the significance threshold. Please see the target_lags section for details. Default value: None
holiday_country	str or None Deprecated name for country_or_region_for_holidays (see DEPRECATED_DICT). Default value: None
seasonality	int, str or None Set time series seasonality as an integer multiple of the series frequency. If seasonality is set to 'auto', it will be inferred. If set to None, the time series is assumed non-seasonal which is equivalent to seasonality=1. Default value: auto
country_or_region_for_holidays	str or None The country/region used to generate holiday features. These should be ISO 3166 two-letter country/region codes, for example 'US' or 'GB'. Default value: None
use_stl	str or None Configure STL Decomposition of the time-series target column. use_stl can take three values: None (default) - no STL decomposition, 'season' - only generate the season component, and 'season_trend' - generate both season and trend components. Default value: None
short_series_handling	bool Configure short series handling for forecasting tasks. Default value: True
short_series_handling_configuration	str or None The parameter defining how AutoML should handle short time series. Possible values: 'auto' (default), 'pad', 'drop' and None.
auto: short series will be padded if there are no long series, otherwise short series will be dropped.
pad: all the short series will be padded.
drop: all the short series will be dropped.
None: the short series will not be modified.
If set to 'pad', the table will be padded with zeroes and empty values for the regressors and random values for the target, with the mean equal to the target value median for the given time series id. If the median is greater than or equal to zero, the minimal padded value will be clipped at zero.
Input:
Date	numeric_value	string	target
2020-01-01	23	green	55
Output, assuming the minimal number of values is four:
Date	numeric_value	string	target
2019-12-29	0	NA	55.1
2019-12-30	0	NA	55.6
2019-12-31	0	NA	54.5
2020-01-01	23	green	55
Note: There are two parameters, short_series_handling_configuration and the legacy short_series_handling. When both parameters are set, they are synchronized as shown in the table below (for brevity, short_series_handling_configuration and short_series_handling are marked as handling_configuration and handling respectively).
handling	handling_configuration	resulting handling	resulting handling_configuration
True	auto	True	auto
True	pad	True	auto
True	drop	True	auto
True	None	False	None
False	auto	False	None
False	pad	False	None
False	drop	False	None
False	None	False	None
Default value: auto
freq	str or None Forecast frequency. When forecasting, this parameter represents the period with which the forecast is desired, for example daily, weekly, yearly, etc. The forecast frequency is dataset frequency by default. You can optionally set it to greater (but not lesser) than dataset frequency. We'll aggregate the data and generate the results at forecast frequency. For example, for daily data, you can set the frequency to be daily, weekly or monthly, but not hourly. The frequency needs to be a pandas offset alias. Please refer to pandas documentation for more information: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#dateoffset-objects Default value: None
target_aggregation_function	str or None The function to be used to aggregate the time series target column to conform to a user-specified frequency. If target_aggregation_function is set but the freq parameter is not set, an error is raised. The possible target aggregation functions are: "sum", "max", "min" and "mean". The target column values are aggregated based on the specified operation. Typically, sum is appropriate for most scenarios. Numerical predictor columns in your data are aggregated by sum, mean, minimum value, and maximum value. As a result, automated ML generates new columns suffixed with the aggregation function name and applies the selected aggregate operation. For categorical predictor columns, the data is aggregated by mode, the most prominent category in the window. Date predictor columns are aggregated by minimum value, maximum value and mode.
freq	target_aggregation_function	Data regularity fixing mechanism
None (Default)	None (Default)	The aggregation is not applied. If the valid frequency cannot be determined, the error will be raised.
Some Value	None (Default)	The aggregation is not applied. If the number of data points compliant to the given frequency grid is less than 90%, these points will be removed, otherwise the error will be raised.
None (Default)	Aggregation function	The error about the missing frequency parameter is raised.
Some Value	Aggregation function	Aggregate to frequency using the provided aggregation function.
Default value: None
cv_step_size	str, int or None Number of periods between the origin_time of one CV fold and the next fold. For example, if cv_step_size = 3 for daily data, the origin time for each fold will be three days apart. Default value: auto
validate_parameters	bool Configure to validate input parameters. Default value: True
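features_unknown_at_forecast_time	str or list(str) The column name(s) of features that are available for training but unknown at forecast/inference time. If this is not defined, it is assumed that all the feature columns are known at forecast time. Default value: None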
_enable_future_regressors	Default value: False
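
As a rough illustration of how the constructor parameters above fit together, the sketch below builds a set of keyword arguments for a hypothetical daily dataset and passes them to the constructor. The column names "date" and "store" are invented for the example, and the import path is an assumption that may differ between SDK versions; the import is guarded so the snippet also runs where the SDK is not installed.

```python
# Settings for a hypothetical daily dataset. The column names "date" and
# "store" are illustrative only and are not part of the API.
forecasting_kwargs = {
    "time_column_name": "date",
    "forecast_horizon": 14,                    # 14 periods of the data frequency
    "time_series_id_column_names": ["store"],  # one series per store
    "target_lags": "auto",
    "target_rolling_window_size": "auto",
}

try:
    # Assumed import path; adjust it to match the installed SDK.
    from azureml.automl.core.forecasting_parameters import ForecastingParameters
except ImportError:
    ForecastingParameters = None  # SDK not available; only the dict is built.

if ForecastingParameters is not None:
    forecasting_parameters = ForecastingParameters(**forecasting_kwargs)
    print(forecasting_parameters.forecast_horizon)
```

Keeping the settings in a plain dict makes it easy to log or reuse them before the object is created.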

Methods

from_parameters_dict	Construct ForecastingParameters class from a dict.
validate_parameters	Validate the parameters in the ForecastingParameters class.

from_parameters_dict

Construct ForecastingParameters class from a dict.

static from_parameters_dict(parameter_dict: Dict[str, Any], validate_params: bool, show_deprecate_warnings: bool | None = True) -> ForecastingParameters

Parameters

Name	Description
parameter_dict Required	A dict containing all the forecasting parameters.
validate_params Required	Whether to validate the input parameters.
show_deprecate_warnings	Switch to show deprecated parameters warning. Default value: True

validate_parameters

Validate the parameters in the ForecastingParameters class.

validate_parameters()

Attributes

country_or_region_for_holidays

The country/region used to generate holiday features. These should be ISO 3166 two-letter country/region codes, for example 'US' or 'GB'.

cv_step_size

Number of periods between the origin_time of one CV fold and the next fold. For example, if cv_step_size = 3 for daily data, the origin time for each fold will be three days apart.
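
The spacing described above can be sketched with plain date arithmetic. The helper name `fold_origins`, the number of folds, and the choice of the last origin date are assumptions made for illustration; how AutoML actually places the folds is not specified here.

```python
from datetime import date, timedelta


def fold_origins(last_origin, cv_step_size, n_folds):
    """Return origin dates for daily data, oldest first.

    Hypothetical helper: consecutive origins are cv_step_size days apart.
    """
    return [
        last_origin - timedelta(days=cv_step_size * k)
        for k in reversed(range(n_folds))
    ]


origins = fold_origins(date(2020, 1, 10), cv_step_size=3, n_folds=3)
for origin in origins:
    print(origin.isoformat())
```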

drop_column_names

The names of columns to drop for forecasting tasks.

dropna

Configure dropna in timeseries data transformer.

feature_lags

Flag for generating lags for the numeric features.

features_unknown_at_forecast_time

The column name(s) of features that are available for training but unknown at forecast/inference time. If this is not defined, it is assumed that all the feature columns are known at forecast time.

forecast_horizon

The desired maximum forecast horizon in units of time-series frequency. The default value is 1. Units are based on the time interval of your training data (for example, monthly or weekly) over which the forecaster should predict.

formatted_drop_column_names

The formatted names of columns to drop for forecasting tasks.

formatted_group_column_names

formatted_target_lags

The formatted number of past periods to lag from the target column.

formatted_time_series_id_column_names

The names of columns used to group a timeseries. It can be used to create multiple series. If time_series_id_column_names is not defined, the data set is assumed to be one time-series.

formatted_unknown_features

The column name(s) of features that are available for training but unknown at forecast/inference time. If this is not defined, it is assumed that all the feature columns are known at forecast time. Only supported in dnn/tcn. When the user does not specify anything, future features are not enabled in dnn. However, if the user provides an empty list, future features are enabled, and all feature columns are assumed to be known at forecast time.

freq

The frequency of the data set.

group_column_names

holiday_country

The country/region used to generate holiday features. These should be ISO 3166 two-letter country/region codes, for example 'US' or 'GB'.

overwrite_columns

Configure overwrite_columns in timeseries data transformer.

seasonality

Time series seasonality as an integer multiple of the series frequency.

short_series_handling_configuration

Return whether short grains should be padded.
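
The synchronization between short_series_handling and short_series_handling_configuration given in the constructor description can be condensed into one rule: the pair resolves to (True, 'auto') only when the legacy flag is True and a configuration is set, and to (False, None) otherwise. The function below is a hypothetical sketch of that table, not the library's own code.

```python
def synchronize(handling, handling_configuration):
    """Hypothetical helper reproducing the documented synchronization table.

    Returns (resulting handling, resulting handling_configuration).
    """
    if handling and handling_configuration is not None:
        return True, "auto"
    return False, None


# Print every row of the documented table.
for handling in (True, False):
    for configuration in ("auto", "pad", "drop", None):
        print(handling, configuration, "->", synchronize(handling, configuration))
```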

target_aggregation_function

Return the target aggregation function.
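
To make the effect of the four documented aggregation functions concrete, the sketch below aggregates a small daily target series to a weekly frequency. Bucketing by ISO week and the helper name `aggregate_weekly` are assumptions chosen to keep the example dependency-free; AutoML aligns data to a pandas frequency grid, which this sketch does not reproduce.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# The four aggregation functions named in the parameter description.
AGGREGATORS = {"sum": sum, "max": max, "min": min, "mean": mean}


def aggregate_weekly(observations, target_aggregation_function):
    """Aggregate (day, value) pairs per ISO week (hypothetical helper)."""
    aggregate = AGGREGATORS[target_aggregation_function]
    buckets = defaultdict(list)
    for day, value in observations:
        iso = day.isocalendar()
        buckets[(iso[0], iso[1])].append(value)  # key: (ISO year, ISO week)
    return {week: aggregate(values) for week, values in sorted(buckets.items())}


daily = [
    (date(2020, 1, 6), 10),   # Monday, ISO week 2
    (date(2020, 1, 7), 20),
    (date(2020, 1, 13), 5),   # Monday, ISO week 3
    (date(2020, 1, 14), 15),
]
weekly_sum = aggregate_weekly(daily, "sum")
weekly_mean = aggregate_weekly(daily, "mean")
print(weekly_sum)
print(weekly_mean)
```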

target_lags

The number of past periods to lag from the target column.
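
The auto-detection procedure described for the target_lags constructor parameter can be sketched as follows, assuming the PACF values have already been estimated elsewhere and are passed in as a list whose first element is lag 0. The function names are hypothetical, and the sketch covers only the threshold and scanning steps, not the de-seasonalization or the PACF estimation itself.

```python
import math


def max_lag_order(periods_to_next_granularity):
    """Twice the number of periods to the next granularity (7 -> 14, 4 -> 8)."""
    return 2 * periods_to_next_granularity


def significance_threshold(max_lag):
    """95% significance bound: 1.96 / sqrt(maximal lag value)."""
    return 1.96 / math.sqrt(max_lag)


def select_lag(pacf_values, max_lag):
    """Pick the lag before the first insignificant PACF value.

    pacf_values[0] is assumed to be lag 0 (the series correlated with itself).
    Returns 0 when no look-back features should be created.
    """
    threshold = significance_threshold(max_lag)
    significant = [abs(value) > threshold for value in pacf_values]
    if all(significant):
        return 0  # strong seasonality: no look-back features
    first_insignificant = significant.index(False)
    return max(first_insignificant - 1, 0)


daily_max_lag = max_lag_order(7)
print(daily_max_lag, round(significance_threshold(daily_max_lag), 4))
print(select_lag([1.0, 0.8, 0.6, 0.2, 0.7], daily_max_lag))
```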

target_rolling_window_size

time_column_name

The name of the time column. This parameter is required when forecasting to specify the datetime column in the input data used for building the time series and inferring its frequency.

time_series_id_column_names

The names of columns used to group a timeseries. It can be used to create multiple series. If time_series_id_column_names is not defined, the data set is assumed to be one time-series.

transform_dictionary

Configure transform_dictionary in timeseries data transformer.

use_stl

Configure STL Decomposition of the time-series target column. use_stl can take three values: None (default) - no STL decomposition, 'season' - only generate the season component, and 'season_trend' - generate both season and trend components.

DEFAULT_TIMESERIES_VALUE

DEFAULT_TIMESERIES_VALUE = {'_enable_future_regressors': False, 'cv_step_size': 'auto', 'feature_lags': None, 'features_unknown_at_forecast_time': None, 'forecast_horizon': 1, 'freq': None, 'max_horizon': 1, 'seasonality': 'auto', 'short_series_handling': True, 'short_series_handling_configuration': 'auto', 'target_aggregation_function': None, 'target_lags': None, 'target_rolling_window_size': None, 'use_stl': None}

DEPRECATED_DICT

DEPRECATED_DICT = {'country': 'country_or_region_for_holidays', 'country_or_region': 'country_or_region_for_holidays', 'grain_column_names': 'time_series_id_column_names', 'holiday_country': 'country_or_region_for_holidays', 'max_horizon': 'forecast_horizon'}
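
One way to read this mapping is as a rename table from deprecated parameter names to their current equivalents. The sketch below applies it to a plain settings dict; the helper `rename_deprecated` and the rule that an explicitly set current name wins over a deprecated alias are assumptions for illustration, and the behavior of from_parameters_dict may differ.

```python
# Copied from the class constant above.
DEPRECATED_DICT = {
    "country": "country_or_region_for_holidays",
    "country_or_region": "country_or_region_for_holidays",
    "grain_column_names": "time_series_id_column_names",
    "holiday_country": "country_or_region_for_holidays",
    "max_horizon": "forecast_horizon",
}


def rename_deprecated(parameter_dict):
    """Return a copy with deprecated keys renamed (hypothetical helper)."""
    renamed = {}
    for name, value in parameter_dict.items():
        new_name = DEPRECATED_DICT.get(name, name)
        # Assumption: a deprecated alias never overrides the current name.
        if name != new_name and new_name in renamed:
            continue
        renamed[new_name] = value
    return renamed


legacy = {"max_horizon": 7, "grain_column_names": ["store"], "freq": "D"}
print(rename_deprecated(legacy))
```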

EMPTY_TIME_COLUMN_NAME

EMPTY_TIME_COLUMN_NAME = '_EMPTY_TIME_COLUMN_NAME'

MAX_LAG_LENGTH

MAX_LAG_LENGTH = 2000
