AutoML: problem with univariate time series forecasting

Anonymous
2021-04-06T20:01:53.07+00:00

I'm having trouble generating univariate time series forecasts with Azure Automated Machine Learning (I know...).

What I'm doing

So I have about 5 years' worth of monthly observations in a dataframe that looks like this:

date target_value
2015-02-01 123
2015-03-01 456
2015-04-01 789
... ...

I want to forecast target_value based on past values of target_value, i.e. univariate forecasting, as ARIMA does for instance.
So I am setting up the AutoML forecast like this:

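import logging

from azureml.core import Dataset
from azureml.train.automl import AutoMLConfig
from azureml.automl.core.forecasting_parameters import ForecastingParameters

# that's the dataframe as shown above
# (datastore and my_remote_filename are defined in code not shown here)
train_data = Dataset.Tabular.from_delimited_files(path=datastore.path(my_remote_filename))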
  
# ...other code...  
  
forecasting_parameters = ForecastingParameters(  
    time_column_name='date',  
    forecast_horizon=2,  
    target_lags='auto',  
    freq='MS'  
)  
  
automl_config = AutoMLConfig(task='forecasting',  
                             debug_log='automl_forecasting_function.log',  
                             primary_metric='normalized_root_mean_squared_error',  
                             enable_dnn=True,  
                             experiment_timeout_hours=8.0,  
                             enable_early_stopping=True,  
                             training_data=train_data,  
                             compute_target='my-cluster',  
                             n_cross_validations=3,  
                             verbosity=logging.INFO,  
                             max_concurrent_iterations=4,  
                             max_cores_per_iteration=-1,  
                             label_column_name='target_value',  
                             forecasting_parameters=forecasting_parameters)  

What the problem is

But AutoML does not seem to generate the forecast for target_value based on past values of target_value. It seems to use the date column as the independent variable!
The feature importance chart also shows date as the input feature:

[Image: feature importance chart (84928-5ajgr.png)]

As a side note: running multivariate forecasts works fine.
When I use a dataset like this, feature_1 and feature_2 are used as the X to forecast target_value (the y):

date feature_1 feature_2 target_value
2015-02-01 10 7 123
2015-03-01 30 2 456
2015-04-01 20 5 789
... ... ... ...

My question, therefore:
How do I need to set up a univariate AutoML forecast so that target_value is forecast from past observations of target_value?
I assumed that generating lagged values of target_value etc. is exactly what AutoML is supposed to do.
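
To make "lagged values" concrete, here is a minimal plain-Python sketch of the kind of input I have in mind. The helper make_lagged_rows is hypothetical (it is not part of the Azure ML SDK), and it only illustrates the idea; it is not a description of how AutoML builds its lag features internally.

```python
# Hypothetical helper for illustration only; not an Azure ML SDK function.
def make_lagged_rows(values, n_lags):
    """Pair each observation with the n_lags observations that precede it."""
    rows = []
    for i in range(n_lags, len(values)):
        # Most recent value first: lag 1, lag 2, ...
        lags = values[i - n_lags:i][::-1]
        rows.append((lags, values[i]))
    return rows


# The three observations shown in the dataframe above.
target_value = [123, 456, 789]

lagged_rows = make_lagged_rows(target_value, n_lags=1)
for lags, y in lagged_rows:
    print("lags:", lags, "-> target:", y)
```

With one lag, the first observation has no predecessor, so only two training rows remain; each row uses the previous month's value as its single input.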

Thanks!

Azure Machine Learning

1 answer

  1. Ramr-msft 17,836 Reputation points
    2021-04-08T04:12:36.817+00:00

    @Anonymous Thanks for the question. AutoML does use the date column as an independent variable: we engineer several features from it, which is standard practice for learning seasonal patterns. In this scenario the date column is featurized into 'day', 'month', 'day of week', etc. Regression-based models are then trained on this data and use the generated columns for prediction.
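
As a rough illustration of what featurizing a date column means, the sketch below derives a few calendar features with the Python standard library. The function name calendar_features and the exact set of output columns are assumptions made for this example; the features AutoML actually generates may differ.

```python
from datetime import date


# Hypothetical helper for illustration; AutoML's real featurizer may produce
# a different set of columns.
def calendar_features(d):
    """Derive simple calendar features from a single date."""
    return {
        "year": d.year,
        "month": d.month,
        "day": d.day,
        "day_of_week": d.weekday(),  # Monday = 0, Sunday = 6
    }


# The three dates shown in the question.
dates = [date(2015, 2, 1), date(2015, 3, 1), date(2015, 4, 1)]
features = [calendar_features(d) for d in dates]

for d, f in zip(dates, features):
    print(d.isoformat(), f)
```

Note that for monthly data stamped on the first of each month, the 'day' feature is constant, so 'month' and 'year' carry most of the seasonal and trend information.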

    Please remove target_lags='auto' to allow Arima to be selected. We currently have to block certain models (e.g. Arima) when target lags are set. This is a product gap that we're in the process of fixing.
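
The change suggested above can be sketched as follows. To keep the example runnable without the Azure ML SDK, the settings from the question are held in a plain dict (the names forecasting_kwargs and univariate_kwargs are made up for this example); the commented lines show how they would presumably be passed to ForecastingParameters, as in the question's code.

```python
# Forecasting settings copied from the question, kept as a plain dict so the
# change is easy to see. The dict names are illustrative only.
forecasting_kwargs = {
    "time_column_name": "date",
    "forecast_horizon": 2,
    "target_lags": "auto",
    "freq": "MS",
}

# Drop target_lags so that models such as Arima are no longer blocked.
univariate_kwargs = {
    key: value
    for key, value in forecasting_kwargs.items()
    if key != "target_lags"
}

print(univariate_kwargs)

# With the SDK installed, the remaining settings would be passed on unchanged:
# from azureml.automl.core.forecasting_parameters import ForecastingParameters
# forecasting_parameters = ForecastingParameters(**univariate_kwargs)
```

The rest of the AutoMLConfig from the question stays as it is; only the target_lags argument is left unset.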

