Test data for time sequential data
If I am trying to predict the weather, the stock market, coffee sales per city, etc., I can't see a good way to split the data into training and test sets. For the weather case, a model trained on Honolulu weather isn't going to do well when tested on Denver weather.
Is it a good approach to train on the data from 5 years ago to 1 year ago, then test with the most recent year of data?
Or is there a better approach?
thanks - dave
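For illustration, the split proposed in the question (older years for training, the most recent year for testing) could look roughly like the sketch below. The daily readings, the `split_by_cutoff` helper, and the cutoff date are all made up for the example; they are not from any real dataset or library.

```python
from datetime import date, timedelta


def split_by_cutoff(rows, cutoff):
    """Rows dated before `cutoff` go to train; rows on or after it go to test."""
    train = [r for r in rows if r[0] < cutoff]
    test = [r for r in rows if r[0] >= cutoff]
    return train, test


# Hypothetical data: one (date, temperature) reading per day for a single city.
start = date(2019, 1, 1)
rows = [(start + timedelta(days=i), 50 + (i % 30)) for i in range(5 * 365)]

# Assumed cutoff: everything before 2023 trains the model, 2023 is held out.
cutoff = date(2023, 1, 1)
train, test = split_by_cutoff(rows, cutoff)

print(len(train), len(test))
```

Because the split is by date, every test row is later than every training row, so the model is never evaluated on a period it has already seen.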
@David Thielen Based on a couple of experiments I ran on weather prediction, the dataset should cover various parameters for a given location over a long duration to get accurate results. The model should be able to predict irrespective of location if the training data is varied and the right sample period is chosen.
The data was split roughly 70:30 for training and testing. The general guidance I have seen is to split data in a similar ratio.
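As a rough sketch of a 70:30 split on time-ordered data: the answer does not say whether the split was random or chronological, so this example assumes a chronological one, where the first 70% of the readings train the model and the last 30% test it. The `ordered_split` helper and the sample values are hypothetical.

```python
def ordered_split(series, train_fraction=0.7):
    """Split a time-ordered sequence without shuffling (assumed approach)."""
    n_train = round(len(series) * train_fraction)
    return series[:n_train], series[n_train:]


# Hypothetical readings, already sorted from oldest to newest.
readings = list(range(100))
train_part, test_part = ordered_split(readings)

print(len(train_part), len(test_part))
```

Keeping the original order means the test portion is always the most recent 30% of the data.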
Independent of Azure ML, I have seen this sample accurately describe the weather forecasting scenario, and it could help.
Having lived in both Honolulu and Denver, I can say there's no way that Honolulu weather (basically 80 degrees year round) can train a prediction for Denver (where a week after a snowstorm it's shirt-sleeve weather).