I currently have an ML pipeline built in Azure ML Designer that performs the following steps:
1. Trains a model across the full quantitative range of [0.0, 7.0] from a full baseline dataset.
2. Scores the model to generate predictions, which are appended to the actuals in a scored dataset.
3. Passes the scored dataset to R scripts that row-filter it into 3 segmented quantitative ranges: range 1 = [0.0, 1.0], range 2 = (1.0, 3.3], and range 3 = (3.3, 7.0].
4. Trains 1 new model on each of the 3 filtered datasets from step 3, across the corresponding segmented range, using the same predictors as step 1 plus 1 new predictor: the predictions generated in step 2.
5. Scores each model from step 4 to generate predictions, which are appended to the actuals in scored datasets.
The idea here is that the predictions generated in step 2 are actually treated as an estimate, which is then used as an engineered feature in subsequent trainings to refine the predictions. I've provided the schematic of this pipeline in Designer.PNG.
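To make the row filtering and the extra predictor concrete, here is a minimal plain-Python sketch of steps 3 and 4, not the actual R scripts. The column names (`x1`, `x2`, `actual`, `prediction`) and the helper functions are hypothetical, and filtering on the step 2 prediction rather than on the actual is an assumption made only for illustration.

```python
from bisect import bisect_left

# Upper bounds of range 1 and range 2; range 3 runs up to FULL_MAX.
UPPER_BOUNDS = [1.0, 3.3]
FULL_MIN, FULL_MAX = 0.0, 7.0


def segment_index(value):
    """Return 0, 1, or 2 for [0.0, 1.0], (1.0, 3.3], (3.3, 7.0]."""
    if not FULL_MIN <= value <= FULL_MAX:
        raise ValueError(f"{value} is outside [{FULL_MIN}, {FULL_MAX}]")
    # bisect_left keeps a value equal to a bound in the lower segment,
    # which yields right-closed intervals.
    return bisect_left(UPPER_BOUNDS, value)


def split_scored_rows(rows, filter_column="prediction"):
    """Split scored rows into three lists, one per segmented range.

    filter_column is an assumption: the filter could equally be
    applied to the actuals.
    """
    segments = [[], [], []]
    for row in rows:
        segments[segment_index(row[filter_column])].append(row)
    return segments


def stage_two_features(row, predictors, prediction_column="prediction"):
    """Original predictors plus the first-stage prediction as one extra feature."""
    return [row[name] for name in predictors] + [row[prediction_column]]


# Hypothetical scored rows, as they might look after step 2.
scored = [
    {"x1": 0.2, "x2": 5.0, "actual": 0.9, "prediction": 1.0},
    {"x1": 0.7, "x2": 3.0, "actual": 2.8, "prediction": 3.3},
    {"x1": 1.4, "x2": 1.0, "actual": 6.1, "prediction": 5.7},
]
low, mid, high = split_scored_rows(scored)
```

Each of `low`, `mid`, and `high` would then feed one of the three second-stage trainings, with `stage_two_features` building the input vector for each row.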
Before I built this in Designer, I ran all of these steps manually with Azure's AutoML service as a proof of concept, and the results were very promising. But passing, filtering, and splitting all the data by hand is difficult, so I would like a pipeline to do it for me.
The problem? Ideally, each of the 4 trained models should be an ensemble model, and I'm not seeing a way to get ensemble models out of Azure ML Designer.
Is there a way to use AutoML within Designer? Or is there a way to remove the regression/train components from my Designer pipeline and replace them with an already trained model stored in our workspace?