SQL Server 2016 Temporal Data Assists Machine Learning Models
Microsoft is always seeking ways to improve customer experience and satisfaction. One currently active project applies Machine Learning to the SQL Server incidents reported to Microsoft SQL Server Support. A specific aspect of the project is to predict when a case needs advanced assistance (escalation, onsite, development, or upper-level management assistance).
Not every model requires historical data, but while working with our data scientists I realized the importance of temporal data as it relates to our approach. We are trying to predict on day 1, 2, 3, … that an issue has a high probability of requiring advanced assistance. While building the training set for Machine Learning, it became clear that the model needs to understand what the issue(s) looked like on day 1, day 2, and so forth.
I made a quick chart in Excel to help us all visualize the concept.
If I want to predict whether an issue has a high probability of needing advanced assistance, I need to know what an advanced issue looked like over time. If I take the training values as they stood when the incident was resolved, Machine Learning is limited to learning the resolution patterns.
Let me expound on this a bit more. If I provide training data from day 10 to the machine learning model, I am influencing the model's accuracy at day 10. The model can be very accurate for day 10, but I want to predict which issues need assistance and address them on day 1.
Using a temporal approach, the training data is expanded to each day in the life of the advanced incidents. The model now understands what an advanced issue looked like on day 1, 2, … allowing it to provide relevant predictions. When a case has high relevancy in this model, we can adjust resources and assist the customer quickly.
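To make the "what did the issue look like on day N" idea concrete, here is a minimal Python sketch of an as-of lookup over a change history. It is only an illustration: the `history` rows, the `severity` and `owner_changes` attributes, and the helper names `state_as_of` and `daily_snapshots` are all hypothetical, not the project's actual schema or code.

```python
from bisect import bisect_right
from datetime import date, timedelta

# Hypothetical change history for one incident: (valid_from, attributes).
# Each row holds the values that became current on valid_from.
history = [
    (date(2016, 3, 1), {"severity": "C", "owner_changes": 0}),
    (date(2016, 3, 3), {"severity": "B", "owner_changes": 1}),
    (date(2016, 3, 7), {"severity": "A", "owner_changes": 3}),
]


def state_as_of(history, when):
    """Return the attributes that were current on `when`, or None if the
    incident did not exist yet. Assumes history is sorted by valid_from."""
    starts = [valid_from for valid_from, _ in history]
    idx = bisect_right(starts, when) - 1
    return history[idx][1] if idx >= 0 else None


def daily_snapshots(history, opened, days):
    """Return one (day_number, attributes) pair per day of the incident's life."""
    return [
        (day, state_as_of(history, opened + timedelta(days=day - 1)))
        for day in range(1, days + 1)
    ]


snapshots = daily_snapshots(history, opened=date(2016, 3, 1), days=10)
for day, attrs in snapshots:
    print(day, attrs)
```

Each snapshot reflects only what was known on that day, which is the view the model needs if it is going to predict on day 1 rather than at resolution.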
I prefer easy math so let’s assume 1000 issues needed advanced assistance over the past year and each of them took 10 days to resolve. Instead of a training set of 1000 issues at the point of resolution, applying a temporal design expands the training set to 1000 x 10 = 10,000 views.
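The same easy math can be sketched in a few lines of Python. This is an illustration under assumed names: the `incident_id`, `days_to_resolve`, and `needed_advanced_assistance` fields and the `expand_to_daily_rows` helper are hypothetical, and real rows would also carry the attribute values as they stood on each day.

```python
def expand_to_daily_rows(incidents):
    """Emit one training row per incident per day of its life."""
    rows = []
    for incident in incidents:
        for day in range(1, incident["days_to_resolve"] + 1):
            rows.append(
                {
                    "incident_id": incident["incident_id"],
                    "day": day,
                    # Label is the final outcome; every incident in this
                    # hypothetical set needed advanced assistance.
                    "needed_advanced_assistance": True,
                }
            )
    return rows


# 1000 hypothetical advanced incidents, each taking 10 days to resolve.
incidents = [{"incident_id": i, "days_to_resolve": 10} for i in range(1000)]
rows = expand_to_daily_rows(incidents)

print(len(incidents), len(rows))  # 1000 10000
```

In practice incidents resolve in different numbers of days, so the expanded set is the sum of the individual lifetimes rather than a clean multiple.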
When using Machine Learning, carefully consider whether SQL Server 2016 Temporal Tables are relevant to the accuracy and design of your model.
Bob Dorr - Principal SQL Server Escalation Engineer