Scenario:
I have 60 data factories processing files every day, and sometimes an activity run in one of them fails. I want to build a monitoring system that detects these failures and sends a notification to Slack.
Solution 1:
Create a Log Analytics workspace and send the ActivityRun logs from all data factories to it. Create a log alert rule that runs every 5 minutes; its query filters activity runs from the last 5 minutes and exposes dimensions such as RunError, RunErrorUrl, and Parameters. If the query returns any rows, the alert fires an action group that includes a webhook to a Logic App. The Logic App uses the passed dimensions to build a user-friendly Slack message, for example:
RunError: dimensions.RunError
RunErrorUrl: dimensions.RunErrorUrl
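For reference, the alert query would be roughly along the lines of the sketch below. It assumes the resource-specific ADFActivityRun table and standard column names (Status, ErrorMessage, and so on); the dimensions I actually expose (RunError, RunErrorUrl, Parameters) would map onto whatever columns the query projects.

```
// Rough sketch of the alert query. Assumes the resource-specific
// ADFActivityRun table and standard column names; adjust to match
// your diagnostic settings and the dimensions you expose.
ADFActivityRun
| where TimeGenerated > ago(5m)
| where Status == "Failed"
| project TimeGenerated, _ResourceId, PipelineName, ActivityName, ErrorCode, ErrorMessage
```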
Solution 2:
Create a Log Analytics workspace and send the ActivityRun logs from all data factories to it. Create a Logic App whose first step is a Recurrence trigger running every 5 minutes. It then queries the workspace for activity runs from the last 5 minutes and builds a Slack message from the results.
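The query run by the Logic App could be essentially the same sketch as above, except that the query itself could already compose the message text to post to Slack, for example:

```
// Same assumptions as the sketch above (ADFActivityRun table, column names).
// Here the query pre-formats the Slack message text.
ADFActivityRun
| where TimeGenerated > ago(5m)
| where Status == "Failed"
| project SlackText = strcat(
    "Pipeline: ", PipelineName,
    " | Activity: ", ActivityName,
    " | Error: ", ErrorMessage)
```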
Which solution would be more efficient and cost-effective? Thank you in advance!