Automate Jupyter Notebooks for diagnostics

Logic Apps
Automation
Azure DevOps
Functions

This article applies to businesses or teams that want to reduce their manual processes, also known as toil, in troubleshooting or diagnostics. Specifically, this solution shows how to write troubleshooting guides or diagnostic steps in Jupyter Notebooks that you can reuse, test, and automate.

Beyond troubleshooting and diagnostics, you can apply this methodology to routine scenarios that benefit from automation but that sometimes require manual execution. Examples include backing up and restoring a database or creating a tutorial on how to diagnose application health issues.

You can use Azure Logic Apps and Azure Automation to automate the troubleshooting guides, diagnostic steps, or other tasks that you have in Jupyter Notebooks. You can create, edit, and test Jupyter Notebooks manually by using your favorite client tools, like Visual Studio Code and Azure Data Studio.

Potential use cases

  • Automate routine tasks like data extraction, transformation, or loading.
  • Trigger mitigation steps for common problems.

Architecture

Diagram that shows how to automate diagnostic notebooks by using an Azure serverless architecture.


Workflow

This scenario covers a diagnostic/troubleshooting development and operations flow at a high level.

  • Team members use Azure Data Studio to write, view, and run the diagnostic or troubleshooting notebooks in Jupyter Notebook format. The notebooks include code for troubleshooting issues and descriptions that explain the troubleshooting steps. The notebook author can write the code in languages like Python, PowerShell, or .NET Interactive (C# and other .NET languages). .NET Interactive Jupyter Notebooks in Visual Studio Code support polyglot notebooks, which let you use more than one language in a single notebook.

  • GitHub or Azure DevOps is used as source control for the reusable notebooks. You can use GitHub Actions or Azure Pipelines to run additional checks that enforce organizational policies, like credential scans.

  • A task management system or an incident response system is used to log, assign, and resolve issues. You can use any task management system, like Microsoft Planner.

  • When a new issue is created, a specific condition in Logic Apps triggers the next step: running an Automation job.

  • The Automation job runbook runs the relevant diagnostic notebooks when a certain condition occurs, for example, when a task returns a message stating that the disk is full.

    The runbook can use either the Python or the PowerShell runtime.

  • The runbook stores the output notebooks in Azure Blob Storage, retrieves the URI to be posted back to the task description in Planner, and sends an email with the notebook URI to the assigned person.

  • The assigned person uses the link posted in the task in Planner or included in the email to review the executed notebook in Azure Data Studio.
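
The runbook steps in this workflow can be sketched in Python as follows. This is a minimal illustration, not the implementation used by the solution: the trigger phrases, the notebook file names, and the `execute` and `upload` callables are hypothetical placeholders. In a real runbook, `execute` would run the notebook with whatever tool you choose, and `upload` would write the result to Azure Blob Storage and return its URI.

```python
import os
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional

# Hypothetical rules: a phrase found in the issue text selects the notebook to run.
NOTEBOOK_RULES = {
    "disk is full": "diagnose-disk-space.ipynb",
    "high cpu": "diagnose-cpu.ipynb",
}


@dataclass
class RunResult:
    notebook: str    # notebook that was selected and executed
    output_uri: str  # URI of the stored output notebook


def select_notebook(issue_text: str) -> Optional[str]:
    """Return the first notebook whose trigger phrase appears in the issue text."""
    lowered = issue_text.lower()
    for phrase, notebook in NOTEBOOK_RULES.items():
        if phrase in lowered:
            return notebook
    return None


def output_blob_name(issue_id: str, notebook: str, now: datetime) -> str:
    """Build a per-issue, timestamped name so repeated runs do not overwrite each other."""
    stem = os.path.splitext(notebook)[0]
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    return f"{issue_id}/{stem}-{stamp}.ipynb"


def handle_issue(
    issue_id: str,
    issue_text: str,
    execute: Callable[[str], bytes],      # assumption: runs a notebook, returns the executed file
    upload: Callable[[str, bytes], str],  # assumption: stores the bytes, returns the blob URI
    now: Optional[datetime] = None,
) -> Optional[RunResult]:
    """Select, run, and store the diagnostic notebook for one issue."""
    notebook = select_notebook(issue_text)
    if notebook is None:
        return None  # no rule matched, so nothing to automate for this issue
    now = now or datetime.now(timezone.utc)
    executed = execute(notebook)
    uri = upload(output_blob_name(issue_id, notebook, now), executed)
    return RunResult(notebook=notebook, output_uri=uri)
```

The returned `output_uri` is the value that the workflow posts back to the task and includes in the email. Passing `execute` and `upload` as parameters keeps the selection logic testable without access to Azure.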

Components

Alternatives

You can use Azure Functions instead of Automation to run the notebooks.

Considerations

Scalability

As a best practice, make each notebook modular to promote reusability. You can store execution logic in Logic Apps. Think about how much of the logic you want to manage in Logic Apps and how much you want to manage in notebooks.
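
One way to keep a notebook modular is to isolate its inputs in a single cell so that the caller can override them. The following sketch builds a minimal notebook file (nbformat 4) with a description cell, an input cell, and a diagnostic cell. The `parameters` tag follows a convention used by notebook execution tools such as papermill. The function name, the input names, and the disk check are illustrative assumptions, not part of the original solution.

```python
import json


def make_diagnostic_notebook(title, description, parameters, code):
    """Build a minimal nbformat 4 notebook as a plain dictionary."""

    def code_cell(source, tags=()):
        return {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {"tags": list(tags)} if tags else {},
            "outputs": [],
            "source": source,
        }

    # All inputs live in one cell, one assignment per line, so a caller can
    # replace them without touching the diagnostic logic.
    param_source = "\n".join(
        f"{name} = {value!r}" for name, value in parameters.items()
    )
    return {
        "cells": [
            {
                "cell_type": "markdown",
                "metadata": {},
                "source": f"# {title}\n\n{description}",
            },
            code_cell(param_source, tags=("parameters",)),
            code_cell(code),
        ],
        "metadata": {
            "kernelspec": {
                "display_name": "Python 3",
                "language": "python",
                "name": "python3",
            }
        },
        "nbformat": 4,
        "nbformat_minor": 4,
    }


if __name__ == "__main__":
    # Hypothetical disk-space check used only to illustrate the structure.
    notebook = make_diagnostic_notebook(
        title="Disk space check",
        description="Reports how full a volume is.",
        parameters={"path": "/", "threshold_percent": 90},
        code=(
            "import shutil\n"
            "usage = shutil.disk_usage(path)\n"
            "print(round(100 * usage.used / usage.total), threshold_percent)"
        ),
    )
    print(json.dumps(notebook, indent=1))
```

With this layout, the decision about which notebook to run and with which inputs can stay in Logic Apps or the runbook, while the notebook contains only the diagnostic steps.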

Security

A user-assigned managed identity is a good way to grant Automation runbooks access to the other Azure resources that they need. For example, when the Automation runbook runs Invoke-SqlNotebook against an Azure SQL database, the Automation account requires the appropriate access to the database. This authorization is best managed via a user-assigned managed identity that corresponds to a user or a role in Azure SQL.
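
As an illustration, a Python runbook might request an access token for the user-assigned identity roughly as follows. This sketch assumes that the Automation sandbox exposes the `IDENTITY_ENDPOINT` and `IDENTITY_HEADER` environment variables and accepts the `X-IDENTITY-HEADER` and `Metadata` request headers, and that `https://database.windows.net/` is the token resource for Azure SQL. Verify these details against the current Automation documentation before you rely on them. The function names and the client ID value are placeholders.

```python
import json
import os
import urllib.parse
import urllib.request


def build_token_request(resource, client_id, env=os.environ):
    """Build the token request for a user-assigned managed identity.

    Assumption: IDENTITY_ENDPOINT and IDENTITY_HEADER are set by the
    Automation runtime. `env` is injectable so the function can be tested
    outside Azure.
    """
    query = urllib.parse.urlencode({"resource": resource, "client_id": client_id})
    return urllib.request.Request(
        f"{env['IDENTITY_ENDPOINT']}?{query}",
        headers={"X-IDENTITY-HEADER": env["IDENTITY_HEADER"], "Metadata": "True"},
        method="GET",
    )


def get_access_token(resource, client_id):
    """Send the request and return the bearer token from the JSON response."""
    request = build_token_request(resource, client_id)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["access_token"]


if __name__ == "__main__":
    # Placeholder client ID of the user-assigned identity; runs only inside Automation.
    token = get_access_token("https://database.windows.net/", "<client-id>")
    print("Token acquired, length:", len(token))
```

The runbook can then present the token when it connects to the database, so no password or connection secret has to be stored in the notebook or the runbook.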

DevOps

If you use Azure DevOps as a host for your repository, be sure to use Git for source control (instead of Team Foundation Version Control). We recommend Git because both Azure Data Studio and Visual Studio Code support it natively.

Pricing

A pricing estimate is available here. The price depends on the size of the notebook output and the workflow definition in Logic Apps (for example, how often it triggers and how long it runs).

Contributors

This article is maintained by Microsoft. It was originally written by the following contributors.

Principal authors:

Next steps

Watch From Oops to Ops: Incident Response with Jupyter Notebooks to learn more about how to put this solution together and the motivation behind it.

See these resources: