Get started with Jupyter notebooks and MSTICPy in Microsoft Sentinel

This article describes how to run the Getting Started Guide For Microsoft Sentinel ML Notebooks notebook, which sets up basic configurations for running Jupyter notebooks in Microsoft Sentinel and running simple data queries.

The Getting Started Guide for Microsoft Sentinel ML Notebooks notebook uses MSTICPy, a Python library of cybersecurity tools built by Microsoft, which provides threat hunting and investigation functionality.

MSTICPy reduces the amount of code that customers need to write for Microsoft Sentinel, and provides:

  • Data query capabilities, against Microsoft Sentinel tables, Microsoft Defender for Endpoint, Splunk, and other data sources.
  • Threat intelligence lookups with TI providers, such as VirusTotal and AlienVault OTX.
  • Enrichment functions like geolocation of IP addresses, Indicator of Compromise (IoC) extraction, and WhoIs lookups.
  • Visualization tools using event timelines, process trees, and geo mapping.
  • Advanced analyses, such as time series decomposition, anomaly detection, and clustering.

The steps in this article describe how to run the Getting Started Guide for Microsoft Sentinel ML Notebooks notebook in your Azure Machine Learning workspace via Microsoft Sentinel. You can also use this article as guidance for performing similar steps to run notebooks in other environments, including locally.

For more information, see Use notebooks to power investigations and Use Jupyter notebooks to hunt for security threats.

Several Microsoft Sentinel notebooks don't use MSTICPy, such as the Credential Scanner notebooks, or the PowerShell and C# examples. Notebooks that don't use MSTICPy don't need the MSTICPy configuration described in this article.


Microsoft Sentinel is now generally available within the Microsoft unified security operations platform in the Microsoft Defender portal. For more information, see Microsoft Sentinel in the Microsoft Defender portal.


Before you begin, make sure you have the required permissions and resources.

  • To use notebooks in Microsoft Sentinel, make sure that you have the required permissions. For more information, see Manage access to Microsoft Sentinel notebooks.

  • To perform the steps in this article, you need Python 3.6 or later. In Azure Machine Learning, you can use either a Python 3.8 kernel (recommended) or a Python 3.6 kernel.

  • This notebook uses the MaxMind GeoLite2 geolocation lookup service for IP addresses. To use the MaxMind GeoLite2 service, you need an account key. You can sign up for a free account and key at the MaxMind signup page.

  • This notebook uses VirusTotal (VT) as a threat intelligence source. To use VirusTotal threat intelligence lookup, you need a VirusTotal account and API key.

    You can sign up for a free VT account at the VirusTotal getting started page. If you're already a VirusTotal user, you can use your existing key.


    If you're using a VT enterprise key, store it in Azure Key Vault instead of the msticpyconfig.yaml file. For more information, see Specify secrets as Key Vault secrets in the MSTICPy documentation.

    If you don’t want to set up an Azure Key Vault right now, sign up for and use a free account until you can set up Key Vault storage.
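The Python requirement above can be checked from any notebook cell. The following is a minimal sketch that uses only the standard library; the helper name `meets_minimum_python` is hypothetical and isn't part of MSTICPy.

```python
import sys

# Minimum version taken from the prerequisites above.
MIN_PYTHON = (3, 6)


def meets_minimum_python(version_info=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor, ...) version satisfies the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


print("Python version OK:", meets_minimum_python())
```

Calling the helper with an explicit tuple, such as `meets_minimum_python((3, 5, 2))`, lets you test the check without changing kernels.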

Run and initialize the Getting Started Guide notebook

This procedure describes how to launch your notebook and initialize MSTICPy.

  1. For Microsoft Sentinel in the Azure portal, under Threat management, select Notebooks.
    For Microsoft Sentinel in the Defender portal, select Microsoft Sentinel > Threat management > Notebooks.

  2. From the Templates tab, select A Getting Started Guide For Microsoft Sentinel ML Notebooks.

  3. Select Create from template.

  4. Edit the name and select the Azure Machine Learning workspace as appropriate.

  5. Select Save to save it to your Azure Machine Learning workspace.

  6. Select Launch notebook to run the notebook. The notebook contains a series of cells:

    • Markdown cells contain text and graphics with instructions for using the notebook
    • Code cells contain executable code that performs the notebook functions
  7. Read and run the code cells in order. Skipping cells or running them out of order might cause errors later in the notebook.

    Run each cell by selecting the play button to the left of each cell. Depending on the function being performed, the code in the cell might run quickly, or it might take a few seconds to complete.

    When the cell is running, the play button changes to a loading spinner, and a status of Executing is displayed at the bottom of the cell, together with the elapsed time.

    If your notebook doesn't seem to be working as described, restart the kernel and run the notebook from the beginning. For example, if any cell in the Getting Started Guide notebook takes longer than a minute to run, try restarting the kernel and re-running the notebook.

    The Getting Started Guide notebook includes instructions for the basic use of Jupyter notebooks, including restarting the Jupyter kernel.

    After you complete reading and running the cells in the What is a Jupyter Notebook section, you're ready to start the configuration tasks, beginning in the Setting up the notebook environment section.

  8. Run the first code cell in the Setting up the notebook environment section of your notebook, which includes the following code:

    # import some modules needed in this cell
    from pathlib import Path
    from IPython.display import display, HTML
    # Minimum MSTICPy version to install; the value here is an example,
    # use the value set in your copy of the notebook
    REQ_MSTICPY_VER = "1.2.3"
    display(HTML("Checking upgrade to latest msticpy version"))
    %pip install --upgrade --quiet msticpy[azuresentinel]>=$REQ_MSTICPY_VER
    # initialize msticpy
    from msticpy.nbtools import nbinit
    nbinit.init_notebook(
        namespace=globals(),
        extra_imports=["urllib.request, urlretrieve"]
    )
    pd.set_option("display.html.table_schema", False)

    The initialization status is shown in the output. Configuration warnings about missing settings in the msticpyconfig.yaml file are expected because you haven't configured anything yet.

Create your configuration file

After the basic initialization, you're ready to create your configuration file with basic settings for working with MSTICPy.

Many Microsoft Sentinel notebooks connect to external services such as VirusTotal (VT) to collect and enrich data. To connect to these services you need to set and store configuration details, such as authentication tokens. Having this data in your configuration file avoids you having to type in authentication tokens and workspace details each time you use a notebook.

MSTICPy uses a msticpyconfig.yaml file to store a wide range of configuration details. By default, the notebook initialization function generates a msticpyconfig.yaml file. If you cloned this notebook from the Microsoft Sentinel portal, the configuration file is populated with Microsoft Sentinel workspace data. This data is read from a config.json file, which is created in the Azure Machine Learning workspace when you launch your notebook. For more information, see the MSTICPy Package Configuration documentation.

The following sections describe how to add more configuration details to the msticpyconfig.yaml file.

If you run the Getting Started Guide notebook again, and already have a minimally configured msticpyconfig.yaml file, the init_notebook function doesn't overwrite or modify your existing file.
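To make the file layout more concrete, the following sketch models a minimal msticpyconfig.yaml as a Python dictionary and reports which top-level sections are still missing. The section and key names follow the layout in the MSTICPy Package Configuration documentation as understood here; treat them as assumptions and check them against your own generated file. All values are placeholders, and `missing_sections` is a hypothetical helper, not an MSTICPy function.

```python
# Placeholder model of a minimal msticpyconfig.yaml (values are not real).
sample_config = {
    "AzureSentinel": {
        "Workspaces": {
            "Default": {
                "WorkspaceId": "<workspace-id>",
                "TenantId": "<tenant-id>",
            },
        },
    },
    "TIProviders": {
        "VirusTotal": {
            "Args": {"AuthKey": "<virustotal-api-key>"},
            "Primary": True,
            "Provider": "VT",
        },
    },
}

# Sections that this article configures (assumed section names).
EXPECTED_SECTIONS = ("AzureSentinel", "TIProviders", "OtherProviders")


def missing_sections(config, expected=EXPECTED_SECTIONS):
    """Return the expected top-level sections that are absent or empty."""
    return [name for name in expected if not config.get(name)]


# The GeoIP provider (assumed to live under OtherProviders) isn't set yet.
print(missing_sections(sample_config))
```

A check like this is only a quick orientation aid; the Validate settings button in the settings editor remains the supported way to verify the file.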

At any point, select the Help drop-down menu in the MSTICPy configuration tool for more instructions and links to detailed documentation.

Display the MSTICPy settings editor

  1. In a code cell, run the following code to import the MpConfigEdit tool and display a settings editor for your msticpyconfig.yaml file:

    from msticpy.config import MpConfigEdit
    mpedit = MpConfigEdit("msticpyconfig.yaml")
    display(mpedit)

    For example:

    Screenshot of the MSTICPy settings editor.

    The automatically created msticpyconfig.yaml file, shown in the settings editor, contains two entries in the Microsoft Sentinel section. These are both populated with details of the Microsoft Sentinel workspace that the notebook was cloned from. One entry has the name of your workspace and the other is named Default.

    MSTICPy allows you to store configurations for multiple Microsoft Sentinel workspaces and switch between them. The Default entry allows you to authenticate to your "home" workspace by default, without having to name it explicitly. If you add other workspaces, you can configure any one of them to be the Default entry.

    In the Azure Machine Learning environment, the settings editor might take 10-20 seconds to appear.

  2. Verify your current settings and select Save Settings.
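The relationship between a named workspace entry and the Default entry can be illustrated with a small lookup. This is a sketch of the idea only, not how MSTICPy resolves workspaces internally; the workspace name, the placeholder IDs, and the `resolve_workspace` helper are all hypothetical.

```python
# Hypothetical workspace entries, mirroring the two entries described above.
workspaces = {
    "MyWorkspace": {"WorkspaceId": "<workspace-id>", "TenantId": "<tenant-id>"},
    "Default": {"WorkspaceId": "<workspace-id>", "TenantId": "<tenant-id>"},
}


def resolve_workspace(entries, name=None):
    """Return the named entry, or the Default entry when no name is given."""
    key = name if name is not None else "Default"
    if key not in entries:
        raise KeyError(f"No workspace entry named {key!r}")
    return entries[key]


# With no name, the "home" workspace stored under Default is used.
home = resolve_workspace(workspaces)
print(home == resolve_workspace(workspaces, "MyWorkspace"))
```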

Add threat intelligence provider settings

This procedure describes how to store your VirusTotal API key in the msticpyconfig.yaml file. You can opt to upload the API key to Azure Key Vault, but you must configure the Key Vault settings first. For more information, see Configure Key Vault settings.

To add VirusTotal details in the MSTICPy settings editor, complete the following steps.

  1. Enter the following code in a code cell and run:

    mpedit.set_tab("TI Providers")
  2. In the TI Providers tab, select Add prov > VirusTotal > Add.

  3. Under Auth Key, select Text next to the Storage option.

  4. In the Value field, paste your API key.

  5. Select Update, and then select Save Settings at the bottom of the settings editor.

For more information about other supported threat intelligence providers, see Threat intelligence providers in the MSTICPy documentation and Threat intelligence integration in Microsoft Sentinel.

Add GeoIP provider settings

This procedure describes how to store a MaxMind GeoLite2 account key in the msticpyconfig.yaml file, which allows your notebook to use geolocation lookup services for IP addresses.

To add GeoIP provider settings in the MSTICPy settings editor, complete the following steps.

  1. Enter the following code in an empty code cell and run:

    mpedit.set_tab("GeoIP Providers")
  2. In the GeoIP Providers tab, select Add prov > GeoIPLite > Add.

  3. In the Value field, enter your MaxMind account key.

  4. If needed, update the default ~/.msticpy folder for storing the downloaded GeoIP database.

    • On Windows, this folder is mapped to %USERPROFILE%/.msticpy.
    • On Linux or macOS, this path is mapped to the .msticpy folder in your home folder.
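To see where the default folder resolves on the current machine, you can expand the path with the standard library. This is a small sketch; it only computes the path and doesn't create the folder or download anything.

```python
from pathlib import Path

# Default GeoIP database folder named in the step above.
DEFAULT_DB_FOLDER = "~/.msticpy"

# expanduser() replaces "~" with the current user's home folder
# (%USERPROFILE% on Windows, $HOME on Linux or macOS).
db_folder = Path(DEFAULT_DB_FOLDER).expanduser()
print(db_folder)
```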

For more information about other supported geolocation lookup services, see the MSTICPy GeoIP Providers documentation.

Configure Azure Cloud settings

If your organization doesn't use the Azure public cloud, you must specify this in your settings to successfully authenticate and use data from Microsoft Sentinel and Azure. For more information, see Specify the Azure cloud and Azure authentication methods.

Validate settings

  1. Select Validate settings in the settings editor.

    Warning messages about missing configurations are expected, but you shouldn't have any for threat intelligence provider or GeoIP provider settings.

  2. Depending on your environment, you might also need to Configure Key Vault settings or Specify the Azure cloud.

  3. If you need to make any changes because of the validation, make those changes and then select Save Settings.

  4. When you're done, select the Close button to hide the validation output.

For more information, see Advanced configurations for Jupyter notebooks and MSTICPy in Microsoft Sentinel.

Load saved MSTICPy settings

In the Create your configuration file procedure, you saved your settings to your local msticpyconfig.yaml file.

However, MSTICPy doesn't automatically reload these settings until you restart the kernel or run another notebook. To force MSTICPy to reload from the new configuration file, proceed to the next code cell, with the following code, and run it:

import msticpy
msticpy.settings.refresh_config()

Test your notebook

Now that you initialized your environment and configured basic settings for your workspace, use the MSTICPy QueryProvider class to test the notebook. QueryProvider queries a data source, in this case, your Microsoft Sentinel workspace, and makes the queried data available to view and analyze in your notebook.

Use the following procedures to create an instance of the QueryProvider class, authenticate to Microsoft Sentinel from your notebook, and view and run queries with different parameter options.

You can have multiple instances of QueryProvider loaded for use with multiple Microsoft Sentinel workspaces or other data providers such as Microsoft Defender for Endpoint.

Load the QueryProvider

To load the QueryProvider for AzureSentinel, proceed to the cell with the following code and run it:

# Initialize a QueryProvider for Microsoft Sentinel
qry_prov = QueryProvider("AzureSentinel")

If you see the warning Runtime dependency of PyGObject is missing when loading the Microsoft Sentinel driver, see Error: Runtime dependency of PyGObject is missing. This warning doesn't impact notebook functionality.

Authenticate to your Microsoft Sentinel workspace from your notebook

In Azure Machine Learning notebooks, the authentication defaults to using the credentials you used to authenticate to the Azure Machine Learning workspace.

Authenticate by using managed identity by completing the following steps.

  1. Run the following code to authenticate to your Sentinel workspace.

    # Get the default Microsoft Sentinel workspace details from msticpyconfig.yaml
    ws_config = WorkspaceConfig()
    # Connect to Microsoft Sentinel with our QueryProvider and config details
    qry_prov.connect(ws_config)
  2. Review the output. The output displayed is similar to the following image.

    Screenshot that shows authentication to Azure that ends with a connected message.

Cache your sign-in token using Azure CLI

To avoid having to re-authenticate if you restart the kernel or run another notebook, you can cache your sign-in token using Azure CLI.

The Azure CLI component on the Compute instance caches a refresh token that it can reuse until the token times out. MSTICPy automatically uses Azure CLI credentials, if they're available.

To authenticate using Azure CLI, enter the following command into an empty cell and run it:

!az login

You need to re-authenticate if you restart your Compute instance or switch to a different instance. For more information, see the Caching credentials with Azure CLI section in the Microsoft Sentinel Notebooks GitHub repository wiki.

View the Microsoft Sentinel workspace data schema and built-in MSTICPy queries

After you're connected to a Microsoft Sentinel QueryProvider, you can understand the types of data available to query by querying the Microsoft Sentinel workspace data schema.

The Microsoft Sentinel QueryProvider has a schema_tables property, which gives you a list of schema tables, and a schema property, which also includes the column names and data types for each table.

To view the first 10 tables in the Microsoft Sentinel schema:

Proceed to the next cell, with the following code, and run it. You can omit the [:10] to list all tables in your workspace.

# Get list of tables in the Workspace with the 'schema_tables' property
qry_prov.schema_tables[:10]  # Output only a sample of tables for brevity
                             # Remove the "[:10]" to see the whole list

The following output appears:

Sample of first 10 tables in the schema
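The schema property described above maps each table name to its columns and data types, so ordinary dictionary operations are enough to explore it. The following sketch uses a small hypothetical sample in that shape instead of a live workspace; the table contents and the `tables_with_column` helper are illustrative assumptions. In a connected notebook, you would pass `qry_prov.schema` instead.

```python
# Hypothetical sample shaped like the schema property described above:
# table name -> {column name: data type}.
sample_schema = {
    "SecurityAlert": {"TimeGenerated": "datetime", "AlertName": "string"},
    "SigninLogs": {"TimeGenerated": "datetime", "IPAddress": "string"},
    "Heartbeat": {"TimeGenerated": "datetime", "Computer": "string"},
}


def tables_with_column(schema, column):
    """Return the sorted names of tables that contain the given column."""
    return sorted(table for table, cols in schema.items() if column in cols)


print(sorted(sample_schema)[:2])  # a sample of table names, like [:10] above
print(tables_with_column(sample_schema, "IPAddress"))
```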

MSTICPy also includes many built-in queries available for you to run. List available queries with .list_queries(), and get specific details about a query by calling it with a question mark (?) included as a parameter. Alternatively you can view the list of queries and associated help in the query browser.

To view a sample of available queries:

  1. Proceed to the next cell, with the following code, and run it. You can omit the [::5] to list all queries.

    # Get a sample of available queries
    print(qry_prov.list_queries()[::5])  # showing a sample - remove "[::5]" for whole list
  2. Review the output.

    Sample of queries
    ['Azure.get_vmcomputer_for_host', 'Azure.list_azure_activity_for_account', 'AzureNetwork.az_net_analytics', 'AzureNetwork.get_heartbeat_for_ip', 'AzureSentinel.get_bookmark_by_id', 'Heartbeat.get_heartbeat_for_host', 'LinuxSyslog.all_syslog', 'LinuxSyslog.list_logon_failures', 'LinuxSyslog.sudo_activity', 'MultiDataSource.get_timeseries_decompose', 'Network.get_host_for_ip', 'Office365.list_activity_for_ip', 'SecurityAlert.list_alerts_for_ip', 'ThreatIntelligence.list_indicators_by_filepath', 'WindowsSecurity.get_parent_process', 'WindowsSecurity.list_host_events', 'WindowsSecurity.list_hosts_matching_commandline', 'WindowsSecurity.list_other_events']
  3. To get help about a query, pass ? as a parameter:

    # Get help about a query by passing "?" as a parameter
    qry_prov.Azure.list_all_signins_geo("?")
  4. Review the output.

    Help for 'list_all_signins_geo' query
    Query:  list_all_signins_geo
    Data source:  AzureSentinel
    Gets Signin data used by morph charts
    add_query_items: str (optional)
        Additional query clauses
    end: datetime (optional)
        Query end time
    start: datetime (optional)
        Query start time
        (default value is: -5)
    table: str (optional)
        Table name
        (default value is: SigninLogs)
         {table} | where TimeGenerated >= datetime({start}) | where TimeGenerated <= datetime({end}) | extend Result = iif(ResultType==0, "Sucess", "Failed") | extend Latitude = tostring(parse_json(tostring(LocationDetails.geoCoordinates)).latitude) | extend Longitude = tostring(parse_json(tostring(LocationDetails.geoCoordinates)).longitude)
  5. To view both tables and queries in a scrollable, filterable list, proceed to the next cell, with the following code, and run it:

    qry_prov.browse_queries()

  6. For the selected query, all required and optional parameters are displayed, together with the full text of the query. For example:

    Screenshot of tables and queries displayed in a scrollable, filterable list.

While you can't run queries from the browser, you can copy and paste the example at the end of each query to run elsewhere in the notebook.
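Because list_queries() returns plain query names, you can narrow the list with ordinary Python. The following sketch works on a few names copied from the sample output above, so it runs without a workspace connection; `queries_in_family` is a hypothetical helper, not part of MSTICPy.

```python
# A few query names taken from the sample output above.
query_names = [
    "Azure.get_vmcomputer_for_host",
    "Azure.list_azure_activity_for_account",
    "LinuxSyslog.all_syslog",
    "LinuxSyslog.list_logon_failures",
    "SecurityAlert.list_alerts_for_ip",
    "WindowsSecurity.get_parent_process",
    "WindowsSecurity.list_host_events",
]


def queries_in_family(names, family):
    """Return query names whose prefix (the part before the dot) matches family."""
    return [name for name in names if name.split(".", 1)[0] == family]


print(query_names[::5])  # every fifth name, as in the [::5] slice above
print(queries_in_family(query_names, "LinuxSyslog"))
```

In a connected notebook, the same helper could be applied to the result of `qry_prov.list_queries()`.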

For more information, see Running a pre-defined query in the MSTICPy documentation.

Run queries with time parameters

Most queries require time parameters. Date/time strings are tedious to type in, and modifying them in multiple places can be error-prone.

Each query provider has default start and end time parameters for queries. These time parameters are used by default, whenever time parameters are called for. You can change the default time range by opening the query_time control. The changes remain in effect until you change them again.

  1. Proceed to the next cell, with the following code, and run it:

    # Open the query time control for your query provider
    qry_prov.query_time
  2. Set the start and end times as needed. For example:

    Screenshot of setting default time parameters for queries.

Run a query using the built-in time range

Query results return as a Pandas DataFrame, which is a tabular data structure, like a spreadsheet or database table. Use pandas functions to perform extra filtering and analysis on the query results.

  1. Run the following code cell. It runs a query using the query provider default time settings. You can change this range, and run the code cell again to query for the new time range.

    # The time parameters are taken from the qry_prov time settings
    # but you can override this by supplying explicit "start" and "end" datetimes
    signins_df = qry_prov.Azure.list_all_signins_geo()
    # display first 5 rows of any results
    # If there is no data, just the column headings display
    signins_df.head()
  2. Review the output. It displays the first five rows of results. For example:

    Screenshot of a query run with the built-in time range.

    If there's no data, only the column headings display.

Run a query using a custom time range

You can also create a new query time object and pass it to a query as a parameter. That allows you to run a one-off query for a different time range, without affecting the query provider defaults.

# Create and display a QueryTime control.
time_range = nbwidgets.QueryTime()
time_range

After you set the desired time range, you can pass the time range to the query function, running the following code in a separate cell from the previous code:

signins_df = qry_prov.Azure.list_all_signins_geo(time_range)

You can also pass datetime values as Python datetimes or date-time strings using the start and end parameters:

from datetime import datetime, timedelta
q_end = datetime.utcnow()
q_start = q_end - timedelta(5)
signins_df = qry_prov.Azure.list_all_signins_geo(start=q_start, end=q_end)
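If you build start and end values in several places, a small helper keeps the arithmetic in one spot. The following is a minimal sketch using only the standard library; `last_n_days` is a hypothetical helper, and the fixed date is an arbitrary example chosen to make the output reproducible.

```python
from datetime import datetime, timedelta, timezone


def last_n_days(days, now=None):
    """Return (start, end) datetimes covering the last `days` days, in UTC."""
    end = now if now is not None else datetime.now(timezone.utc)
    start = end - timedelta(days=days)
    return start, end


# A fixed "now" keeps the example reproducible.
fixed_now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
q_start, q_end = last_n_days(5, now=fixed_now)
print(q_start.isoformat(), q_end.isoformat())
```

The resulting values could then be passed to a query through the start and end parameters described above.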

Customize your queries

You can customize the built-in queries by adding more query logic, or run complete queries using the exec_query function.

For example, most built-in queries support the add_query_items parameter, which you can use to append filters or other operations to the queries.

  1. Run the following code cell to add a data frame that summarizes the number of alerts by alert name:

    from datetime import datetime, timedelta
    qry_prov.SecurityAlert.list_alerts(
        start=datetime.utcnow() - timedelta(28),
        end=datetime.utcnow(),
        add_query_items="| summarize NumAlerts=count() by AlertName"
    )
  2. Pass a full Kusto Query Language (KQL) query string to the query provider. The query runs against the connected workspace, and the data returns as a pandas DataFrame. Run:

    # Define your query
    test_query = """
    OfficeActivity
    | where TimeGenerated > ago(1d)
    | take 10
    """
    # Pass the query to your QueryProvider
    office_events_df = qry_prov.exec_query(test_query)
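The query help output earlier in this article shows query text written as a template with placeholders such as {table}, {start}, and {end}. The following sketch illustrates, in a simplified way, how a clause supplied through add_query_items could end up appended to such a template. It isn't MSTICPy's implementation; the template text, the date values, and the `build_query` helper are assumptions made for illustration.

```python
# Simplified, hypothetical query template with the same style of placeholders
# shown in the query help output earlier in this article.
TEMPLATE = (
    "{table} "
    "| where TimeGenerated >= datetime({start}) "
    "| where TimeGenerated <= datetime({end}) "
    "{add_query_items}"
)


def build_query(table, start, end, add_query_items=""):
    """Fill the template and return the resulting KQL text."""
    return TEMPLATE.format(
        table=table, start=start, end=end, add_query_items=add_query_items
    ).strip()


kql = build_query(
    table="SecurityAlert",
    start="2024-01-01",
    end="2024-01-29",
    add_query_items="| summarize NumAlerts=count() by AlertName",
)
print(kql)
```

Seeing the final text this way can help when deciding whether an extra clause belongs in add_query_items or whether a full query passed to exec_query is the better fit.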

For more information, see the MSTICPy documentation.

Test VirusTotal

  1. To use threat intelligence to see if an IP address appears in VirusTotal data, run the cell with the following code:

    # Create your TI provider – note you can re-use the TILookup provider ('ti') for
    # subsequent queries - you don't have to create it for each query
    ti = TILookup()
    # Look up an IP address
    ti_resp = ti.lookup_ioc("")
    ti_df = ti.result_to_df(ti_resp)
    ti.browse_results(ti_df, severities="all")
  2. Review the output. For example:

    Screenshot of an IP address appearing in VirusTotal data.

  3. Scroll down to view full results.

For more information, see Threat Intel Lookups in MSTICPy.
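Private or malformed addresses generally don't return useful threat intelligence results, so you might want to screen values before looking them up. The following optional pre-check is a sketch that uses only the standard library; it isn't part of the Getting Started Guide notebook, and `is_public_ip` is a hypothetical helper.

```python
import ipaddress


def is_public_ip(value):
    """Return True if value parses as a globally routable IP address."""
    try:
        return ipaddress.ip_address(value).is_global
    except ValueError:
        # Not a valid IPv4 or IPv6 address.
        return False


for candidate in ["8.8.8.8", "10.0.0.5", "not-an-ip"]:
    print(candidate, is_public_ip(candidate))
```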

Test geolocation IP lookup

  1. To get geolocation details for an IP address using the MaxMind service, run the cell with the following code:

    # create an instance of the GeoLiteLookup provider – this
    # can be re-used for subsequent queries.
    geo_ip = GeoLiteLookup()
    raw_res, ip_entity = geo_ip.lookup_ip("")
    display(ip_entity[0])
  2. Review the output. For example:

    { 'AdditionalData': {},
      'Address': '',
      'Location': { 'AdditionalData': {},
                    'CountryCode': 'DE',
                    'CountryName': 'Germany',
                    'Latitude': 51.2993,
                    'Longitude': 9.491,
                    'Type': 'geolocation',
                    'edges': set()},
      'ThreatIntelligence': [],
      'Type': 'ipaddress',
      'edges': set()}

The first time you run this code, you should see the GeoLite driver downloading its database.
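To show how the fields in the sample output relate to each other, the following sketch pulls a one-line summary out of a trimmed copy of that output. It uses a plain dictionary for illustration, whereas the notebook returns an MSTICPy entity object, so treat the access pattern as an assumption; `describe_location` is a hypothetical helper.

```python
# Trimmed copy of the sample output above (the address value is left blank,
# as in the sample).
ip_entity = {
    "Address": "",
    "Location": {
        "CountryCode": "DE",
        "CountryName": "Germany",
        "Latitude": 51.2993,
        "Longitude": 9.491,
    },
    "Type": "ipaddress",
}


def describe_location(entity):
    """Return a one-line summary of the entity's location, or 'unknown'."""
    loc = entity.get("Location")
    if not loc:
        return "unknown"
    return (
        f"{loc['CountryName']} ({loc['CountryCode']}): "
        f"{loc['Latitude']}, {loc['Longitude']}"
    )


print(describe_location(ip_entity))
```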

For more information, see MSTICPy GeoIP Providers.

Configure Key Vault settings

This section is relevant only when storing secrets in Azure Key Vault.

When you store secrets in Azure Key Vault, you need to create the Key Vault first in the Azure global KeyVault management portal.

Required settings are all values that you get from the Vault properties, although some might have different names. For example:

  • VaultName is shown at the top left of the Azure Key Vault Properties screen.
  • TenantId is shown as Directory ID
  • AzureRegion is shown as Location
  • Authority is the cloud for your Azure service.

Only VaultName, TenantId, and Authority values are required to retrieve secrets from the Vault. The other values are needed if you opt to create a vault from MSTICPy. For more information, see Specifying secrets as Key Vault secrets.

The Use KeyRing option is selected by default, and lets you cache Key Vault credentials in a local KeyRing. For more information, see KeyRing documentation.


Do not use the Use KeyRing option if you do not fully trust the host Compute that the notebook is running on.

In our case, the compute is the Jupyter hub server, where the notebook kernel is running, and not necessarily the machine that your browser is running on. If you are using Azure ML, the compute will be the Azure ML Compute instance you have selected. Keyring does its caching on the host where the notebook kernel is running.

To add Key Vault settings in the MSTICPy settings editor, complete the following steps.

  1. Proceed to the next cell, with the following code, and run it:

    mpedit.set_tab("Key Vault")
  2. Enter the Vault details for your Key Vault. For example:

    Screenshot of the Key Vault Setup section

  3. Select Save and then Save Settings.

Test Key Vault

To test your key vault, check to see if you can connect and view your secrets. If you didn't add a secret, you don't see any details. If you need to, add a test secret from the Azure Key Vault portal to the vault, and check that it shows in Microsoft Sentinel.

For example:

mpconfig = MpConfigFile()
mpconfig.show_kv_secrets()


Do not leave the output displayed in your saved notebook. If there are real secrets in the output, use the notebook's Clear output command before saving the notebook.

Also, delete cached copies of the notebook. For example, look in the .ipynb_checkpoints sub-folder of your notebook directory, and delete any copies of this notebook found. Saving the notebook with a cleared output should overwrite the checkpoint copy.

After you have Key Vault configured, you can use the Upload to KV button in the Data Providers and TI Providers sections to move the selected setting to the Vault. MSTICPy generates a default name for the secret based on the path of the setting, such as TIProviders-VirusTotal-Args-AuthKey.

If the value is successfully uploaded, the contents of the Value field in the settings editor are deleted and the underlying setting is replaced with a placeholder value. MSTICPy uses this value to indicate that it should automatically generate the Key Vault path when trying to retrieve the key.

If you already have the required secrets stored in a Key Vault, you can enter the secret name in the Value field. If the secret isn't stored in your default Vault (the values specified in the Key Vault section), you can specify a path of VaultName/SecretName.
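The naming rules described above can be sketched in a few lines. This is an illustration of the conventions only, not MSTICPy's internal code: the dotted form of the settings path, the vault and secret names, and both helper functions are assumptions.

```python
def default_secret_name(setting_path):
    """Turn a dotted settings path into a hyphen-separated secret name."""
    return "-".join(part for part in setting_path.split(".") if part)


def split_secret_reference(value, default_vault):
    """Split 'VaultName/SecretName' into (vault, secret).

    A bare secret name is assumed to live in the default vault.
    """
    vault, sep, secret = value.partition("/")
    if not sep:
        return default_vault, value
    return vault, secret


print(default_secret_name("TIProviders.VirusTotal.Args.AuthKey"))
print(split_secret_reference("OtherVault/MySecret", default_vault="MyVault"))
```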

Fetching settings from a Vault in a different tenant isn't currently supported. For more information, see Specifying secrets as Key Vault secrets.

Specify the Azure cloud and Azure authentication methods

If you're using a sovereign or government Azure cloud, rather than the public or global Azure cloud, you must select the appropriate cloud in your settings. For most organizations, the global cloud is the default.

You can also use these Azure settings to define default preferences for the Azure authentication type.

To specify Azure cloud and Azure authentication methods, complete the following steps.

  1. Proceed to the next cell, with the following code, and run it:

    mpedit.set_tab("Azure")

  2. Select the cloud used by your organization, or leave the default selected global option.

  3. Select one or more of the following methods:

    • env to store your Azure Credentials in environment variables.
    • msi to use Managed Service Identity, which is an identity assigned to the host or virtual machine where the Jupyter hub is running. MSI isn't currently supported in Azure Machine Learning Compute instances.
    • cli to use credentials from an authenticated Azure CLI session.
    • interactive to use the interactive device authorization flow using a one-time device code.

    In most cases, we recommend selecting multiple methods, such as both cli and interactive. Azure authentication tries each of the configured methods in the order listed until one succeeds.

  4. Select Save and then Save Settings.

    For example:

    Screenshot of settings defined for the Azure Government cloud.
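The "try each configured method in order until one succeeds" behavior described in step 3 can be pictured as a simple fallback loop. The following sketch shows that pattern with stand-in functions; it doesn't call Azure, and the function names and return values are hypothetical rather than anything from MSTICPy or the Azure SDK.

```python
def authenticate(methods, providers):
    """Try each configured method in order; return the first that succeeds."""
    errors = {}
    for name in methods:
        try:
            return name, providers[name]()
        except Exception as err:  # real code would catch narrower errors
            errors[name] = str(err)
    raise RuntimeError(f"All authentication methods failed: {errors}")


def cli_login():
    # Stand-in for "no cached Azure CLI session available".
    raise RuntimeError("no Azure CLI session")


def interactive_login():
    # Stand-in for a successful device-code sign-in.
    return "fake-credential"


providers = {"cli": cli_login, "interactive": interactive_login}
used, credential = authenticate(["cli", "interactive"], providers)
print(used)
```

This is why listing cli before interactive is convenient: a cached Azure CLI session is used silently when it exists, and the interactive prompt appears only as a fallback.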

Next steps

This article described the basics of using MSTICPy with Jupyter notebooks in Microsoft Sentinel. For more information, see Advanced configurations for Jupyter notebooks and MSTICPy in Microsoft Sentinel.

You can also try out other notebooks stored in the Microsoft Sentinel Notebooks GitHub repository.

If you use the notebook described in this article in another Jupyter environment, you can use any kernel that supports Python 3.6 or later.

To use MSTICPy notebooks outside of Microsoft Sentinel and Azure Machine Learning (ML), you also need to configure your Python environment. Install Python 3.6 or later with the Anaconda distribution, which includes many of the required packages.

More reading on MSTICPy and notebooks

The following lists provide more references for learning about MSTICPy, Microsoft Sentinel, and Jupyter notebooks.

MSTICPy

  • MSTICPy Package Configuration
  • MSTICPy Settings Editor
  • Configuring Your Notebook Environment
  • MPSettingsEditor notebook

Note: The Azure-Sentinel-Notebooks GitHub repository also contains a template msticpyconfig.yaml file with commented-out sections, which might help you understand the settings.

Microsoft Sentinel and Jupyter notebooks

  • Create your first Microsoft Sentinel notebook (Blog series)
  • Jupyter Notebooks: An Introduction
  • MSTICPy documentation
  • Microsoft Sentinel Notebooks documentation
  • The Infosec Jupyterbook
  • Linux Host Explorer Notebook walkthrough
  • Why use Jupyter for Security Investigations
  • Security Investigations with Microsoft Sentinel & Notebooks
  • Pandas Documentation
  • Bokeh Documentation