Secure code best practices with Azure Machine Learning

In Azure Machine Learning, you can upload files and content from any source into Azure. Content within Jupyter notebooks or scripts that you load can potentially read data from your sessions, access data within your organization in Azure, or run malicious processes on your behalf.

Important

Only run notebooks or scripts from trusted sources, for example, ones that you or your security team have reviewed.

Potential threats

Development with Azure Machine Learning often involves web-based development environments (notebooks and Azure Machine Learning studio). When you use these environments, the potential threats are:

  • Cross-site scripting (XSS)

    • DOM injection: This type of attack can modify the UI displayed in the browser, for example by changing how the run button behaves in a Jupyter Notebook.
    • Access token/cookies: XSS attacks can also access local storage and browser cookies. Your Azure Active Directory (AAD) authentication token is stored in local storage. An XSS attack could use this token to make API calls on your behalf, and then send the data to an external system or API.
  • Cross-site request forgery (CSRF): This attack may replace the URL of an image or link with the URL of a malicious script or API. When the image is loaded, or the link is clicked, a call is made to the URL.

Azure ML studio notebooks

Azure Machine Learning studio provides a hosted notebook experience in your browser. Cells in a notebook can output HTML documents or fragments that contain malicious code. When the output is rendered, the code can be executed.
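
As an illustration of how cell output can carry markup, the following minimal Python sketch defines a hypothetical `HtmlOutput` class. It relies on the `_repr_html_` method that IPython's rich display looks for when it renders the result of a cell. The class name and the harmless payload are assumptions made for this example and aren't part of Azure Machine Learning.

```python
class HtmlOutput:
    """Hypothetical object whose rich display is an HTML fragment."""

    def __init__(self, fragment: str) -> None:
        self.fragment = fragment

    def _repr_html_(self) -> str:
        # IPython's display machinery calls this method, when present,
        # to obtain the text/html representation of an object.
        return self.fragment


# A harmless stand-in for a hostile payload: the script only logs a message.
payload = HtmlOutput(
    '<b>Training done</b><script>console.log("ran in output")</script>'
)

rendered = payload._repr_html_()
# The fragment handed to the front end still contains the script element.
print("<script" in rendered.lower())
```

If `payload` were the last expression in a notebook cell, the front end would receive this fragment as the cell's HTML output, which is why untrusted notebooks need review before they're run.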

Possible threats:

  • Cross-site scripting (XSS)
  • Cross-site request forgery (CSRF)

Mitigations provided by Azure Machine Learning:

  • Code cell output is sandboxed in an iframe. The iframe prevents the script from accessing the parent DOM, cookies, or session storage.
  • Markdown cell contents are cleaned using the dompurify library. This blocks malicious scripts from executing when markdown cells are rendered.
  • Image URLs and Markdown links are sent to a Microsoft-owned endpoint, which checks for malicious values. If a malicious value is detected, the endpoint rejects the request.

Recommended actions:

  • Verify that you trust the contents of files before uploading to studio. When uploading, you must acknowledge that you're uploading trusted files.
  • When selecting a link to open an external application, you'll be prompted to trust the application.
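
A manual review can be supported by a quick scan of the notebook file before it's uploaded. The sketch below is a heuristic under stated assumptions, not a replacement for review and not something Azure Machine Learning provides: the function name `find_suspicious_cells` and the pattern list are hypothetical, and the scan looks only at markdown sources and saved HTML outputs in a notebook stored in the standard `.ipynb` JSON layout.

```python
import json
import re

# Heuristic patterns only; a hypothetical starting list, not an exhaustive filter.
SUSPICIOUS = [
    re.compile(r"<\s*script", re.IGNORECASE),
    re.compile(r"<\s*iframe", re.IGNORECASE),
    re.compile(r"\bon\w+\s*=", re.IGNORECASE),  # inline handlers such as onerror=
    re.compile(r"javascript\s*:", re.IGNORECASE),
]


def _as_text(value):
    """Notebook JSON stores text either as one string or as a list of lines."""
    return "".join(value) if isinstance(value, list) else str(value)


def find_suspicious_cells(notebook_json: str):
    """Return (cell_index, location, pattern) tuples for cells worth a manual look."""
    nb = json.loads(notebook_json)
    findings = []
    for index, cell in enumerate(nb.get("cells", [])):
        texts = []
        if cell.get("cell_type") == "markdown":
            texts.append(("markdown", _as_text(cell.get("source", ""))))
        for output in cell.get("outputs", []):
            html = output.get("data", {}).get("text/html")
            if html is not None:
                texts.append(("html output", _as_text(html)))
        for location, text in texts:
            for pattern in SUSPICIOUS:
                if pattern.search(text):
                    findings.append((index, location, pattern.pattern))
    return findings


# Usage: a two-cell sample notebook built in memory for the example.
sample = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Notes\n", "Plain text only."]},
        {"cell_type": "code", "source": ["show()"], "outputs": [
            {"output_type": "display_data",
             "data": {"text/html": ['<img src=x onerror="alert(1)">']}},
        ]},
    ],
    "metadata": {}, "nbformat": 4, "nbformat_minor": 5,
})
for finding in find_suspicious_cells(sample):
    print(finding)
```

In this sample, only the second cell (index 1) is flagged, because its saved HTML output contains an inline event handler. A clean scan doesn't prove that a notebook is safe: the code cells themselves still need to be read before they're run.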

Azure ML compute instance

Azure Machine Learning compute instance hosts Jupyter and Jupyter Lab. When you use either, code in notebook cells can output HTML documents or fragments that contain malicious code. When the output is rendered, the code can be executed. The same threats also apply when using RStudio and Posit Workbench (formerly RStudio Workbench) hosted on a compute instance.

Possible threats:

  • Cross-site scripting (XSS)
  • Cross-site request forgery (CSRF)

Mitigations provided by Azure Machine Learning:

  • None. Jupyter and Jupyter Lab are open-source applications hosted on the Azure Machine Learning compute instance.

Recommended actions:

  • Verify that you trust the contents of files before uploading to studio. When uploading, you must acknowledge that you're uploading trusted files.
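
Because the compute instance adds no mitigation of its own, one optional precaution is to clear the saved outputs of a notebook you didn't author before opening it, so that stored HTML isn't rendered on load. This is an assumption of the example, not a documented Azure Machine Learning feature. The function name `strip_outputs` is hypothetical, and the sketch assumes the standard `.ipynb` JSON layout, in which each code cell carries `outputs` and `execution_count` fields.

```python
import json


def strip_outputs(notebook_json: str) -> str:
    """Return notebook JSON with saved code-cell outputs removed."""
    nb = json.loads(notebook_json)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop stored HTML, images, and text
            cell["execution_count"] = None  # serialized as null: not yet executed
    return json.dumps(nb, indent=1)


# Usage: a one-cell sample notebook built in memory for the example.
sample = json.dumps({
    "cells": [
        {"cell_type": "code", "execution_count": 3, "metadata": {},
         "source": ["report()"],
         "outputs": [{"output_type": "display_data", "metadata": {},
                      "data": {"text/html": ["<script>/* stored */</script>"]}}]},
    ],
    "metadata": {}, "nbformat": 4, "nbformat_minor": 5,
})
cleaned = json.loads(strip_outputs(sample))
print(cleaned["cells"][0]["outputs"])
```

Clearing outputs removes only what was saved in the file. Markdown cells are left untouched, and running a code cell can produce new HTML output, so the code still has to come from a trusted source.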

Report security issues or concerns

Azure Machine Learning is eligible under the Microsoft Azure Bounty Program. For more information, visit https://www.microsoft.com/msrc/bounty-microsoft-azure.

Next steps