Create a parameterized notebook by using Papermill
In Azure Data Studio, parameterization means running the same notebook with different sets of parameter values.
This article shows you how to create and run a parameterized notebook in Azure Data Studio by using the Python kernel.
Note
Currently, you can use parameterization with Python, PySpark, PowerShell, and .NET Interactive kernels.
Prerequisites
Install and set up Papermill in Azure Data Studio
All the steps in this section run inside an Azure Data Studio notebook.
Create a new notebook. Change Kernel to Python 3:
If you're prompted to upgrade your Python packages, select Yes:
Install Papermill:
import sys
!{sys.executable} -m pip install papermill --no-cache-dir --upgrade
Verify that Papermill is installed:
import sys
!{sys.executable} -m pip list
To confirm that Papermill installed correctly, check its version:
import papermill
papermill.__version__
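The same check can be done without importing the package. The sketch below uses the standard-library importlib.metadata module to look up the installed version; the helper name installed_version is hypothetical and is not part of Papermill.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("papermill")
if version is None:
    print("Papermill is not installed in this Python environment.")
else:
    print(f"Papermill version: {version}")
```

If the helper returns None, rerun the installation cell and make sure the notebook uses the same Python environment.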
Parameterization example
You can use an example notebook file to go through the steps in this article:
- Go to the notebook file in GitHub. Select Raw.
- Select Ctrl+S or right-click, and then save the file with the .ipynb extension.
- Open the file in Azure Data Studio.
Set up a parameterized notebook
You can begin with the example notebook open in Azure Data Studio or complete the following steps to create a notebook. Then, try using different parameters. All the steps run inside an Azure Data Studio notebook.
Verify that Kernel is set to Python 3:
Add a new code cell, and then select Parameters to tag it as a parameters cell. The cell holds the default parameter values:
x = 2.0
y = 5.0
Add other cells to test different parameters:
addition = x + y
multiply = x * y
print("Addition: " + str(addition))
print("Multiplication: " + str(multiply))
After all cells are run, the output will look similar to this example:
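For reference, the parameters cell and the cells after it behave like the plain-Python sketch below. The function name run_cells is hypothetical; it only illustrates that x and y act as default values that can be overridden.

```python
from typing import List


def run_cells(x: float = 2.0, y: float = 5.0) -> List[str]:
    """Mimic the notebook: the defaults play the role of the parameters cell."""
    addition = x + y
    multiply = x * y
    return [
        "Addition: " + str(addition),
        "Multiplication: " + str(multiply),
    ]


for line in run_cells():
    print(line)
# Addition: 7.0
# Multiplication: 10.0
```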
Save the notebook as Input.ipynb:
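If you'd rather script this step, the sketch below writes an equivalent notebook file by using only the standard library. It assumes the nbformat 4 JSON layout and that the parameters cell is identified by a cell tag named parameters. The helper names and the kernelspec values are illustrative and might need adjusting for your environment.

```python
import json
from pathlib import Path


def code_cell(source: str, tags=None) -> dict:
    """Build a minimal nbformat 4 code cell, optionally with tags."""
    metadata = {"tags": list(tags)} if tags else {}
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": metadata,
        "outputs": [],
        "source": source,
    }


def build_input_notebook() -> dict:
    """Assemble the example notebook with a tagged parameters cell."""
    return {
        "nbformat": 4,
        "nbformat_minor": 2,
        "metadata": {
            # Assumed kernel name; change it if your kernel is registered differently.
            "kernelspec": {
                "name": "python3",
                "display_name": "Python 3",
                "language": "python",
            }
        },
        "cells": [
            code_cell("x = 2.0\ny = 5.0", tags=["parameters"]),
            code_cell("addition = x + y\nmultiply = x * y"),
            code_cell(
                'print("Addition: " + str(addition))\n'
                'print("Multiplication: " + str(multiply))'
            ),
        ],
    }


notebook_path = Path("Input.ipynb")
notebook_path.write_text(json.dumps(build_input_notebook(), indent=1), encoding="utf-8")
```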
Execute a Papermill notebook
You can execute Papermill in two ways:
- Command-line interface (CLI)
- Python API
Parameterized CLI execution
To execute a notebook from the CLI, enter the papermill command in a terminal, followed by the input notebook, the location for the output notebook, and any options.
Note
To learn more, see the Papermill CLI documentation.
Execute the input notebook with new parameters:
papermill Input.ipynb Output.ipynb -p x 10 -p y 20
This command executes the input notebook with new values for parameters x and y.
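To launch the same command from a script, one option is to assemble the argument list in Python and pass it to subprocess. This sketch assumes the papermill executable is on your PATH; build_papermill_command is a hypothetical helper, not part of Papermill.

```python
import subprocess
from typing import Dict, List


def build_papermill_command(
    input_path: str, output_path: str, parameters: Dict[str, object]
) -> List[str]:
    """Return the argument list for a papermill CLI call, one -p pair per parameter."""
    command = ["papermill", input_path, output_path]
    for name, value in parameters.items():
        command += ["-p", name, str(value)]
    return command


command = build_papermill_command("Input.ipynb", "Output.ipynb", {"x": 10, "y": 20})
print(" ".join(command))
# papermill Input.ipynb Output.ipynb -p x 10 -p y 20

# Uncomment to run the command; check=True raises CalledProcessError on failure.
# subprocess.run(command, check=True)
```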
A new cell labeled # Injected-Parameters contains the new parameter values that were passed in via the CLI. The new # Injected-Parameters values are used for the new output that's shown in the last cell:
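The injected cell can be pictured with a small simulation. The sketch below is a simplified illustration, not Papermill's actual implementation: it assumes a notebook represented as a plain dict, finds the cell tagged parameters, and inserts a new cell with the overriding values right after it. The function name and the injected-parameters tag are assumptions made for this example.

```python
import copy
from typing import Any, Dict


def inject_parameters(notebook: Dict[str, Any], parameters: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `notebook` with an injected cell after the parameters cell."""
    result = copy.deepcopy(notebook)
    lines = ["# Injected-Parameters"]
    lines += [f"{name} = {value!r}" for name, value in parameters.items()]
    injected = {
        "cell_type": "code",
        "metadata": {"tags": ["injected-parameters"]},
        "source": "\n".join(lines),
    }
    # Place the new cell right after the tagged cell so its values override the defaults.
    for index, cell in enumerate(result["cells"]):
        if "parameters" in cell.get("metadata", {}).get("tags", []):
            result["cells"].insert(index + 1, injected)
            break
    else:
        # No tagged cell found: put the injected cell first so later cells can use it.
        result["cells"].insert(0, injected)
    return result


notebook = {
    "cells": [
        {"cell_type": "code", "metadata": {"tags": ["parameters"]}, "source": "x = 2.0\ny = 5.0"},
        {"cell_type": "code", "metadata": {}, "source": "addition = x + y"},
    ]
}
updated = inject_parameters(notebook, {"x": 10, "y": 20})
print(updated["cells"][1]["source"])
```

Because the injected cell runs after the defaults, the later cells see x = 10 and y = 20.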
Parameterized Python API execution
Note
To learn more, see the Papermill Python documentation.
Create a new notebook. Change Kernel to Python 3:
Add a new code cell. Then, use the Papermill Python API to execute and generate the output parameterized notebook:
import papermill as pm

pm.execute_notebook(
    '/Users/vasubhog/GitProjects/AzureDataStudio-Notebooks/Demo_Parameterization/Input.ipynb',
    '/Users/vasubhog/GitProjects/AzureDataStudio-Notebooks/Demo_Parameterization/Output.ipynb',
    parameters = dict(x = 10, y = 20)
)
A new cell labeled # Injected-Parameters contains the new parameter values that were passed in. The new # Injected-Parameters values are used for the new output that's shown in the last cell:
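Because parameters are passed as a dictionary, the Python API lends itself to running one notebook over several parameter sets. The sketch below assumes Papermill is installed and that Input.ipynb is in the working directory. The names plan_runs and run_all and the output naming scheme are hypothetical choices for illustration.

```python
from typing import Any, Dict, List, Tuple


def plan_runs(parameter_sets: List[Dict[str, Any]]) -> List[Tuple[str, Dict[str, Any]]]:
    """Pair each parameter set with a distinct output notebook name."""
    return [
        (f"Output_{index}.ipynb", params)
        for index, params in enumerate(parameter_sets, start=1)
    ]


def run_all(input_path: str, parameter_sets: List[Dict[str, Any]]) -> None:
    """Execute the input notebook once per parameter set."""
    import papermill as pm  # imported here so planning works without Papermill

    for output_path, params in plan_runs(parameter_sets):
        pm.execute_notebook(input_path, output_path, parameters=params)


runs = plan_runs([{"x": 10, "y": 20}, {"x": 1.5, "y": 3}])
for output_path, params in runs:
    print(output_path, params)

# Uncomment to execute the notebooks:
# run_all("Input.ipynb", [{"x": 10, "y": 20}, {"x": 1.5, "y": 3}])
```

Writing each run to its own output file keeps the input notebook unchanged and leaves one executed notebook per parameter set.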
Next steps
Learn more about notebooks and parameterization: