Integrate prompt flow with LLM-based application DevOps
In this article, you learn about the integration of prompt flow with LLM-based application DevOps in Azure Machine Learning. Prompt flow offers a developer-friendly and easy-to-use code-first experience for flow developing and iterating with your entire LLM-based application development workflow.
It provides a prompt flow SDK and CLI, a VS Code extension, and a flow folder explorer UI to facilitate local flow development, local triggering of flow runs and evaluation runs, and transitioning flows from local to cloud (Azure Machine Learning workspace) environments.
This documentation focuses on how to effectively combine the capabilities of prompt flow code experience and DevOps to enhance your LLM-based application development workflows.
Introduction to the code-first experience in prompt flow
When developing applications using LLM, it's common to have a standardized application engineering process that includes code repositories and CI/CD pipelines. This integration allows for a streamlined development process, version control, and collaboration among team members.
For developers experienced in code development who seek a more efficient LLMOps iteration process, the prompt flow code experience offers the following key features and benefits:
- Flow versioning in code repository. You can define your flow in YAML format, which can stay aligned with the referenced source files in a folder structure.
- Integrate flow run with CI/CD pipeline. You can trigger flow runs using the prompt flow CLI or SDK, which can be seamlessly integrated into your CI/CD pipeline and delivery process.
- Smooth transition from local to cloud. You can easily export your flow folder to your local machine or code repository for version control, local development, and sharing. Similarly, the flow folder can be effortlessly imported back to the cloud for further authoring, testing, and deployment using cloud resources.
Accessing prompt flow code definition
Each prompt flow is associated with a flow folder that contains the essential files defining the flow in code. This folder structure organizes your flow, facilitating smoother transitions between local and cloud.
Azure Machine Learning offers a shared file system for all workspace users. Upon creating a flow, a corresponding flow folder is automatically generated and stored there, in the Users/<username>/promptflow directory.
Flow folder structure
Overview of the flow folder structure and the key files it contains:
- flow.dag.yaml: The primary flow definition file, in YAML format, includes information about the inputs, outputs, nodes, tools, and variants used in the flow. It's integral for authoring and defining the prompt flow.
- Source code files (.py, .jinja2): The flow folder also includes user-managed source code files, which are referred to by the tools/nodes in the flow.
- Files in Python (.py) format can be referenced by the python tool for defining custom python logic.
- Files in Jinja 2 (.jinja2) format can be referenced by the prompt tool or LLM tool for defining prompt context.
- Non-source files: The flow folder can also contain non-source files, such as utility files and data files, that are used by the source files.
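As an illustration, a minimal flow.dag.yaml for a two-node flow might look like the following. The node names and referenced source files here are hypothetical; the exact schema is defined by prompt flow, and LLM nodes typically also carry connection settings:

```yaml
$schema: https://azuremlschemas.azureedge.net/promptflow/latest/Flow.schema.json
inputs:
  url:
    type: string
outputs:
  category:
    type: string
    reference: ${classify_with_llm.output}
nodes:
- name: fetch_text_content_from_url   # python tool node
  type: python
  source:
    type: code
    path: fetch_text_content_from_url.py
  inputs:
    url: ${inputs.url}
- name: classify_with_llm             # LLM tool node, prompt in a .jinja2 file
  type: llm
  source:
    type: code
    path: classify_with_llm.jinja2
  inputs:
    url: ${inputs.url}
    text_content: ${fetch_text_content_from_url.output}
```

Note how the YAML references the .py and .jinja2 source files that live alongside it in the flow folder, which is what makes the folder self-contained and versionable.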
Once the flow is created, you can navigate to the flow authoring page to view and operate on the flow files in the file explorer on the right. This allows you to view, edit, and manage your files. Any modifications made to the files are directly reflected in the file share storage.
With "Raw file mode" switched on, you can view and edit the raw content of the files in the file editor, including the flow definition file flow.dag.yaml and the source files.
Alternatively, you can access all the flow folders directly within the Azure Machine Learning notebook.
Versioning prompt flow in code repository
To check in your flow into your code repository, you can easily export the flow folder from the flow authoring page to your local system. This downloads a package containing all the files from the explorer to your local machine, which you can then check into your code repository.
For more information about DevOps integration with Azure Machine Learning, see Git integration in Azure Machine Learning
Submitting runs to the cloud from local repository
Prerequisites
Complete the Create resources to get started if you don't already have an Azure Machine Learning workspace.
A Python environment in which you've installed Azure Machine Learning Python SDK v2 - install instructions. This environment is for defining and controlling your Azure Machine Learning resources and is separate from the environment used at runtime. To learn more, see how to manage runtime for prompt flow engineering.
Install prompt flow SDK
```shell
pip install -r ../../examples/requirements.txt
```
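The requirements file above comes from the prompt flow examples repository. If you're setting up a fresh environment outside that repository, a minimal equivalent requirements file might contain the following (package names as published on PyPI; the examples repo pins additional packages, so treat this as an assumption, not the exact contents):

```
# hypothetical minimal requirements; the examples repo includes more
promptflow[azure]
promptflow-tools
```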
Connect to Azure Machine Learning workspace
```shell
az login
```
Prepare the run.yml file to define the configuration for this flow run in the cloud:
```yaml
$schema: https://azuremlschemas.azureedge.net/promptflow/latest/Run.schema.json
flow: <path_to_flow>
data: <path_to_flow>/data.jsonl
column_mapping:
  url: ${data.url}

# define cloud resource
# if omitted, the automatic runtime is used. You can also specify a runtime name;
# specifying "automatic" also uses the automatic runtime.
runtime: <runtime_name>

# instance_type only works for the automatic runtime; it's ignored if you specify a runtime name.
# resources:
#   instance_type: <instance_type>

# override connections
connections:
  classify_with_llm:
    connection: <connection_name>
    deployment_name: <deployment_name>
  summarize_text_content:
    connection: <connection_name>
    deployment_name: <deployment_name>
```
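The `column_mapping` section binds flow inputs to columns of the data file using `${data.<column>}` references (and, for evaluation runs, `${run.outputs.<name>}` references to a base run). As a rough conceptual sketch of how such references resolve per data row (this is an illustration only, not prompt flow's actual resolver):

```python
import re

# Matches references such as ${data.url} or ${run.outputs.category}.
REF = re.compile(r"^\$\{(\w+)\.(.+)\}$")

def resolve_mapping(column_mapping, sources):
    """Resolve each ${source.field} reference against the given sources.

    column_mapping: {flow_input_name: reference_or_literal}
    sources: {"data": <row dict>, "run": <base run dict>, ...}
    """
    resolved = {}
    for flow_input, ref in column_mapping.items():
        m = REF.match(ref)
        if not m:
            resolved[flow_input] = ref  # literal value, passed through unchanged
            continue
        source_name, field = m.groups()
        value = sources[source_name]
        for part in field.split("."):  # supports nested paths like outputs.category
            value = value[part]
        resolved[flow_input] = value
    return resolved

row = {"url": "https://example.com", "answer": "News"}
print(resolve_mapping({"url": "${data.url}"}, {"data": row}))
# {'url': 'https://example.com'}
```

Each line of data.jsonl supplies one such row, so a batch run executes the flow once per row with the resolved inputs.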
You can specify the connection and deployment name for each tool in the flow. If you don't specify them, the connection and deployment defined in the flow.dag.yaml file are used. The format of connections:
```yaml
...
connections:
  <node_name>:
    connection: <connection_name>
    deployment_name: <deployment_name>
...
```
```shell
pfazure run create --file run.yml
```
Prepare the run_evaluation.yml file to define the configuration for this evaluation flow run in the cloud:
```yaml
$schema: https://azuremlschemas.azureedge.net/promptflow/latest/Run.schema.json
flow: <path_to_flow>
data: <path_to_flow>/data.jsonl
run: <id of web-classification flow run>
column_mapping:
  groundtruth: ${data.answer}
  prediction: ${run.outputs.category}

# define cloud resource
# if omitted, the automatic runtime is used. You can also specify a runtime name;
# specifying "automatic" also uses the automatic runtime.
runtime: <runtime_name>

# instance_type only works for the automatic runtime; it's ignored if you specify a runtime name.
# resources:
#   instance_type: <instance_type>

# override connections
connections:
  classify_with_llm:
    connection: <connection_name>
    deployment_name: <deployment_name>
  summarize_text_content:
    connection: <connection_name>
    deployment_name: <deployment_name>
```
```shell
pfazure run create --file run_evaluation.yml
```
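Conceptually, an evaluation run like this joins the base run's outputs with the ground-truth column from the data file and aggregates metrics over the pairs. As a rough sketch of the idea (this is illustrative only, not the actual built-in evaluation flow):

```python
def accuracy(pairs):
    """Compute accuracy over (groundtruth, prediction) pairs, one per data row."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    correct = sum(1 for groundtruth, prediction in pairs if groundtruth == prediction)
    return correct / len(pairs)

# Each pair corresponds to one row: ${data.answer} vs ${run.outputs.category}.
rows = [("News", "News"), ("App", "News"), ("Academic", "Academic")]
print(accuracy(rows))  # 2 of 3 correct
```

The real evaluation flow runs in the cloud and logs its aggregated metrics to the run, which is what the `show-metrics` command below retrieves.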
View run results in Azure Machine Learning workspace
Submitting a flow run to the cloud returns the portal URL of the run. You can open that URL to view the run results in the portal.
You can also use the following commands to view run results.
Stream the logs:
```shell
pfazure run stream --name <run_name>
```
View run outputs:
```shell
pfazure run show-details --name <run_name>
```
View metrics of the evaluation run:
```shell
pfazure run show-metrics --name <evaluation_run_name>
```
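In an automated pipeline, you might parse the metrics output and use it as a quality gate. A minimal sketch, assuming the metrics are available as JSON; the metric name `accuracy`, the sample payload, and the threshold here are hypothetical and depend on your evaluation flow:

```python
import json

def passes_gate(metrics_json, metric="accuracy", threshold=0.9):
    """Return True if the named metric in the JSON payload meets the threshold."""
    metrics = json.loads(metrics_json)
    return metrics.get(metric, 0.0) >= threshold

# Hypothetical sample payload; actual metric names come from your evaluation flow.
sample = '{"accuracy": 0.92, "gpt_relevance": 4.5}'
print(passes_gate(sample))  # True
```

A CI job could fail the build when `passes_gate` returns False, blocking a regression from being merged.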
Important
For more information, see the prompt flow CLI documentation for Azure.
Iterative development from fine-tuning
Local development and testing
During iterative development, as you refine and fine-tune your flow or prompts, it can be beneficial to carry out multiple iterations locally within your code repository. The community version, the prompt flow VS Code extension, and the prompt flow local SDK and CLI are provided to facilitate purely local development and testing, without any Azure binding.
Prompt flow VS Code extension
With the prompt flow VS Code extension installed, you can easily author your flow locally from the VS Code editor, providing a similar UI experience as in the cloud.
To use the extension:
- Open a prompt flow folder in VS Code Desktop.
- Open the `flow.dag.yaml` file in notebook view.
- Use the visual editor to make any necessary changes to your flow, such as tuning the prompts in variants or adding more tools.
- To test your flow, select the Run Flow button at the top of the visual editor. This triggers a flow test.
Prompt flow local SDK & CLI
If you prefer to use Jupyter, PyCharm, Visual Studio, or other IDEs, you can directly modify the YAML definition in the flow.dag.yaml file.
You can then trigger a flow single run for testing using either the prompt flow CLI or SDK.
Assuming you're in the working directory <path-to-the-sample-repo>/examples/flows/standard/:
```shell
pf flow test --flow web-classification # "web-classification" is the directory name
```
This allows you to make and test changes quickly, without needing to update the main code repository each time. Once you're satisfied with the results of your local testing, you can then switch to submitting runs to the cloud from the local repository to perform experiment runs in the cloud.
For more details and guidance on using the local versions, you can refer to the prompt flow GitHub community.
Go back to studio UI for continuous development
Alternatively, you have the option to go back to the studio UI, using the cloud resources and experience to make changes to your flow in the flow authoring page.
To continue developing and working with the most up-to-date version of the flow files, you can access the terminal in the notebook and pull the latest changes of the flow files from your repository.
In addition, if you prefer continuing to work in the studio UI, you can directly import a local flow folder as a new draft flow. This allows you to seamlessly transition between local and cloud development.
CI/CD integration
CI: Trigger flow runs in CI pipeline
Once you have successfully developed and tested your flow, and checked it in as the initial version, you're ready for the next tuning and testing iteration. At this stage, you can trigger flow runs, including batch testing and evaluation runs, using the prompt flow CLI. This could serve as an automated workflow in your Continuous Integration (CI) pipeline.
Throughout the lifecycle of your flow iterations, several operations can be automated:
- Running prompt flow after a Pull Request
- Running prompt flow evaluation to ensure results are high quality
- Registering of prompt flow models
- Deployment of prompt flow models
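As a sketch, a GitHub Actions job that submits a batch run and an evaluation run on every pull request might look like the following. The workflow name, file paths, and secret names are assumptions for illustration; the pfazure commands are the same ones shown earlier in this article:

```yaml
# .github/workflows/promptflow-ci.yml (illustrative sketch)
name: promptflow-ci
on:
  pull_request:
    branches: [main]
jobs:
  run-flow:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"
      - name: Install prompt flow CLI
        run: pip install promptflow[azure]
      - name: Azure login
        run: az login --service-principal -u ${{ secrets.AZURE_CLIENT_ID }} -p ${{ secrets.AZURE_CLIENT_SECRET }} --tenant ${{ secrets.AZURE_TENANT_ID }}
      - name: Submit batch run
        run: pfazure run create --file run.yml --stream
      - name: Submit evaluation run
        run: pfazure run create --file run_evaluation.yml --stream
```

A subsequent step could gate the merge on the evaluation metrics before registering or deploying the flow model.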
For a comprehensive guide on an end-to-end MLOps pipeline that executes a web classification flow, see Set up end to end LLMOps with prompt Flow and GitHub, and the GitHub demo project.
CD: Continuous deployment
The last step to go to production is to deploy your flow as an online endpoint in Azure Machine Learning. This allows you to integrate your flow into your application and make it available for use.
For more information on how to deploy your flow, see Deploy flows to Azure Machine Learning managed online endpoint for real-time inference with CLI and SDK.
Collaborating on flow development in production
In the context of developing an LLM-based application with prompt flow, collaboration among team members is often essential. Team members might be engaged in authoring and testing the same flow, working on diverse facets of the flow, or making iterative changes and enhancements concurrently.
Such collaboration necessitates an efficient and streamlined approach to sharing code, tracking modifications, managing versions, and integrating these changes into the final project.
The introduction of the prompt flow SDK/CLI and the Visual Studio Code Extension as part of the code experience of prompt flow facilitates easy collaboration on flow development within your code repository. It is advisable to utilize a cloud-based code repository, such as GitHub or Azure DevOps, for tracking changes, managing versions, and integrating these modifications into the final project.
Best practice for collaborative development
Authoring and single testing your flow locally - Code repository and VSC Extension
- The first step of this collaborative process involves using a code repository as the base for your project code, which includes the prompt flow code.
- This centralized repository enables efficient organization, tracking of all code changes, and collaboration among team members.
- Once the repository is set up, team members can use the VSC extension for local authoring and single input testing of the flow.
Cloud-based experimental batch testing and evaluation - prompt flow CLI/SDK and workspace portal UI
- Following the local development and testing phase, flow developers can use the pfazure CLI or SDK to submit batch runs and evaluation runs from the local flow files to the cloud.
- Post submissions to cloud, team members can access the cloud portal UI to view the results and manage the experiments efficiently.
- This cloud workspace provides a centralized location for gathering and managing the run history, logs, snapshots, and comprehensive results, including instance-level inputs and outputs.
- In the run list, which records all run history from during development, team members can easily compare the results of different runs, aiding quality analysis and necessary adjustments.
Local iterative development or one-step UI deployment for production
- Following the analysis of experiments, team members can return to the code repository for another development and fine-tuning. Subsequent runs can then be submitted to the cloud in an iterative manner.
- This iterative approach ensures consistent enhancement until the team is satisfied with the quality ready for production.
- Once the team is fully confident in the quality of the flow, it can be seamlessly deployed via a UI wizard as an online endpoint in Azure Machine Learning.
Why we recommend using the code repository for collaborative development
For iterative development, a combination of a local development environment and a version control system, such as Git, is typically more effective. You can make modifications and test your code locally, then commit the changes to Git. This creates an ongoing record of your changes and offers the ability to revert to earlier versions if necessary.
When sharing flows across different environments is required, using a cloud-based code repository like GitHub or Azure Repos is advisable. This enables you to access the most recent version of your code from any location and provides tools for collaboration and code management.
By following this best practice, teams can create a seamless, efficient, and productive collaborative environment for prompt flow development.