LLMOps is the collection of tools and processes that manage the end-to-end lifecycle of developing, deploying, and maintaining LLM-based applications. It is a multifaceted orchestration that navigates the complexities of developing and deploying LLMs alongside the other components of a solution. In short, LLMOps provides a cohesive workflow that supports the entire lifecycle of LLM applications, from initiation to production.
The stages of the LLMOps workflow are categorized into inner and outer loops. The inner loop covers the iterative process of developing, testing, and refining the solution. The outer loop covers deploying and managing the solution in production. The following diagram represents these stages and their workflow:
Data Curation includes exploratory data analysis to understand what data is available, data transformation to create a clean and consistent dataset, and enrichment to add pertinent supplementary data.
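As an illustration, here is a minimal curation sketch in Python, assuming a hypothetical list of raw support-ticket records; the column names and the enrichment feature are illustrative only.

```python
# A minimal data-curation sketch over hypothetical support-ticket records.
import pandas as pd

raw_records = [
    {"id": 1, "text": "  Reset my password ", "product": "portal"},
    {"id": 2, "text": None, "product": "portal"},
    {"id": 3, "text": "Refund not received", "product": None},
]

df = pd.DataFrame(raw_records)

# Exploratory analysis: understand what data is available and where gaps are.
print(df.describe(include="all"))
print(df.isna().sum())

# Transformation: create a clean, consistent dataset.
df["text"] = df["text"].str.strip()
df = df.dropna(subset=["text"])

# Enrichment: add pertinent supplementary data (a simple length feature here;
# real enrichment might join metadata from another system).
df["text_length"] = df["text"].str.len()
df["product"] = df["product"].fillna("unknown")

print(df)
```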
Experimentation aims to identify the best candidate LLM solutions through careful monitoring and auditing of rapid iterations that test techniques such as prompt engineering, information retrieval optimization, relevance improvements, model selection, model fine-tuning, and hyperparameter tuning.
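The sketch below shows one way to run and audit such iterations, assuming a hypothetical call_model helper in place of a real LLM client; the model names and prompt templates are placeholders.

```python
# A minimal experiment-tracking sketch: sweep prompt templates and models,
# then persist an audit trail of every run.
import itertools
import json
import time

PROMPT_TEMPLATES = [
    "Answer briefly: {question}",
    "You are a support agent. Answer step by step: {question}",
]
MODELS = ["model-a", "model-b"]  # placeholder identifiers

def call_model(model: str, prompt: str) -> str:
    # Placeholder: replace with a real call to your deployed endpoint.
    return f"[{model}] response to: {prompt[:40]}"

question = "How do I rotate my API keys?"
runs = []
for model, template in itertools.product(MODELS, PROMPT_TEMPLATES):
    prompt = template.format(question=question)
    start = time.perf_counter()
    answer = call_model(model, prompt)
    runs.append({
        "model": model,
        "template": template,
        "latency_s": round(time.perf_counter() - start, 4),
        "answer": answer,
    })

# Audit trail: keep every run so iterations stay comparable.
print(json.dumps(runs, indent=2))
```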
Evaluation is the process of defining tailored metrics and selecting methods to compare results against them at key points that contribute to overall solution performance. It is an iterative process for seeing how changes affect solution performance, such as optimizing a search index used for retrieval in RAG implementations or refining few-shot examples through prompt engineering.
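As a small example of a tailored metric, the sketch below scores answers with an illustrative keyword-recall check against a hand-built validation set; both the metric and the data are assumptions, not a standard benchmark.

```python
# A minimal evaluation sketch with a custom keyword-recall metric.
validation_set = [
    {"question": "How do I reset my password?",
     "answer": "Go to Settings, choose Security, then click Reset password.",
     "required_keywords": {"settings", "reset", "password"}},
]

def keyword_recall(answer: str, required: set[str]) -> float:
    """Fraction of required keywords present in the answer."""
    found = {kw for kw in required if kw in answer.lower()}
    return len(found) / len(required)

scores = [keyword_recall(ex["answer"], ex["required_keywords"])
          for ex in validation_set]
print(f"mean keyword recall: {sum(scores) / len(scores):.2f}")
```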
Validate & Deploy stage involves rigorous model validation to evaluate performance in production environments and A/B testing to evaluate new and existing solutions before deploying the most performant ones into various environments.
Inference involves serving an optimized model tailored for consistent, reliable, low-latency, high-throughput responses, with support for batch processing and, where needed, compatibility with edge devices.
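The sketch below illustrates micro-batching, one common throughput technique; the generate_batch placeholder and batch size are assumptions, not a specific product's API.

```python
# A minimal micro-batching sketch: queue requests and answer them together,
# trading a little latency for higher throughput.
from collections import deque

MAX_BATCH = 8  # illustrative tuning knob

class BatchingServer:
    def __init__(self) -> None:
        self.queue: deque[str] = deque()

    def submit(self, prompt: str) -> None:
        self.queue.append(prompt)

    def step(self) -> list[str]:
        """Drain up to MAX_BATCH queued prompts and run them as one batch."""
        batch = [self.queue.popleft()
                 for _ in range(min(MAX_BATCH, len(self.queue)))]
        return self.generate_batch(batch)

    def generate_batch(self, prompts: list[str]) -> list[str]:
        # Placeholder: replace with a real batched model call.
        return [f"response to: {p}" for p in prompts]

server = BatchingServer()
for i in range(3):
    server.submit(f"question {i}")
print(server.step())
```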
Monitor covers the tools and practices used to assess and report on system and solution performance and health. Monitored areas include resource utilization, real-time alerts for anomalies or data privacy breaches, and evaluation of queries and responses for issues such as inappropriate content.
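As a simple illustration, the following sketch records per-request latency and raises alerts on a threshold breach or a naive blocklist hit; the threshold and blocklist are illustrative assumptions, not a production safety system.

```python
# A minimal monitoring sketch: track latency and flag suspect responses.
import logging
import statistics
import time

logging.basicConfig(level=logging.INFO)
LATENCY_ALERT_S = 2.0
BLOCKLIST = {"ssn", "credit card number"}  # placeholder sensitive terms

latencies: list[float] = []

def record_request(handler, prompt: str) -> str:
    start = time.perf_counter()
    response = handler(prompt)
    elapsed = time.perf_counter() - start
    latencies.append(elapsed)

    if elapsed > LATENCY_ALERT_S:
        logging.warning("latency alert: %.2fs for prompt %r", elapsed, prompt)
    if any(term in response.lower() for term in BLOCKLIST):
        logging.warning("content alert: flagged response for prompt %r", prompt)
    return response

response = record_request(lambda p: f"echo: {p}", "hello")
print(response, "| median latency:", statistics.median(latencies))
```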
Feedback & Data Collection requires seamless mechanisms for user feedback collection, capturing user provided data for insights and to enrich the validation datasets to improve the LLM solution’s performance, while ensuring privacy and compliance.