Introduction
Five years ago, very few companies had any form of AI in production. Most AI was still experimental. One (of many) reasons for this was that the art of production machine learning (or MLOps) was in its nascent stages.
Now, MLOps is relatively well established, just in time for a new operational skill set to become required: LLMOps (operational practices for large language models)!
MLOps: Bringing ML Models into Production
MLOps (or machine learning in production) refers to the set of practices, skills, and tools required to bring a machine learning (or deep learning, or AI) model into production while maintaining correctness, ethics, governance, and security. MLOps contains several subcategories.
For example, Continuous Integration and Continuous Deployment (CI/CD) covers deploying and integrating new model versions along with the associated validation, while ML Observability (or ML Health) covers monitoring model behavior in production.
Together, the tools and practices for each of these areas form a solid MLOps practice.
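To make the observability piece concrete, here is a minimal sketch of one common monitoring check: comparing the distribution of a model's production scores against a reference distribution captured at training time. The function and thresholds below are illustrative (the 0.2 cutoff is a common rule of thumb, not a standard), and a real deployment would typically lean on a monitoring platform rather than a hand-rolled script.

```python
import numpy as np

def population_stability_index(reference, live, bins=10):
    """Drift score between a reference and a live score distribution."""
    # Interior cut points from the reference quantiles define the buckets.
    edges = np.quantile(reference, np.linspace(0, 1, bins + 1))[1:-1]
    ref_counts = np.bincount(np.searchsorted(edges, reference), minlength=bins)
    live_counts = np.bincount(np.searchsorted(edges, live), minlength=bins)
    # Clip avoids log(0) for buckets that receive no traffic.
    ref_frac = np.clip(ref_counts / len(reference), 1e-6, None)
    live_frac = np.clip(live_counts / len(live), 1e-6, None)
    return float(np.sum((live_frac - ref_frac) * np.log(live_frac / ref_frac)))

# Hypothetical usage: scores logged at training time vs. scores seen today.
rng = np.random.default_rng(0)
reference_scores = rng.beta(2, 5, size=10_000)   # stand-in for offline scores
todays_scores = rng.beta(3, 5, size=2_000)       # stand-in for live scores

psi = population_stability_index(reference_scores, todays_scores)
if psi > 0.2:   # common rule-of-thumb threshold; tune for your own model
    print(f"Score drift detected (PSI={psi:.3f}), investigate before users notice")
else:
    print(f"Score distribution looks stable (PSI={psi:.3f})")
```

The same pattern extends naturally to input features, latency, and data-quality checks.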
The Rise of Large Language Models (LLMs)
Large Language Models (or LLMs) are the most recent advancement in Natural Language Processing (NLP). Powered by technologies such as Transformers and Reinforcement Learning from Human Feedback (RLHF), large language models are trained on massive datasets and can perform a range of tasks such as text summarization, content generation, question answering, and more.
LLMs, brought into mainstream awareness by ChatGPT, are now practically useful in a massive range of use cases, from writing emails and marketing copy to creating learning tools. Their pervasiveness has made MLOps tailored to them extremely important.
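As a quick taste of one of these tasks, the sketch below runs text summarization with the Hugging Face transformers library. The checkpoint name is just a commonly used public summarization model, not a recommendation; any compatible model on the Hub works the same way.

```python
from transformers import pipeline

# Build a summarization pipeline around a public checkpoint.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

article = (
    "MLOps brought discipline to deploying and monitoring machine learning "
    "models in production. Large language models introduce new operational "
    "questions around quality, cost, and consistency, which is giving rise "
    "to a complementary practice often called LLMOps."
)

summary = summarizer(article, max_length=40, min_length=10, do_sample=False)
print(summary[0]["summary_text"])
```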
Challenges Faced by LLMOps
LLMs bring new challenges that differ from those of traditional machine learning or deep learning models. Some of these challenges include:
- The need to measure metrics beyond correctness, such as coherence, appropriateness, and overall output quality (a rough sketch of such checks follows this list)
- Complex error behaviors, including subtle forms of bias
- Inconsistency in behavior over time, which can cause user confusion or displeasure
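As a rough illustration of measuring more than correctness, the sketch below scores a response against a few cheap heuristics: minimum length, repetition (as a proxy for degenerate, looping output), and crude topical overlap with the prompt. The names and thresholds here are made up for illustration; in practice teams often layer in human review or an LLM-as-a-judge for qualities like coherence and appropriateness.

```python
import re
from dataclasses import dataclass

@dataclass
class QualityReport:
    length_ok: bool        # response is neither empty nor a one-word reply
    low_repetition: bool   # the model is not looping on the same phrase
    on_topic: bool         # crude keyword overlap with the prompt
    score: float           # fraction of checks passed, 0.0 - 1.0

def score_response(prompt: str, response: str) -> QualityReport:
    words = re.findall(r"\w+", response.lower())
    prompt_words = set(re.findall(r"\w+", prompt.lower()))

    length_ok = len(words) >= 5
    # Repeated-trigram ratio as a cheap proxy for degenerate output.
    trigrams = list(zip(words, words[1:], words[2:]))
    low_repetition = not trigrams or len(set(trigrams)) / len(trigrams) > 0.7
    # Crude topical check: does the answer reuse any words from the prompt?
    on_topic = bool(prompt_words & set(words))

    checks = [length_ok, low_repetition, on_topic]
    return QualityReport(length_ok, low_repetition, on_topic, sum(checks) / len(checks))

report = score_response(
    "Summarize our refund policy for a customer.",
    "Our refund policy allows returns within 30 days of purchase with a receipt.",
)
print(report)
```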
LLMOps: Building on MLOps
LLMOps leverages many of the methodologies and practices established in MLOps. However, the unique characteristics of large language models require new work in several areas, including:
- Managing the scale and resource challenges of large language models
- Developing new quality testing and evaluation methods (see the sketch after this list)
- Enhancing dataset management to handle the scale of data that LLMs require
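One low-effort way to start on quality testing is a prompt regression suite: a fixed set of prompts run against every new model or prompt version, with simple assertions on the output. The sketch below uses a placeholder generate() function standing in for whatever client you call your model with; the cases and checks are hypothetical examples, not a standard benchmark.

```python
def generate(prompt: str) -> str:
    # Placeholder: replace with a real call to your model or gateway.
    return "You can return items within 30 days for a full refund."

EVAL_CASES = [
    {
        "name": "refund_policy_mentions_window",
        "prompt": "What is our refund window?",
        "must_contain": ["30 days"],        # expected facts
        "must_not_contain": ["guarantee"],  # phrasing we want to avoid
    },
    {
        "name": "no_pii_in_summary",
        "prompt": "Summarize this support ticket without personal details: ...",
        "must_contain": [],
        "must_not_contain": ["@", "SSN"],
    },
]

def run_suite():
    failures = []
    for case in EVAL_CASES:
        output = generate(case["prompt"])
        missing = [s for s in case["must_contain"] if s not in output]
        forbidden = [s for s in case["must_not_contain"] if s in output]
        if missing or forbidden:
            failures.append((case["name"], missing, forbidden))
    return failures

if __name__ == "__main__":
    for name, missing, forbidden in run_suite():
        print(f"FAIL {name}: missing={missing} forbidden={forbidden}")
```

Running the suite on every model, prompt, or configuration change turns "the model feels different" into a concrete, reviewable diff.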
Getting Started with LLMOps
The easiest way to get started with LLMOps is to apply the knowledge and practices from MLOps. Starting with a practical use case and iterating quickly will help organizations keep up with the technology.
It is also important to stay up to date with the evolving LLMOps landscape and to incorporate LLM-specific techniques for guardrails, security, quality, and monitoring. Organizations should be prepared for rapid change in this space and focus on continuous learning and building expertise.
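As a starting point on the guardrails side, the sketch below shows an output filter that redacts obviously sensitive patterns before a response reaches the user. The regexes and rule names are illustrative only; production systems generally use dedicated PII and safety classifiers or a purpose-built guardrails library.

```python
import re

# Illustrative guardrail patterns; not an exhaustive or robust PII detector.
BLOCKED_PATTERNS = {
    "email_address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "credit_card": re.compile(r"\b(?:\d[ -]*?){13,16}\b"),
}

def apply_output_guardrails(response: str) -> tuple[str, list[str]]:
    """Redact obviously sensitive content and report which rules fired."""
    triggered = []
    for name, pattern in BLOCKED_PATTERNS.items():
        if pattern.search(response):
            triggered.append(name)
            response = pattern.sub("[REDACTED]", response)
    return response, triggered

safe_text, fired = apply_output_guardrails(
    "Sure, you can reach the customer at jane.doe@example.com."
)
print(fired)      # ['email_address']
print(safe_text)  # contact detail replaced with [REDACTED]
```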
Conclusion
Whether LLMOps becomes its own category or remains a subset of MLOps, the importance of large language models in the AI landscape is undeniable. Organizations that can effectively harness the power of LLMs and adapt to the unique challenges they bring will gain a significant competitive advantage. The key is to embrace LLMOps, learn from experience, and continuously improve.