MLOps – The CEO’s Guide to Productionization of Data [Part 2]

The MLOps or ML Productionization Journey

With data touted as the oil of digital transformation in the 21st century, organizations are increasingly looking to extract insights from their data by building and deploying custom ML models. In our previous article (MLOps – The CEO’s Guide to Productionization of Data, Part 1), we saw why and how embedding ML models in production via MLOps has become an integral part of companies’ digital roadmaps. However, the ML productionization journey is riddled with hurdles, such as compatibility, load balancing, and scalability, that arise when moving ML models from the development stage to the production stage.

Extracting business value on your MLOps journey

Gartner, McKinsey, and others have articulated the challenges faced by organizations when they get on the ML and MLOps journey. Here are a few recommendations for extracting business value from ML based on the industry consensus and our experience:

Owning the ML solution process and outcomes

Move to a new way of building systems. ML models and systems are probabilistic in design and operation. Internalizing the uncertainty in ML is critical for success.
Accept that ML is NOT magic. Making ML work takes effort, often upfront. Increasing performance and accuracy is an iterative process requiring tools, experimentation, and processes that evolve with the context.
Recognize new risks and opportunities. ML algorithms and data usability bring organizations into the purview of new privacy and algorithmic accountability laws, directly and indirectly. They also enable companies to build new data products at a pace and with a differentiation that wasn’t possible before.

Skilling for success

Pick the right problems and approaches: A lot of time is wasted by pursuing problems that don’t have good RoI potential, or that cannot be realistically solved with existing data. Mature teams invest in good problem selection, evaluation metrics, development processes, and integration into products. Here, experience makes a difference.
Build end-to-end discipline: ML is ultimately linear algebra or some other math. Correct operation of ML requires discipline in all phases of the lifecycle, from planning and data collection to model operations. Organizations tend to focus narrowly on the model, ignoring the rest. Even the modeling phase is chaotic. Developing and enforcing discipline is a must.
Design for learning: All ML models degrade over time (in fact, the degradation starts from the moment the training is over), and we learn over time what matters – data quality, corner cases, etc. Continuous monitoring and improvement should be a core part of the design of any ML solution.
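As an illustration of the continuous monitoring described under "Design for learning", here is a minimal sketch in Python. It uses the Population Stability Index (PSI), one common drift measure; the article does not prescribe any particular technique, so the choice of PSI, the function name, the bin count, and the smoothing constant are all assumptions made for this example.

```python
import math
from typing import Sequence


def population_stability_index(
    expected: Sequence[float],
    actual: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between a training-time sample and a recent production sample.

    Hypothetical helper for illustration; bins and eps are assumed defaults.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant feature

    def proportions(values: Sequence[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            # Clamp out-of-range production values into the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    p, q = proportions(expected), proportions(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(p, q))


if __name__ == "__main__":
    training = [i / 100 for i in range(100)]        # feature values seen at training time
    production = [0.9 + i / 1000 for i in range(100)]  # recent values, shifted upward
    print(population_stability_index(training, production))
```

A commonly quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but any alerting threshold should be tuned to the use case rather than taken from this sketch.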

Providing the right infrastructure

Use tools for standardization and automation. ML development and operational processes are iterative, laborious, and error-prone. Cutting time and effort at every phase through standardization, simplification, validation, and automation helps.
Provide checks and balances. The core value of ML is in the data and the algorithms. Risks to the organization include lost data, lost knowledge when staff leaves, and decisions that can’t be defended with clients/other stakeholders. Tools that provide checks and balances during all phases of ML are critical to protecting the value created by ML for the organization.

A sample journey could be as follows:

1. Phase 1 (1 use case). Select and put basic infrastructure in place and identify one use case. Design from the get-go for continuous usage, along with data and process discipline. Achieve transparency (everyone knows what is happening), reproducibility (repeated execution), predictability (standardized outputs, locations, servers, etc.), monitoring (notifications, etc.), and consumption interfaces (APIs).
2. Phase 2 (2-10 use cases). Generalize standards and processes by adding new use cases and evolving the compute and process to scale. Also, create reusable datasets, processes, and assets.
3. Phase 3 (10+ use cases). Separate out teams to focus on specific phases of the ML lifecycle. Design APIs, integration mechanisms, monitoring mechanisms, etc.

There is an active debate on build-vs-buy across industries. For a long time, there was a strong preference for build, especially on the infrastructure side. What organizations are learning over time is that:

The core value is in data ownership, good people, and end-to-end design. Therefore, organizations are freely discussing their solution design with no fear of losing a competitive edge. They are using transparency to attract good talent.
Time is of the essence. Product development cycles are shrinking across the board. Organizations are stitching complex solutions with available resources and not waiting for the perfect product or approach.
Emphasis on infrastructure has grown in recent years. However, building it is expensive and time-consuming, and only a few organizations, such as Uber and Google, have the budgets and direct access to data resources to do so. This motivates smaller companies to outsource their infrastructure needs and to scale back their build approach over time.
Complex algorithms will not be easily built or bought. The algorithm that won the Netflix recommendation prize was not put into production due to RoI considerations. At present, organizations are opting for simplicity and careful thinking behind ML models over complexity. As a result, the need for explainability is taking precedence. Further, organizations are looking for models that require only minimal staff training.

The report on the interview study mentioned above also offers recommendations that can be employed to address the challenges faced in putting ML models into production. Some of these are:

Multistage Deployment:

For new models or model updates, organizations, especially those with a large customer base, follow a multi-step deployment procedure with progressive evaluation at each level. Companies use a procedure known as staged deployment to deliver code, which comprises designated test clusters, [stage 1] and [stage 2] clusters, and finally, the global distribution cluster. The objective is to deploy frequently along these clusters so that issues are detected before consumers are affected.
Each organization uses its own terminology (such as test, dev, canary, staging, shadow, and A/B) and a different number of deployment stages. The stages help invalidate models that would perform badly in full production, particularly for brand-new or mission-critical pipelines.
One of these stages is the shadow stage, which comes before deployment to a small percentage of live users: predictions are made on live traffic but are not shown to consumers. The shadow stage makes it possible to assess the likely impact of new features without actually putting them into use.
This can be implemented as a ‘shadow mode,’ in which the new model runs concurrently with the production model and produces predictions on the same inputs. ML engineers can track the metrics for each model and compare them with ease. Shadow mode can also be used to persuade other stakeholders (like product managers and business analysts) that putting a new model, or a modification to an existing model, into production is justified.
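The shadow stage described above can be sketched in a few lines of Python. This is an illustrative outline only, not the mechanism used by any particular organization: the `ShadowRouter` class, the `Model` type alias, and the `agreement_rate` metric are hypothetical names introduced for this example, and a real system would write comparisons to a metrics store rather than an in-memory list.

```python
import logging
from dataclasses import dataclass, field
from typing import Any, Callable

logger = logging.getLogger("shadow")

# Assumption: a model is any callable that maps a feature dict to a prediction.
Model = Callable[[dict], Any]


@dataclass
class ShadowRouter:
    """Serve the production model; run the shadow model silently alongside it."""

    production: Model
    shadow: Model
    records: list = field(default_factory=list)

    def predict(self, features: dict) -> Any:
        served = self.production(features)
        try:
            candidate = self.shadow(features)
            # Keep both predictions so the models can be compared offline.
            self.records.append((features, served, candidate))
        except Exception:
            # A failing shadow model must never affect live traffic.
            logger.exception("shadow model failed")
        # Only the production prediction is ever returned to the caller.
        return served

    def agreement_rate(self) -> float:
        """Share of recorded requests on which both models agreed."""
        if not self.records:
            return 0.0
        same = sum(1 for _, served, candidate in self.records if served == candidate)
        return same / len(self.records)


if __name__ == "__main__":
    router = ShadowRouter(
        production=lambda f: f["score"] > 0.5,  # current model (toy rule)
        shadow=lambda f: f["score"] > 0.7,      # candidate model (toy rule)
    )
    for score in (0.2, 0.6, 0.9):
        router.predict({"score": score})
    print(router.agreement_rate())
```

The key design choice is that the shadow model's output and failures are isolated from the response path, which is what lets stakeholders review its behaviour on live traffic without any customer impact.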

Aligning ML evaluation metrics to product metrics:

It’s crucial to align model performance with the company’s KPIs (key performance indicators), such as click-through rate and user churn. To ensure that the right measurements are identified, engineers should treat metric selection as an explicit phase of their process and carry it out in tandem with other stakeholders.
Before undertaking any new ML project, the first priority should be finding out what customers are genuinely interested in, or which metrics (features of ML model-based solutions or products) they care about. The product team must validate every model update made in production. Machine learning engineers can proceed with the deployment if, for example, a statistically significantly higher percentage of users subscribe to the product.
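To make the "statistically higher percentage of users" check concrete, here is a hedged sketch using a one-sided two-proportion z-test. The article does not name a specific test, so this choice, the function name, and the example figures are assumptions for illustration; real experiments also need attention to sample size, test duration, and multiple comparisons.

```python
import math


def two_proportion_z_test(
    conv_a: int, n_a: int, conv_b: int, n_b: int
) -> tuple[float, float]:
    """Return (z, one-sided p-value) for the hypothesis rate_b > rate_a.

    a = users served by the current model, b = users served by the new model.
    Assumes the pooled rate is strictly between 0 and 1.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Made-up figures: 100 of 1000 users subscribed under the current model,
    # 150 of 1000 under the new one.
    z, p = two_proportion_z_test(100, 1000, 150, 1000)
    print(f"z={z:.2f}, p={p:.4f}")
```

The product team and engineers would agree on the significance level (for instance 0.05) before the experiment starts, so that the decision to proceed is tied to the product metric rather than to offline model accuracy alone.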

Summary

Today, the best companies at every scale understand the need for the right people, processes, and mechanisms to reliably find ML use cases, build models, and use them in production deployments every day.

A thought-through approach (more time spent sharpening the axe than the actual chopping of wood) to the ML and MLOps lifecycle, including the internal processes, standards, and tool choices, will allow organizations that are getting on the ML journey to be that much more efficient and to build serious value internally as well as for their end-customers.
