Welcome to the age of Sub-ML use cases


Let’s say you work at a modern data-driven company and you want to find a way to enhance one of your processes, like partner management. It makes sense considering you have limited resources to invest in partner development, but it ranks high on your growth goals for the year. The first step would be to benchmark and find out which partners to invest in.

You think about adopting an ML approach that would model partner performance and find the optimal allocation of resources. But you soon realize that this approach isn’t going to save you time either. You’ll have to wait for the right data to be collected, and the model will probably have a short shelf life because the business environment keeps evolving. There is also an opportunity cost: your data science team has limited bandwidth, and its time may be better spent on use cases that have a deeper impact.

Interestingly, this isn’t a one-off case. You’ve probably seen this time and again at your own organization. Most organizations today are generating tons of data that multiple teams and individuals depend on. However, actually being able to use this data can be incredibly time-consuming.

If only data science teams would listen to these end users and build ML models that solve for their use cases. That would make their lives so much easier, right? Not really. There are multiple challenges along the way that lead to these use cases being put on the backburner:

  • It is unclear whether solving for a problem with an ML model is feasible or not

  • The cost and complexity of building a production ML system mean that the effort can easily stall at the level of an experiment, rather than becoming a productionizable solution

  • The time taken to develop and deploy an ML model is just way too long

  • Building and operating ML models is expensive, requiring highly skilled engineers at every step, along with data scientists who are accountable if the models’ predictions go awry or dip below acceptable accuracy levels

Basically, the outcome just doesn’t justify the effort or the cost at this point in time. And if that isn’t enough, there’s a constant struggle due to lack of clarity around which models to develop, and the scarcity of available infrastructure or data science talent.

As a result, you’re left with two options – either you wait a lifetime before your desired model is deployed, or you just let it go!

Sub-ML: a simpler, nimbler path to ML

Maybe building use cases doesn’t have to be all that complicated, and you can try a different route: one where you experiment with use cases and, after incremental updates, decide which ones should move on to ML or to production. That is what we call the Sub-ML use case approach.

What started off as an experimental path to ML is now an emerging space in itself. Gone are the days where you had to worry about shuffling your data science resources, already crunched for bandwidth, to work on use cases that catered to all the functions within your organization. The Sub-ML use case approach, which is all about incremental scoping, provides a path to getting to solutions.

How do Sub-ML and incremental scoping work?

Now, let’s go back to the same example of partner management that we discussed initially. We could simply forgo the use of complex ML models and follow a Sub-ML approach instead, where the goal would not be to try and build an ideal model. Instead, we would follow a series of incremental steps to discover the problem, value, and approach simultaneously.

So in the case of partner management, we wouldn’t necessarily focus on building a comprehensive end solution, but on providing one incremental input to the partner manager – such as outlier partners worth focusing on. Such a use case works with whatever data is easily available, addresses the problems that have immediate value, and can be productionized using an agile platform.
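To make the "outlier partners" input concrete, here is a minimal sketch of what such a lightweight, Sub-ML style check could look like. It is only an illustration under stated assumptions: the partner names, the revenue figures, the function name flag_outlier_partners, and the choice of a median-based (modified z-score) rule with a 3.5 cutoff are all hypothetical and not taken from any specific product or customer.

```python
from statistics import median


def flag_outlier_partners(revenue_by_partner, threshold=3.5):
    """Return partners whose revenue sits far from the median.

    Hypothetical helper: scores each partner with the modified z-score,
    0.6745 * (value - median) / MAD, where MAD is the median absolute
    deviation. Partners with |score| above the threshold are returned.
    """
    values = list(revenue_by_partner.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All partners look (almost) identical; nothing to flag.
        return {}
    scores = {
        name: 0.6745 * (value - med) / mad
        for name, value in revenue_by_partner.items()
    }
    return {name: round(s, 2) for name, s in scores.items() if abs(s) > threshold}


# Made-up monthly revenue per partner, in thousands.
partners = {
    "acme": 120.0,
    "globex": 115.0,
    "initech": 130.0,
    "umbrella": 125.0,
    "hooli": 410.0,
    "stark": 118.0,
}

flagged = flag_outlier_partners(partners)
print(flagged)  # only "hooli" is flagged in this sample data
```

A rule like this needs no training data and no model deployment, so it can run daily on whatever partner data is already available. The threshold and the metric are the obvious things to revisit once the partner manager starts giving feedback.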

The key is that it goes through a mini-iteration of the end-to-end ML development process within hours to days, and that the output is productionized, i.e., available every day and maintained. Once the data product starts being used, you get active feedback on next steps, including problems with the outliers detected, recommendations, and decision tracking.

This approach has a number of advantages including incremental commitment from the organization, not having to wait for the model to see value, active cooperation from stakeholders, easier skill planning and allocation, and most importantly a model that is fit for purpose.

But isn’t Sub-ML just a fancy new term for BI or analytics?

Sub-ML is a lot more than just BI, and here’s how:

  • Sub-ML involves lightweight models which don’t take hours to give you an output

  • Sub-ML use cases are productionized from day 1. They’re not ad hoc.

  • Sub-ML use cases are ‘living’ – they are not one-off. They are maintained and evolve continuously

  • Sub-ML requires feature engineering (data transformations) beyond the kind of ETL that BI systems need

  • Sub-ML use cases solve for similar complexity as your ML models without necessarily requiring a data scientist

  • Sub-ML goes beyond BI, and its use cases, once deployed, provide a lot more clarity around ML problems and their solutions

We’re already seeing our customers deploy Sub-ML use cases in a tenth of the time it takes to deploy a single ML model. And this applies to a wide spectrum of customers, from enterprise to mid-market and SMBs. We’ve also seen that a lot of the skillset concentration in the data space is, in fact, Sub-ML.

Sub-ML is the future of data, and it’s already here!
