Feature Engineering, Data Pipelines & Nothing More.
What does Enrich do?
Scribble Enrich is a feature engineering platform. It sits behind our customers’ firewalls, takes data from a lake or other store, and turns it into features. It is built for scalability, with numerous guardrails that help data science teams work faster, whether they are training models or deploying them.
Enrich streamlines the most laborious parts of ML model training and productionization, and does so with high auditability, reproducibility, and the highest per-core compute efficiency.
Whatever were we thinking?
The best ML teams understand what it takes to build living, breathing ML models: to tend to them as they grow, to see them do well for themselves and the organization, and to let them grow old against the evolving landscape of the business use cases for which they were built. They also understand what it means to birth and nurture multiple such models simultaneously, and how to do so in a way that keeps the models robust and their time-to-market as quick as the business needs.
A lot of this comes down to feature engineering: getting right the data that flows into these models, whether for training or in deployment. It also means thinking through the reusability of features in a trusted marketplace within the organization. And it means building in checks and balances to monitor the performance of these models in production. All of this is key to the ML engineering work that Scribble does, and to how we’ve built Enrich.
Design Principles Behind the Enrich Platform
Application framework to cut down development time for each model
Audit framework to increase
Scale & Performance
Prepare to scale models with new use cases and new data
Rapid iteration on the features and model development
A lightweight data catalog to continuously document what is in the data store
Generate labeled datasets or extend master for richer features
Search interface to understand lineage of every dataset
A programmable health check monitor of data flowing into the data store
Versioned auditable feature computation pipelines
Discover features being computed by the system (for status & reuse)
Extend data by linking with third-party datasets
Monitor model performance
Filter and export datasets
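To make two of the capabilities above more concrete (a programmable health check on incoming data, and a versioned, auditable feature computation), here is a minimal Python sketch. It is not Enrich's API: the function names, the record fields (customer_id, amount), and the version label are all hypothetical and chosen only for illustration.

```python
import hashlib
import json

# Hypothetical version label for this illustrative pipeline.
PIPELINE_VERSION = "1.0.0"


def health_check(rows, required_fields):
    """Return a list of human-readable problems found in the incoming rows."""
    problems = []
    for i, row in enumerate(rows):
        for field in required_fields:
            if row.get(field) is None:
                problems.append(f"row {i}: missing {field}")
    return problems


def compute_features(rows):
    """Toy feature: total spend per customer (hypothetical schema)."""
    totals = {}
    for row in rows:
        key = row["customer_id"]
        totals[key] = totals.get(key, 0) + row["amount"]
    return totals


def run_pipeline(rows):
    """Check the data, compute features, and return them with an audit record."""
    problems = health_check(rows, ["customer_id", "amount"])
    if problems:
        raise ValueError("; ".join(problems))
    features = compute_features(rows)
    # The audit record ties the output to the pipeline version and exact input.
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    audit = {
        "version": PIPELINE_VERSION,
        "input_sha256": hashlib.sha256(payload).hexdigest(),
        "n_rows": len(rows),
    }
    return features, audit
```

Recording the pipeline version together with a hash of the input is one simple way to make a feature run reproducible and auditable: the same input and version should always yield the same features.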
Scribble Enrich Use Case 1:
Voter Data Platform for National Party
Enrich was used as the underlying platform for a major national political party to build detailed profiles of its voter base, with attributes such as address (approximated from multiple sources), leaning, age, and the issues closest to voters' hearts, and to help build a high-touch campaign through channels such as text, WhatsApp, and social media.
Scribble Enrich Use Case 2:
Datalake Enrichment at a National Retail Chain
Scribble is working with India's largest brick-and-mortar retailer on its project to scale growth 10x over the next three years. As part of this engagement, the Enrich platform continuously computes attributes for a number of core entities (such as store, customer, and SKU), building a rich profile for each individual entity. This enables near real-time decisions on assortment, new store locations, and customer engagement through marketing and offers, among others.
Scribble Enrich Use Case 3:
Shopping Paths at a National Mall Chain
The Enrich platform is used to help a national chain of malls understand the shopping paths and behaviours of shoppers by continuously ingesting WiFi data to compute attributes such as visit frequency, brand affinity, and shopping paths. This gives the mall levers both to enhance the individual shopper's experience with customized offers and to attribute revenue to such initiatives.
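As a rough illustration of how such attributes might be derived, the Python sketch below computes visit frequency, a simple brand-affinity proxy, and a shopping path from WiFi pings. The input schema (device ID, ISO timestamp, zone name) and the function name are assumptions made for this example; they do not describe the actual data or the Enrich implementation.

```python
from collections import Counter, defaultdict


def shopper_attributes(pings):
    """Derive per-device attributes from WiFi pings.

    pings: iterable of (device_id, iso_timestamp, zone) tuples.
    This schema is hypothetical and used only for illustration.
    """
    visit_days = defaultdict(set)
    zone_counts = defaultdict(Counter)
    paths = defaultdict(list)

    # ISO timestamps sort chronologically as plain strings.
    for device_id, timestamp, zone in sorted(pings, key=lambda p: p[1]):
        visit_days[device_id].add(timestamp[:10])  # keep the YYYY-MM-DD part
        zone_counts[device_id][zone] += 1
        # Shopping path: zones in time order, consecutive repeats collapsed.
        if not paths[device_id] or paths[device_id][-1] != zone:
            paths[device_id].append(zone)

    return {
        device: {
            "visit_frequency": len(days),  # number of distinct visit days
            "top_zone": zone_counts[device].most_common(1)[0][0],
            "path": paths[device],
        }
        for device, days in visit_days.items()
    }
```

Here the most frequently pinged zone stands in for brand affinity; a production system would likely weight by dwell time and handle device re-identification, which this sketch ignores.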